HTML Special Characters


HTML special characters are part of a character set that goes beyond the characters that can be typed directly on a regular keyboard. They include Greek letters, mathematical symbols, arrows, currency signs, dingbats (ornaments), and graphic symbols, as well as checkmarks and the symbols for copyrights and trademarks. HTML special characters are also referred to as masked characters and HTML entities.


General information

HTML is a text-based markup language that enables documents to be displayed in web browsers, and it is bound to a number of rules and definitions. One rule is that the character set must be specified. The character set defines the characters available in the document, and a document is encoded and displayed based on that character set.

However, since HTML documents are edited using traditional keyboards, special symbols have to be created from combinations of characters that are available on the keyboard. HTML special characters can be understood as the definition of these combinations. They are the link between the character set and the webmaster who wants to write certain characters, and they ensure that the special characters are represented properly. A related method of translation, used in URLs rather than in HTML markup, is URL encoding.
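The difference between masking characters for HTML and URL encoding can be sketched with Python's standard library. This is a minimal illustration, not part of the original article; the helper names mask_for_html and mask_for_url and the sample string are assumptions chosen for the example.

```python
import html
from urllib.parse import quote


def mask_for_html(text: str) -> str:
    """Mask markup characters and write non-ASCII characters as decimal references."""
    escaped = html.escape(text)  # & < > " ' become named or numeric references
    # Anything outside ASCII is replaced by a decimal reference such as &#169;
    return escaped.encode("ascii", "xmlcharrefreplace").decode("ascii")


def mask_for_url(text: str) -> str:
    """Percent-encode the text for use inside a URL (UTF-8 bytes as %XX)."""
    return quote(text)


sample = "AT&T <©>"
print(mask_for_html(sample))  # AT&amp;T &lt;&#169;&gt;
print(mask_for_url(sample))   # AT%26T%20%3C%C2%A9%3E
```

Both functions make the same text safe to transmit, but the results are not interchangeable: the first belongs in HTML markup, the second in a URL.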

How HTML special characters are used

A special character can be written in hexadecimal notation, in decimal notation, or as a named HTML entity. If a special character is written in one of these notations, the client (browser) displays the corresponding character, provided that an available font contains a glyph for it.
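How a client resolves the three notations can be sketched with Python's html.unescape, which follows the HTML 5 rules for character references. The function name resolve is a hypothetical helper for this example; a real browser additionally needs a font that contains the glyph.

```python
import html


def resolve(reference: str) -> str:
    """Turn a character reference into the character it stands for."""
    return html.unescape(reference)


# Hexadecimal, decimal, and named notation for the copyright symbol
for reference in ("&#xa9;", "&#169;", "&copy;"):
    character = resolve(reference)
    print(reference, "->", character, f"(U+{ord(character):04X})")
```

All three references resolve to the same code point, U+00A9.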

The character set is specified in the metadata of the document, which is transmitted from the server to the client at the very beginning of the document. The declaration should appear within the first 1024 bytes.

  • Since HTML 4: <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
  • Since HTML 5: <meta charset="utf-8">

Nowadays all characters from the Unicode character set can be used directly if the document is encoded as UTF-8, and the browser decodes the document in accordance with the declared character set.
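Why the declared character set matters can be shown with a short Python sketch. It decodes the same bytes once as UTF-8 and once as Latin-1; the helper decode_as is an assumption for this example and only imitates what a browser does with the bytes it receives.

```python
def decode_as(raw: bytes, charset: str) -> str:
    """Decode the received bytes using the given character set."""
    return raw.decode(charset)


# The copyright symbol as it is stored in a UTF-8 encoded document
raw = "©".encode("utf-8")

print(raw)                        # b'\xc2\xa9'
print(decode_as(raw, "utf-8"))    # ©
print(decode_as(raw, "latin-1"))  # Â©  (wrong character set, garbled output)
```

A character reference such as the decimal notation for © consists of ASCII characters only, so it survives this kind of mismatch.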

Examples of HTML special characters

If you want to use special characters on your website, you can refer to a list for assistance. Each character in such a list can be written in three variations. The simplest to remember is the mnemonic notation of named HTML entities, such as the name "copy" for the copyright symbol ©.

  • ©: The copyright symbol can be generated with the following characters (without spaces):
    • & # x a 9 ;
    • & # 1 6 9 ;
    • & c o p y ;
  • ®: The registered trademark symbol can be written this way (without spaces):
    • & # x a e ;
    • & # 1 7 4 ;
    • & r e g ;
  • →: The arrow to the right (without spaces):
    • & # x 2 1 9 2 ;
    • & # 8 5 9 4 ;
    • & r a r r ;
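The three notations for any character can also be derived programmatically. The following Python sketch uses the code point of the character and the standard library table html.entities.codepoint2name, which covers the named entities of HTML 4; the function name notations is a hypothetical helper for this example.

```python
from html.entities import codepoint2name


def notations(char: str) -> dict:
    """Return the hexadecimal, decimal, and named reference for one character."""
    code_point = ord(char)
    name = codepoint2name.get(code_point)  # None if HTML 4 defines no name
    return {
        "hex": f"&#x{code_point:x};",
        "decimal": f"&#{code_point};",
        "named": f"&{name};" if name else None,
    }


for char in "©®→":
    print(char, notations(char))
```

For ® the sketch yields the hexadecimal value ae, which corresponds to the decimal value 174.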

Relevance to search engine optimization

Special characters in HTML have always had implications for search engine optimization. On the one hand, many browsers were unable to interpret some characters, so users could not read them. The resulting poor usability affects search engine optimization indirectly, because users may leave the website. On the other hand, search engines could not interpret some special characters properly, so the crawler could not read the content.

This changed fundamentally with HTML 4 and, at the latest, with HTML 5. Although some browsers may still display characters incorrectly in some cases, the crawlers of search engines read the code correctly and can then display it in the SERPs. Google converts all websites into UTF-8 before they are read from the index database server and displayed to the user in result lists.

Nowadays, special characters are used to attract users to websites and increase the click-through rate. This applies to the content of HTML files as well as to meta descriptions and title tags.

The use of HTML special characters can also make the meaning of certain strings clearer. For example, special characters are used to mark telephone numbers with a symbol or to highlight individual statements symbolically. Special characters give character strings a graphical level that indicates their importance. Alongside rich snippets and structured data, this is another, albeit small, step towards the semantic web. One consequence of using too many special characters is an increased loading time for the site.[1]
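One way special characters can add to the size of a page is the length of their notation. The following Python sketch compares the number of bytes transmitted for the literal character in a UTF-8 document with the bytes needed for each reference; the function name byte_cost is an assumption for this example, and the sketch only measures markup size, not actual loading time.

```python
def byte_cost(markup: str) -> int:
    """Number of bytes the markup occupies in a UTF-8 encoded document."""
    return len(markup.encode("utf-8"))


# The right arrow as a literal character and in the three reference notations
for markup in ("→", "&rarr;", "&#8594;", "&#x2192;"):
    print(f"{markup!r}: {byte_cost(markup)} bytes")
```

In this case the literal character takes 3 bytes, while the references take between 6 and 8 bytes each.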

References

  1. Îñţérñåţîöñåļîžåţîöñ googlewebmastercentral.blogspot.de. Accessed on 07/23/2014