Content Type


Content type refers to the classification of data transferred via HTTP, expressed as a two-part identifier consisting of a type and a subtype. The classification is standardized, and the registry is published by the IANA. An alternate term is MIME type, because the format was first specified as part of MIME (Multipurpose Internet Mail Extensions). Both terms are covered by the name “Internet media type.” Content-Type is also the name of a meta tag in the head of an HTML document that tells browsers what kind of content, and which character set, is used on that specific web page.

MIME content types

There is a wide variety of content types, which are also referred to as MIME types. The MIME standard was originally introduced for email (RFC 2045 and its companion documents) and is reused by HTTP specifications such as RFC 7233.[1]


An Internet media type consists of two mandatory parts and an optional addition. The first part is the media type and the second is the subcategory (subtype); the two are separated by a slash. The optional addition is a parameter such as the character set. The following media types can be used:

  • Application: Files that belong to a particular application, or files for which no other type is a clear fit.
  • Audio: Audio files.
  • Example: Reserved for use in examples and documentation; it does not denote real content.
  • Image: Image and graphics files.
  • Message: Messages such as email.
  • Model: Data with a multidimensional structure, such as 3D models.
  • Multipart: Content that consists of several parts.
  • Text: Text files.
  • Video: Video files.
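
The two-part structure can be illustrated with a minimal Python sketch. The function name split_media_type and the constant TOP_LEVEL_TYPES are hypothetical names chosen for this example, and the set contains only the types listed above; the IANA registry has since added others, such as font.

```python
# Top-level types as listed in this article (the IANA registry contains more).
TOP_LEVEL_TYPES = {
    "application", "audio", "example", "image", "message",
    "model", "multipart", "text", "video",
}


def split_media_type(value):
    """Split 'type/subtype' into a lowercase (type, subtype) tuple."""
    media_type, sep, subtype = value.strip().partition("/")
    if not sep or not media_type or not subtype:
        raise ValueError(f"not a type/subtype pair: {value!r}")
    media_type = media_type.lower()  # type names are case-insensitive
    if media_type not in TOP_LEVEL_TYPES:
        raise ValueError(f"unknown top-level type: {media_type!r}")
    return media_type, subtype.lower()


print(split_media_type("Image/jpeg"))  # ('image', 'jpeg')
```

Because type and subtype names are case-insensitive, the sketch normalizes both to lowercase before comparing them.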

Common combinations with subcategories are:[2]

  • image/jpeg: JPEG image file
  • image/tiff: TIFF image file
  • text/plain: TXT file (plain text)
  • video/mpeg: MP2, MPA, MPE, MPEG, MPG files
  • audio/mpeg: MP3 files
  • audio/x-wav: WAV files
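
As a rough illustration of how file extensions map to such combinations, the following sketch uses the mimetypes module from the Python standard library. The helper name guess and the file names are made up for this example, and the results reflect Python's built-in table rather than the IANA registry itself.

```python
import mimetypes

# A separate MimeTypes instance, populated from Python's built-in default table.
guesser = mimetypes.MimeTypes()


def guess(filename):
    """Return the media type guessed from the file extension, or None."""
    media_type, _encoding = guesser.guess_type(filename)
    return media_type


for name in ("photo.jpg", "notes.txt", "song.mp3", "clip.mpeg"):
    print(name, "->", guess(name))
```

The guess is based on the extension alone; the file content is never inspected.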

If an HTML document is being classified, the character set can be added as a parameter after the media type, separated by a semicolon. A possible specification would be, for example: text/html; charset=UTF-8
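
A value of this form can be taken apart with the email package from the Python standard library, which implements MIME header parsing. This is only a sketch; the function name parse_content_type is a hypothetical helper, not part of any standard.

```python
from email.message import Message


def parse_content_type(value):
    """Split a Content-Type value into type, subtype and charset."""
    msg = Message()
    msg["Content-Type"] = value
    return {
        "type": msg.get_content_maintype(),
        "subtype": msg.get_content_subtype(),
        "charset": msg.get_content_charset(),  # None if no charset is given
    }


print(parse_content_type("text/html; charset=UTF-8"))
# {'type': 'text', 'subtype': 'html', 'charset': 'utf-8'}
```

Note that the library returns the charset name in lowercase, since charset names are case-insensitive.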

Content-Type as a meta tag

The content type meta tag is placed in the head of a webpage to declare the type of content and the character set used on that HTML page.

Benefits

By defining the content type, and in particular the “charset” (character set), you help ensure that browsers display the page properly. If this specification is missing from the head of a page, browsers may not display the umlauts ä, ö and ü or similar characters correctly. In the past, it was common to see pages on which the umlauts were replaced by varying placeholder characters.

Special punctuation can cause problems as well. By specifying the content type, the character set to be used is declared, for example one defined by the ISO 8859 standard. When a browser later accesses the page, it recognizes from this specification which character set to use for decoding, so that all characters are interpreted correctly.
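
The effect of a wrong or missing character set can be reproduced in a few lines of Python. This is a minimal sketch of the decoding problem, not a model of how any particular browser guesses the encoding; the helper name decode_as is made up for this example.

```python
def decode_as(raw, charset):
    """Decode bytes with the given charset, marking undecodable bytes."""
    return raw.decode(charset, errors="replace")


text = "Umlaute: ä, ö, ü"

# Page saved as UTF-8 but interpreted as ISO-8859-1.
utf8_bytes = text.encode("utf-8")
print(decode_as(utf8_bytes, "utf-8"))       # Umlaute: ä, ö, ü
print(decode_as(utf8_bytes, "iso-8859-1"))  # Umlaute: Ã¤, Ã¶, Ã¼

# Page saved as ISO-8859-1 but interpreted as UTF-8:
# each umlaut is shown as a replacement character.
latin1_bytes = text.encode("iso-8859-1")
print(decode_as(latin1_bytes, "utf-8"))
```

In both mismatched cases the bytes are unchanged; only the declared character set determines whether the umlauts appear correctly.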

Example for integration

The Content Type meta tag looks like this:

<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=iso-8859-1">

ISO-8859-1, the Western European and American character set, is suitable for German-language pages.
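
To show how a program could read this declaration, here is a small sketch based on html.parser from the Python standard library. The class name ContentTypeFinder and the function find_content_type are hypothetical, and real browsers follow a more elaborate detection procedure.

```python
from html.parser import HTMLParser


class ContentTypeFinder(HTMLParser):
    """Remember the content of the first Content-Type meta tag."""

    def __init__(self):
        super().__init__()
        self.content_type = None

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag and attribute names in lowercase.
        if tag != "meta" or self.content_type is not None:
            return
        attributes = dict(attrs)
        if (attributes.get("http-equiv") or "").lower() == "content-type":
            self.content_type = attributes.get("content")


def find_content_type(html):
    finder = ContentTypeFinder()
    finder.feed(html)
    return finder.content_type


page = (
    "<html><head>"
    '<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=iso-8859-1">'
    "</head><body></body></html>"
)
print(find_content_type(page))  # text/html; charset=iso-8859-1
```

The uppercase spelling of the tag is not a problem, because HTML tag and attribute names are case-insensitive.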

Other ISO standards for foreign language sites

If websites are published in other languages, then different standards apply:

ISO 8859-1: Albanian, Danish, German, English, Faroese, Finnish, French, Galician, Icelandic, Irish, Italian, Catalan, Dutch, Norwegian, Portuguese, Spanish, Swedish.

ISO 8859-2: Croatian, Polish, Romanian, Slovak, Slovenian, Czech, Hungarian.

ISO 8859-3: Esperanto, Galician, Maltese, Turkish (for Turkish, see also ISO 8859-9).

ISO 8859-4: Estonian, Latvian, Lithuanian.

ISO 8859-5: Bulgarian, Macedonian, Russian, Serbian, Ukrainian.

ISO 8859-6: Arabic.

ISO 8859-7: Modern Greek.

ISO 8859-8: Hebrew.

ISO 8859-9: Turkish.

ISO 8859-10: Greenlandic (Inuit), Sami.
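
Whether a given text fits into one of these character sets can be checked by trying to encode it. The following Python sketch assumes a short, arbitrary candidate list; the names CANDIDATES and first_fitting_charset are made up for this example.

```python
# An arbitrary selection of ISO 8859 parts, tried in this order.
CANDIDATES = ["iso-8859-1", "iso-8859-2", "iso-8859-5", "iso-8859-7"]


def first_fitting_charset(text, candidates=CANDIDATES):
    """Return the first candidate that can encode the text, or None."""
    for charset in candidates:
        try:
            text.encode(charset)
        except UnicodeEncodeError:
            continue  # at least one character is missing from this charset
        return charset
    return None


for sample in ("Grüße", "Łódź", "Привет", "日本語"):
    print(sample, "->", first_fitting_charset(sample))
```

The German sample fits ISO 8859-1, the Polish one needs ISO 8859-2, the Russian one needs ISO 8859-5, and the Japanese sample fits none of the candidates, so the function returns None.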

Relevance to search engine optimization

By defining meta tags, you provide important information to search engines. It is therefore recommended to use the content-type meta tag. This specification is one of the tags that can be easily read by the Google search engine. If the tag is set, the Googlebot can classify the crawled content precisely in advance. At the same time, allocation to vertical searches, such as image or video search, is made easier.

The content type is important for language assignment as well. If the Western European character set is defined for a German website, Google can infer that the umlauts ä, ö and ü should be equated with ae, oe and ue. If a user searches, for example, for “Linkpopularitaet” (Engl.: link popularity), then Google can also output search results that contain the word “Linkpopularität.”
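
The equivalence between umlauts and their two-letter spellings can be sketched as a simple character mapping in Python. This is purely an illustration of the idea and makes no claim about how Google implements it; the names UMLAUT_MAP and transliterate are hypothetical, and the uppercase entries are an addition for this example.

```python
# Mapping from umlauts to their two-letter spellings.
UMLAUT_MAP = str.maketrans({
    "ä": "ae", "ö": "oe", "ü": "ue",
    "Ä": "Ae", "Ö": "Oe", "Ü": "Ue",  # uppercase forms added for this sketch
})


def transliterate(text):
    """Replace German umlauts with their ae/oe/ue spellings."""
    return text.translate(UMLAUT_MAP)


query = "Linkpopularitaet"
page_word = "Linkpopularität"
print(transliterate(page_word))           # Linkpopularitaet
print(transliterate(page_word) == query)  # True
```

Comparing the transliterated forms lets both spellings match the same search term.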

References

  1. RFC 7233. ietf.org. Accessed on October 22, 2018.
  2. MIME Types by Content Type. About.com. Accessed on 04/01/2014.
