HTTP


HTTP (Hypertext Transfer Protocol) is a protocol for transmitting data over networks. It is a generally accepted technical standard that defines how a web client communicates with a server so that the data requested by the client can be loaded and displayed.

General information

Alongside URLs and HTML, HTTP is one of the fundamental building blocks of the World Wide Web (WWW). It was developed in the early 1990s. Version 1.1 was specified in RFC 2616 [1]; that document has since been superseded by newer RFCs, and the later versions HTTP/2 and HTTP/3 have been standardized as well.

The protocol defines two types of messages: the request from the client and the response from the server. Each message consists of a start line (the request line or the status line), a set of header fields, and an optional body. The header fields carry metadata about the message, such as the content type (for example, an HTML document) and the character encoding used. The body contains the actual data, for example the document that the client will later display.
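
The message layout can be illustrated with a minimal sketch that splits a raw response into its parts. The response bytes are hand-written for illustration, and the helper name split_message is hypothetical; real clients should rely on an HTTP library rather than parse messages by hand.

```python
# A hand-written HTTP/1.1 response; all values are illustrative only.
RAW_RESPONSE = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: text/html; charset=utf-8\r\n"
    b"Content-Length: 20\r\n"
    b"\r\n"
    b"<p>Hello, world!</p>"
)


def split_message(raw: bytes):
    """Split a raw HTTP message into start line, header dict, and body."""
    # An empty line (CRLF CRLF) separates the header section from the body.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    start_line, header_lines = lines[0], lines[1:]
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        # Header field names are case-insensitive, so store them in lower case.
        headers[name.strip().lower()] = value.strip()
    return start_line, headers, body


start_line, headers, body = split_message(RAW_RESPONSE)
print(start_line)
print(headers["content-type"])
print(body.decode("utf-8"))
```

The Content-Type header tells the client how to interpret the body, and Content-Length states how many bytes the body contains.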

TCP/IP is typically used to transfer this data reliably between server and client; TCP acts as the host-to-host transport layer. Network communication is often described with the seven-layer OSI model. In that model, HTTP belongs to the top three layers (application, presentation, and session).
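
To show that an HTTP exchange is plain text sent over a TCP connection, the following sketch starts a small local server from Python's standard library and talks to it through a raw socket. The handler class HelloHandler, the helper fetch_raw, and the payload are assumptions made for this example only.

```python
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """Hypothetical handler that answers every GET with a short text body."""

    def do_GET(self):
        payload = b"hello over TCP"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

    def log_message(self, format, *args):
        pass  # suppress request logging to keep the example output short


def fetch_raw(host: str, port: int) -> bytes:
    """Open a TCP connection, send a minimal HTTP request, return the raw reply."""
    request = (
        f"GET / HTTP/1.1\r\nHost: {host}:{port}\r\nConnection: close\r\n\r\n"
    ).encode("ascii")
    with socket.create_connection((host, port), timeout=5) as conn:
        conn.sendall(request)
        chunks = []
        while True:
            chunk = conn.recv(4096)
            if not chunk:  # the server closed the connection
                break
            chunks.append(chunk)
    return b"".join(chunks)


# Port 0 lets the operating system pick a free port.
server = HTTPServer(("127.0.0.1", 0), HelloHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
try:
    raw_reply = fetch_raw("127.0.0.1", server.server_address[1])
finally:
    server.shutdown()
    server.server_close()

print(raw_reply.decode("iso-8859-1"))
```

TCP only delivers the bytes reliably and in order; everything that makes the exchange HTTP is in the text of the request and the reply.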

Functions

When a client sends an HTTP request for a document at an Internet address (URL), the server replies with a response that always includes an HTTP status code. This three-digit code, sent in the status line at the start of the response, indicates whether the request was successful. If the requested resource was not found, for example, the server returns status code 404 (Not Found).
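
As a small illustration of how status codes are grouped, the sketch below uses Python's standard http.HTTPStatus enumeration. The first digit of a code indicates its class; the helper name describe_status is hypothetical.

```python
from http import HTTPStatus


def describe_status(code: int) -> str:
    """Return the code, its standard reason phrase, and its class."""
    classes = {
        1: "informational",
        2: "success",
        3: "redirection",
        4: "client error",
        5: "server error",
    }
    status = HTTPStatus(code)  # raises ValueError for codes the enum does not know
    return f"{status.value} {status.phrase} ({classes[code // 100]})"


for code in (200, 301, 404, 500):
    print(describe_status(code))
```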

HTTP is not limited to transmitting HTML documents; it can transport other data formats as well, with the content type in the header telling the client how to interpret the body. The resource does not even have to exist as a file on the server: dynamic web pages can be generated by PHP or ASP.NET at the moment the client requests them. Because HTTP is a generic, open protocol, support for further data types is easy to add. HTTP is, however, a stateless protocol. The server treats every request independently, and the protocol itself does not keep sessions or session IDs between requests. Session information has to be carried by the client, for example in an HTTP cookie that the browser stores and sends back with each request.
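
How a cookie adds state on top of a stateless protocol can be sketched with Python's standard http.cookies module. The cookie name session_id and its value are made up for this example, and the browser's part is only simulated.

```python
from http.cookies import SimpleCookie

# Server side: build a Set-Cookie header for the first response.
# The name "session_id" and the value "abc123" are illustrative assumptions.
outgoing = SimpleCookie()
outgoing["session_id"] = "abc123"
outgoing["session_id"]["path"] = "/"
outgoing["session_id"]["httponly"] = True
set_cookie_header = outgoing.output(header="Set-Cookie:")
print(set_cookie_header)

# Client side (simulated): the browser stores the cookie and returns it
# in a Cookie header with every later request to the same site.
cookie_header_value = "; ".join(f"{m.key}={m.value}" for m in outgoing.values())
print("Cookie:", cookie_header_value)

# Server side again: parse the Cookie header of the follow-up request
# to recognize the session.
incoming = SimpleCookie()
incoming.load(cookie_header_value)
session_id = incoming["session_id"].value
print(session_id)
```

The protocol itself remembers nothing between the two requests; the continuity comes entirely from the value the client sends back.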

The core of HTTP is its request methods, which govern the transfer of the actual data. GET, the most commonly used method, retrieves a resource such as a file from the server. The client sends a URI (Uniform Resource Identifier), a unique identifier, so that the server can determine which resource to return. From the user's perspective, this happens when following a link or entering a URL. The server then returns the requested document to the web client along with a status code.
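
The relationship between a URL and the GET request a client sends can be sketched as follows. The helper build_get_request and the example URL are assumptions for illustration; the sketch only builds the request text and does not send it.

```python
from urllib.parse import urlsplit


def build_get_request(url: str) -> str:
    """Build the text of a minimal HTTP/1.1 GET request for the given URL."""
    parts = urlsplit(url)
    # The request target is the path plus the query string, if any.
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    return (
        f"GET {target} HTTP/1.1\r\n"
        f"Host: {parts.netloc}\r\n"  # the host part of the URL goes into a header
        "Connection: close\r\n"
        "\r\n"  # an empty line ends the header section
    )


request_text = build_get_request("http://www.example.com/docs/index.html?lang=en")
print(request_text)
```

The host name selects the server, while the path and query string tell that server which resource is wanted.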

In principle, any computer or network along the path can read the communication between client and server that takes place over HTTP. HTTPS was developed for this reason: it adds encryption and authentication to protect the transmitted data against third-party access.
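
On the client side, the two protections of HTTPS correspond to settings of a TLS context. The following sketch uses Python's standard ssl and http.client modules; www.example.com is a placeholder host, and no network connection is opened here.

```python
import http.client
import ssl

# The default context encrypts the connection and authenticates the server:
# it requires a valid certificate and checks that it matches the host name.
context = ssl.create_default_context()

# Creating the connection object does not open a socket yet; the TLS
# handshake would only take place when the first request is sent.
connection = http.client.HTTPSConnection(
    "www.example.com", context=context, timeout=5
)

print(context.verify_mode == ssl.CERT_REQUIRED)
print(context.check_hostname)
print(connection.port)
```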

Relevance to search engine optimization

HTTP and its technical details are important for search engine optimization because search engine crawlers rely on HTTP to access websites. A crawler acts as a user agent, or web client, when it searches the web for websites and content, and it begins by sending a request to the server using one of the request methods.

Depending on the server software, webmasters have different options for configuring how crawlers access the site. Basic crawling rules are set in the robots.txt file, and on Apache servers a .htaccess file stored on the server can refine the configuration further. Server-side modules such as mod_rewrite can be used together with the .htaccess file to make appropriate settings.
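
How a well-behaved crawler evaluates robots.txt rules can be sketched with Python's standard urllib.robotparser module. The rules, paths, and crawler names below are invented for this example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; paths and bot names are made up.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/

User-agent: ExampleBot
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())  # parse the rules without any network access

# A crawler without its own entry falls back to the "*" rules.
print(parser.can_fetch("SomeCrawler", "http://www.example.com/products/"))
print(parser.can_fetch("SomeCrawler", "http://www.example.com/admin/users"))

# ExampleBot has its own entry, which blocks the whole site.
print(parser.can_fetch("ExampleBot", "http://www.example.com/products/"))
```

Note that robots.txt is a convention that crawlers follow voluntarily; it does not restrict access on a technical level.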

In particular, error status codes should be avoided, since search engines cannot crawl pages that return them and may rate the site lower as a result. For dynamic URLs, rewriting them into static URLs is recommended. This can be done with redirects in the .htaccess file or with rewrite rules via mod_rewrite on Apache servers. Permanent redirects (301) should be used so that link popularity is passed on.[2]
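
What a permanent redirect looks like at the protocol level can be sketched with a small local test server. This is not Apache configuration: the handler RedirectHandler, the script name product.php, and the URL mapping are assumptions chosen only to illustrate the 301 response a client or crawler would see.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, urlsplit

# Hypothetical mapping from dynamic product IDs to static paths.
STATIC_PATHS = {"42": "/products/blue-widget/"}


class RedirectHandler(BaseHTTPRequestHandler):
    """Answers a known dynamic URL with a 301 pointing to its static URL."""

    def do_GET(self):
        parts = urlsplit(self.path)
        product_id = parse_qs(parts.query).get("id", [""])[0]
        target = STATIC_PATHS.get(product_id)
        if parts.path == "/product.php" and target:
            self.send_response(301)  # Moved Permanently
            self.send_header("Location", target)  # where the resource lives now
            self.send_header("Content-Length", "0")
            self.end_headers()
        else:
            self.send_error(404)

    def log_message(self, format, *args):
        pass  # suppress request logging


server = HTTPServer(("127.0.0.1", 0), RedirectHandler)  # port 0: any free port
threading.Thread(target=server.serve_forever, daemon=True).start()
try:
    # http.client does not follow redirects, so the 301 itself is visible.
    conn = http.client.HTTPConnection(
        "127.0.0.1", server.server_address[1], timeout=5
    )
    conn.request("GET", "/product.php?id=42")
    response = conn.getresponse()
    redirect_status = response.status
    redirect_target = response.getheader("Location")
    response.read()
    conn.close()
finally:
    server.shutdown()
    server.server_close()

print(redirect_status, redirect_target)
```

The status code tells the client that the move is permanent, and the Location header names the new address to request instead.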

References

  1. Hypertext Transfer Protocol -- HTTP/1.1. ietf.org. Accessed on 12/10/2013
  2. Redirection. Moz.com. Accessed on 12/10/2013
