Content Delivery Network

A content delivery network (CDN) is a system of distributed servers that delivers web pages and other web content to users based on their geographic location. It works on the principle that the closer the content is to a user, the faster that user can access it. A nearby server responds sooner, so the content downloads more quickly. CDNs are primarily used for streaming audio and video, but also by websites with a particularly wide reach to distribute load across servers and improve the user experience.

General information

A content delivery network (CDN) is meant to make data available more quickly than conventional hosting. The client-server principle, which underlies most Internet applications, is extended so that the data is held not by a single server but by a group of servers. The data originates at a root server or node, which distributes it to the other servers. The path the data takes then depends on the user's geographic location, because the closer the data is, the faster it can be retrieved.

The root server mirrors the content to a system of servers distributed across the globe. These respond much more quickly to a request, since the path from the server to the user is shorter. Although data travels very fast over modern networks, the relationship speed equals distance divided by time still applies: a longer path means a longer transfer time. In IT this shows up as load time. CDNs also reduce server timeouts, faulty data packets, and jitter, while making effective use of the available bandwidth. Jitter is variation in the delay of data packets; it becomes noticeable, for example, as disruptions during a VoIP call. CDNs optimize data traffic between server and client and improve the user experience by reducing wait and load times.
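
The effect of distance on response time can be estimated with a short calculation. The following Python sketch is only an illustration: it assumes that signals travel through optical fibre at roughly 200,000 km/s (about two thirds of the speed of light), and the two distances are hypothetical. It ignores routing detours, queuing, and server processing time, so real values are higher.

```python
# Assumption: signals in optical fibre travel at roughly 200,000 km/s,
# about two thirds of the speed of light in a vacuum.
FIBRE_SPEED_KM_PER_S = 200_000.0


def round_trip_ms(distance_km: float) -> float:
    """Lower bound for one request-response round trip, in milliseconds."""
    one_way_s = distance_km / FIBRE_SPEED_KM_PER_S  # time = distance / speed
    return 2 * one_way_s * 1000.0


# Hypothetical distances: a far-away origin server and a nearby CDN server.
origin_ms = round_trip_ms(8000.0)  # roughly 80 ms
edge_ms = round_trip_ms(200.0)     # roughly 2 ms
```

Under these assumptions the nearby server saves tens of milliseconds on every round trip, and the saving adds up when a page needs many requests.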

How it works

A CDN does not necessarily consist of dedicated physical servers of its own. It is often implemented by linking and coordinating existing servers. This is done with cache and storage systems that are connected by routers. Typically, CDN management software identifies the nearest server and then delivers the content from it. Since a copy of the content is stored on every server in the system, the copy with the lowest latency to the user is delivered.

First, the origin of the request is determined, and then the most suitable server is selected. That server answers the HTTP request with a response that begins with an HTTP status code, followed by the actual content, for example an HTML document. An average page request consists of approximately 20 such request-response sequences and takes no more than 3 seconds on average. A CDN is meant to shorten this time and thereby improve performance.
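
The selection step can be sketched in Python as follows. This is a simplified illustration, not how any particular provider works: it picks the server with the shortest great-circle distance to the user, whereas real CDNs typically rely on anycast routing or measured latency. The `Server` class, the server list, and the coordinates are assumptions made for the example.

```python
from dataclasses import dataclass
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0


@dataclass(frozen=True)
class Server:
    """Hypothetical CDN server with an approximate location."""
    name: str
    lat: float
    lon: float


def distance_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two points (haversine formula)."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = radians(lat2 - lat1)
    dlam = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def nearest_server(user_lat: float, user_lon: float, servers: list) -> Server:
    """Return the server geographically closest to the user."""
    return min(
        servers,
        key=lambda s: distance_km(user_lat, user_lon, s.lat, s.lon),
    )


# Illustrative server locations (approximate coordinates).
SERVERS = [
    Server("frankfurt", 50.11, 8.68),
    Server("new-york", 40.71, -74.01),
    Server("singapore", 1.35, 103.82),
]

# A user located in Paris is assigned to the Frankfurt server.
chosen = nearest_server(48.86, 2.35, SERVERS)
```

Geographic distance is used here only because it is easy to compute; a latency measurement could be substituted as the `key` function without changing the rest of the sketch.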

CDNs come in different types and sizes, for small businesses, medium-sized companies, and large corporations. They are divided into three categories based on size:

  • Edge distribution: Used in small CDNs. Data is transferred from a root server to the edge devices (the edge nodes of a network) or directly to the point of presence (PoP, the access point through which users reach the network) so that it can be retrieved quickly.
  • Edge hierarchy: Used in medium-sized CDNs. Hub caches are installed downstream of the root server and pass the requested data on to nearby servers.
  • Hub and spoke: Used only in large networks. The data on the root server is mirrored to all connected servers, called hubs. A cache system (the spokes) makes the data readily available on all servers when it is requested.
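
The idea of a hierarchy with hub caches between the root server and the edge can be illustrated with a small Python sketch. It is an assumption-laden simplification: the `CacheNode` class and node names are hypothetical, and it fetches content on demand (a pull-through cache) rather than mirroring it in advance as described above.

```python
class CacheNode:
    """A cache that asks its parent on a miss and keeps a copy of the answer."""

    def __init__(self, name, parent=None, store=None):
        self.name = name
        self.parent = parent            # next node towards the root server
        self.store = dict(store or {})  # path -> content

    def get(self, path):
        """Return (content, name of the node that already held the content)."""
        if path in self.store:
            return self.store[path], self.name
        if self.parent is None:
            raise KeyError(path)        # not even the root server has it
        content, source = self.parent.get(path)
        self.store[path] = content      # keep a copy for later requests
        return content, source


# Hypothetical three-level hierarchy: root server -> hub cache -> edge node.
root = CacheNode("root", store={"/index.html": "<html>home</html>"})
hub = CacheNode("hub-eu", parent=root)
edge = CacheNode("edge-berlin", parent=hub)

first = edge.get("/index.html")   # answered by the root, copied to hub and edge
second = edge.get("/index.html")  # answered by the edge node itself
```

After the first request, every node on the path holds a copy, so later requests from the same region no longer reach the root server.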

Practical relevance

In practice, there are numerous CDN providers, and the right choice depends on the requirements. A global company has different requirements than a small online store. For a commercial, sales-oriented website, availability and fast loading are important criteria, because users leave quickly if the offer does not load within seconds. Large, internationally active companies also value good availability and performance, particularly when a site is under heavy load. CDNs absorb such load peaks and help keep online stores accessible and fast. The same applies to denial-of-service (DoS) attacks: because CDNs distribute the load across the server system, an attack is less likely to bring the system down.[1]

A selection of CDN solution providers:

  • Akamai
  • CloudFlare
  • Rackspace
  • Amazon CloudFront
  • Edgecast
  • Microsoft Azure
  • KeyCDN
  • Limelight

MirrorBrain, OSSCDN, and CoralCDN are open source alternatives for various systems.

Importance for SEO

Performance plays a prominent role in SEO. Accessibility, availability, and load time are relevant for search engine optimization. Although their weight depends on the purpose of a website, it is generally agreed that these criteria are a factor in Google rankings. Google recommends checking the load time, also called page speed, regularly and optimizing it where necessary. Besides the search engines' webmaster tools, other programs can be used for speed testing. Registering with the webmaster tools of each search engine is advisable even if a CDN is used.

However, using a content delivery network has further consequences that should be considered from the beginning.[2] The IP address, domain name, and URL may change because of the CDN provider. Because IP addresses are commonly associated with geographic locations, an anycast IP address, which is shared by the CDN's servers, is useful. Most CDNs use anycast IP addresses for routing.[3] Instead of a domain name containing the provider's name, a subdomain of your own domain is recommended for the CDN (such as cdn.sample.com). This is set up with a DNS CNAME record, which makes the subdomain an alias for the host name supplied by the provider, so that public URLs carry your own domain rather than the provider's. Some service providers offer this set-up from the start.

If all data is hosted on the CDN, it is also useful to mark canonical URLs explicitly or to decide which data should be loaded from the CDN at all. For example, many website owners keep images that belong to the content of the website on their own server to avoid losing rankings. If all data is hosted on the CDN, duplicate content issues can be avoided by using the rel=canonical tag with an absolute URL. Moreover, the CDN should not assign its own file names and thereby change the directory structure; the conventions previously used on the domain should be kept. In some cases redirects must be set up to avoid error codes and failed requests.
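
A minimal Python sketch of these two points follows: deciding which URLs are served from the CDN while keeping the original directory structure, and producing an absolute canonical reference back to the original domain. The host name www.sample.com, the list of file extensions, and both function names are assumptions for illustration; the canonical reference is shown as the value of an HTTP Link header, which is one possible way to express rel=canonical for files that are not HTML.

```python
from urllib.parse import urlsplit, urlunsplit

CDN_HOST = "cdn.sample.com"      # CDN subdomain used as the example in the text
ORIGIN_HOST = "www.sample.com"   # assumed host name of the original site
CDN_EXTENSIONS = (".css", ".js", ".png", ".jpg")  # assumed policy


def to_cdn_url(url: str) -> str:
    """Point static assets at the CDN host; path and file name stay unchanged."""
    parts = urlsplit(url)
    is_asset = parts.path.lower().endswith(CDN_EXTENSIONS)
    if parts.netloc != ORIGIN_HOST or not is_asset:
        return url  # pages and foreign URLs are left alone
    return urlunsplit(parts._replace(netloc=CDN_HOST))


def canonical_link_header(cdn_url: str) -> str:
    """Build a Link header value naming the original URL as canonical."""
    parts = urlsplit(cdn_url)
    origin_url = urlunsplit(parts._replace(netloc=ORIGIN_HOST))
    return f'<{origin_url}>; rel="canonical"'


asset = to_cdn_url("https://www.sample.com/assets/logo.png")
page = to_cdn_url("https://www.sample.com/about.html")  # unchanged
header = canonical_link_header(asset)
```

Because only the host name is replaced, the directory structure and file names on the CDN match those of the original domain, which is the convention recommended above.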

These technical details have direct consequences for the search engine optimization of a website. If you lack in-depth knowledge of programming and web architecture, a competent service provider or agency should handle the move to a CDN in line with SEO criteria.

References

  1. How Content Delivery Networks (CDNs) Can Impact SEO searchenginejournal.com. Accessed on 01/25/2016
  2. Four SEO Best Practices for Using a Content Delivery Network (CDN) goinflow.com. Accessed on 01/25/2016
  3. How Anycast IP Routing Is Used at MaxCDN maxcdn.com. Accessed on 01/25/2016
