Random Surfer Model


The random surfer model forms the basis of the PageRank algorithm. It is used to calculate an appropriate score for each page, so that not every link contributes equally to a page's authority signals. The model aims to represent the behavior of website visitors and calculates the probability of a random user visiting a webpage. The score is affected by several criteria, such as the placement of a link (i.e. how visible it is), how many other links are on the page, and more. Although its usage in Google's ranking algorithms in 2021 is unclear, the model is a reminder to link only to relevant and useful sources of information that would appeal to website visitors.

Surfer behavior

A surfer moves through the Internet in two ways. They may enter a URL or use a bookmark to go directly to a webpage, or they may follow a series of successive links from page to page. In the random surfer model, it is assumed that the next link to be clicked is selected at random; the content does not matter. It is also assumed that the chain of clicks is not infinite: at some point a random user will lose interest in following links and visit a new website instead.
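
The behavior described above can be illustrated with a small simulation. The following is a minimal sketch under stated assumptions, not a reference implementation: the function name simulate_surfer, the parameter follow_prob (the chance that the surfer keeps clicking), the value 0.85, and the three-page link graph are all hypothetical and chosen only for illustration.

```python
import random


def simulate_surfer(links, steps, follow_prob=0.85, seed=0):
    """Return the share of page views each page receives from a random surfer.

    `links` maps every page to the list of pages it links to.
    `follow_prob` is an assumed value for the chance that the surfer
    clicks another link instead of jumping to a random page.
    """
    rng = random.Random(seed)
    pages = list(links)
    visits = {page: 0 for page in pages}
    current = rng.choice(pages)  # first page is reached directly (URL or bookmark)
    for _ in range(steps):
        visits[current] += 1
        outgoing = links[current]
        if outgoing and rng.random() < follow_prob:
            # Follow one of the links on the current page, chosen at random.
            current = rng.choice(outgoing)
        else:
            # Lose interest (or hit a page without links) and jump anywhere.
            current = rng.choice(pages)
    return {page: count / steps for page, count in visits.items()}


# Hypothetical three-page site used only for illustration.
toy_links = {
    "home": ["about", "blog"],
    "about": ["home"],
    "blog": ["home", "about"],
}
frequencies = simulate_surfer(toy_links, steps=100_000)
```

In this toy graph, "home" receives the largest share of visits because both other pages link to it, which matches the intuition that frequently linked pages are visited more often.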

Probabilities

The likelihood that a surfer is on a particular page can be derived from its PageRank. The probability that they will then follow a particular link depends only on the number of links on that page. Thus, the probability of a surfer visiting a page is the sum of the probabilities with which they follow the links pointing to this page from other pages. Webpages that are linked often are therefore visited frequently and have a high PageRank. This value is additionally reduced by a factor d.
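
Written as a formula, this relationship is commonly given in the following form. The notation is an assumption made here for illustration: T_1 to T_n are the pages linking to page A, C(T_i) is the number of outgoing links on page T_i, and N is the total number of pages. Some presentations omit the division by N in the first term, in which case the values no longer sum to 1.

```latex
PR(A) = \frac{1 - d}{N} + d \sum_{i=1}^{n} \frac{PR(T_i)}{C(T_i)}
```

The sum is the probability of arriving at A by following a link, scaled by d. The first term covers the case in which the surfer stops following links and arrives at A directly.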

The reason is that a random surfer will not follow links indefinitely but will, after a certain time, jump to another page. d has a value between 0 and 1, depending on how likely it is that a surfer who is following links does not break off. The closer the value is to 1, the more likely it is that the next link will be followed. The probability that the surfer instead visits a random new page is given by the constant 1-d.
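
The interplay of d and 1-d can be sketched as an iterative calculation. This is a simplified illustration under stated assumptions, not Google's implementation: the function name pagerank, the value d = 0.85, the fixed iteration count, the even redistribution for pages without outgoing links, and the toy link graph are all hypothetical. Every link target is assumed to appear as a key in the graph.

```python
def pagerank(links, d=0.85, iterations=100):
    """Estimate the visit probability of each page by repeated updates.

    `links` maps every page to the list of pages it links to.
    `d` is the probability that the surfer keeps following links;
    1 - d is the probability of jumping to a random page.
    """
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}  # start with equal probabilities
    for _ in range(iterations):
        # Every page can be reached by a random jump with probability (1 - d) / n.
        new_rank = {page: (1.0 - d) / n for page in pages}
        for page, outgoing in links.items():
            if outgoing:
                # A page passes its rank on in equal shares to the pages it links to.
                share = d * rank[page] / len(outgoing)
                for target in outgoing:
                    new_rank[target] += share
            else:
                # Assumption: a page without links spreads its rank over all pages.
                for target in pages:
                    new_rank[target] += d * rank[page] / n
        rank = new_rank
    return rank


# Hypothetical three-page site used only for illustration.
toy_links = {
    "home": ["about", "blog"],
    "about": ["home"],
    "blog": ["home", "about"],
}
ranks = pagerank(toy_links)
```

Lowering d in this sketch moves all values closer to 1/N, because the surfer jumps to random pages more often and the link structure matters less.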

Reality

In reality, a user has an objective and therefore does not move randomly across links through the net. They will only click a link if they expect the content of the linked page to bring them closer to their goal. Content plays a crucial role. Thus, the random surfer model does not reflect how real users behave. Regardless, it remains a useful description of a random user and was intended to help measure the importance of a website.

In the past, PageRank played an important role for Internet users and for SEOs. It could give an indication of how legitimate or valuable a website is. But since this model was based mostly on the strength of the incoming links and ignored the content of the landing page, it is no longer appropriate today. It was possible for a website with meager content to receive a PageRank of 6 simply because another website with a PageRank of 7 linked to it. This may also be one of the reasons why Google no longer updates the PageRank and no longer uses it as an indicator of the quality of a website. Ultimately, PageRank was intended for Internet users. Google works with its own scoring system that measures the quality of websites.

Relevance to search engine optimization

Indirectly, the random surfer model was important for SEOs for quite some time because it helped determine the PageRank of a website. The PageRank, in turn, could give an indication of how strong a backlink is. Today, however, many different criteria are used to determine the quality of a backlink, so the random surfer model is no longer really relevant for search engine optimization. Rather, it provides an insight into the early days of the Internet, when search engines such as Google looked for a way to determine the quality of a website.
