Understanding Entities with the Knowledge Graph Search API

Entities. Admittedly, a difficult topic to grasp, but one that should be familiar to everyone involved in SEO. But what is it all about? This article explains which updates brought entities into the SEO world and how everyone can easily retrieve them via the Knowledge Graph Search API.

As is often the case, a look into the past helps to understand the importance of entities for search engine optimization today and in the future. SEO is a comparatively young discipline. It has been over 20 years since Google launched in 1998. Its success has certainly been rapid: too rapid to keep up with a resourceful SEO scene that quickly found suitable levers for the desired rankings.

Links are the most important signal? Well, no problem, that's what link farms are for. Keyword stuffing and one landing page per long-tail keyword works well? Great, because cheap content with the desired keyword density is easy to come by.

That's the way we did things once upon a time. Gone are the days when websites were created for search engines and not for users. In the meantime, Google has gotten the major building work under control and can now take care of fine-tuning the search algorithm. Understanding human language and its context plays a crucial role here—and so do entities.

Google is on its way to becoming an answer engine

Google understands human language and its concepts better than ever before and is using these capabilities to slowly but surely make the journey from a pure search engine to a fully fledged answer engine.

Even now, there is often no need to leave the search results page, for example, because the answer you are looking for already appears in an info box. This was made possible by the following updates to the search algorithm, which have enabled Google to take decisive steps on its way to becoming an answer engine.

2012: Introduction of the Knowledge Graph
2013: Hummingbird Update
2015: RankBrain Update (based on Hummingbird)
2019: BERT Update

Things, not strings: Launching the Knowledge Graph

Things, not strings: this was the slogan of the Knowledge Graph when it was launched in 2012. Loosely translated, this means "entities, not search terms".

Defining entities: But what exactly is an entity? Let's take a look at the definition before we dive deeper into the topic.

If you are already familiar with the topic of data modeling, this is certainly not the first time you have heard of entities. In computer science, entities are single information objects that can be uniquely defined. An entity can be a tangible material object, a person, or an immaterial, abstract state. For each entity, further information is stored such as its properties (attributes) as well as its relationships to other entities.

Example: Attributes such as age or place of birth are stored for the entity of the person Joe Biden. In addition, a relationship is established to the more abstract entity of the office of the US President.

Figure 1: Entity of Joe Biden with attributes and relationship to the entity of the US President.
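To make the definition a little more tangible, here is a minimal sketch of how an entity with attributes and relationships could be modeled. The `Entity` class, the identifiers, and the relation label `position_held` are assumptions made up for this illustration; they say nothing about how Google actually stores its data.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    """A uniquely identifiable information object (hypothetical model)."""
    entity_id: str  # unique identifier, invented for this sketch
    name: str
    # Properties of the entity, e.g. place of birth.
    attributes: dict = field(default_factory=dict)
    # Relationships as (relation label, target entity id) pairs.
    relations: list = field(default_factory=list)


us_president = Entity(
    entity_id="office:us-president",
    name="President of the United States",
)

joe_biden = Entity(
    entity_id="person:joe-biden",
    name="Joe Biden",
    attributes={
        "place_of_birth": "Scranton, Pennsylvania",
        "date_of_birth": "1942-11-20",
    },
    # The relationship points at another entity instead of repeating its data.
    relations=[("position_held", us_president.entity_id)],
)
```

The important point is that the office is an entity of its own: the person only references it, so any number of other person entities can point to the same office.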

Let's go back to the introduction of the Knowledge Graph and the motto "Things, not strings" or "Entities, not search terms". So, what matters for relevant search results is not the letters typed per se, but that Google understands what we as searchers mean by them.

Let's take as an example a term that Olaf Kopp has already taken a closer look at in his detailed article on entity-based searching: "jaguar". Car lovers think first of the sports car brand, jungle fans of the animal, and fans of the new wave of British heavy metal think of the Bristol band of the same name.

Figure 2: Jaguar Cars search results and knowledge panel.

Figure 3: Search results and knowledge panel for the jaguar animal.

Figure 4: Search results and knowledge panel for the band Jaguar.

All three are separate entities, and each is stored in the Knowledge Graph with its individual properties (attributes): for example, headquarters (attribute of Jaguar Cars), life expectancy (attribute of the animal), and members (attribute of the band). These attributes not only help to distinguish the otherwise identically named entities, but are also displayed in the search results, for example in the form of a large knowledge panel at the side.

The relationships between entities also play an important role. As mentioned in the example above, the office title "President of the United States" is in itself an entity associated with multiple person entities. Through this relationship, the knowledge panel of U.S. President Abraham Lincoln logically displays other Presidents of the United States in the "Also often searched for" area.

Figure 5: German knowledge panel of Abraham Lincoln as well as related entities of other US presidents.

These connections also show where graph databases, which store the information in the Knowledge Graph, differ from traditional relational databases. Let's take as an example a relational database that contains a table with all former governors of the state of California, another table with the filmography of all known Hollywood stars, and a third table with all Austrian citizens.

Now, to query the attributes of Arnold Schwarzenegger's entity, information from three different tables must be merged. With a graph database, on the other hand, one query of the Arnold Schwarzenegger entity is sufficient to obtain all relevant properties and relationships.

Figure 6: German search results and knowledge panel of Arnold Schwarzenegger.

Entities with irregular, complex information and relationships—like Arnold Schwarzenegger—can thus be represented more easily and retrieved more quickly. Since speed is of the essence in web searches, it makes sense that Google has opted for this type of data storage with the Knowledge Graph.
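The difference between the two storage models can be sketched with a few lines of Python. Everything here is toy data in plain lists and dictionaries, and the table layout, function names, and relation labels are assumptions for illustration only. Real relational and graph databases work with query languages and indexes rather than Python loops.

```python
# Relational style: three separate tables, each keyed by the person's name.
governors = [
    {"name": "Arnold Schwarzenegger", "state": "California", "term": "2003-2011"},
]
filmography = [
    {"name": "Arnold Schwarzenegger", "film": "The Terminator", "year": 1984},
    {"name": "Arnold Schwarzenegger", "film": "Total Recall", "year": 1990},
]
citizens = [
    {"name": "Arnold Schwarzenegger", "country": "Austria"},
]


def relational_lookup(name):
    """Collect one person's data by scanning and merging all three tables."""
    return {
        "governor_of": [r["state"] for r in governors if r["name"] == name],
        "films": [r["film"] for r in filmography if r["name"] == name],
        "citizen_of": [r["country"] for r in citizens if r["name"] == name],
    }


# Graph style: the entity is a node whose edges point directly to related nodes.
graph = {
    "Arnold Schwarzenegger": [
        ("governor_of", "California"),
        ("acted_in", "The Terminator"),
        ("acted_in", "Total Recall"),
        ("citizen_of", "Austria"),
    ],
}


def graph_lookup(name):
    """Return all relationships of an entity with a single node lookup."""
    return graph[name]
```

In the relational version, every new kind of fact needs another table and another merge step. In the graph version, a new fact is simply one more edge on the existing node.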

How can Google understand our language?

Entities and their relationships are also a tool for Google to understand the context of our language. This is where Hummingbird and RankBrain come into play. The RankBrain update of 2015 in particular enabled Google to process search queries that the algorithm had not yet encountered. Therefore, it makes no difference whether the user enters "where can I get pizza in Berlin", "pizza in Berlin" or "pizza Berlin" into the search bar. Google is able to recognize the two crucial entities:

Pizza with the Knowledge Graph ID /m/0663v
Berlin with the Knowledge Graph ID /m/0156q

This breaks down human language into a format that machines can understand. Since the local pizza restaurants in Berlin are certainly related to both entities, they are shown in the search results.
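The idea that three differently worded queries end up as the same pair of entity IDs can be illustrated with a deliberately naive sketch. The lookup table and the `recognize_entities` function are assumptions for this example. Google relies on machine learning rather than a simple word list, so this only shows the result of the normalization, not the method.

```python
# Tiny lookup table containing only the two Knowledge Graph IDs quoted above.
KNOWN_ENTITIES = {"pizza": "/m/0663v", "berlin": "/m/0156q"}


def recognize_entities(query):
    """Return {term: id} for every known entity term found in the query."""
    tokens = query.lower().replace("?", " ").split()
    return {t: KNOWN_ENTITIES[t] for t in tokens if t in KNOWN_ENTITIES}


queries = [
    "where can I get pizza in Berlin",
    "pizza in Berlin",
    "pizza Berlin",
]
# Each query is reduced to the same two entities, whatever the wording.
results = [recognize_entities(q) for q in queries]
```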

But how do the entities, their properties and their relationships get into the Knowledge Graph in the first place? Google uses machine learning and natural language processing for this purpose. Here, the texts and databases available on the World Wide Web (e.g. from Wikipedia or Wikidata) are analyzed to identify entities in them. A demo of Google's Natural Language API shows how the whole thing can look.

Figure 7: Demo of the Natural Language API identifying entities in text about Arnold Schwarzenegger.

Natural language processing was also the central theme of the 2019 BERT update. Through the BERT model, the context of texts can be classified even better. On the one hand, this helps Google to identify the true user intent behind search queries, and on the other hand, it also helps to fill the Knowledge Graph with extracted information from the text corpus on the World Wide Web.

A look under the hood with the Knowledge Graph Search API

There is one question we haven't answered yet: for the search term "jaguar," why is the car manufacturer now displayed first in the knowledge panel and not the animal? To find the answer, let's take a look under the hood of the Knowledge Graph: at the Knowledge Graph Search API.

In the official documentation you can easily try out which entities are stored in the Knowledge Graph for which search terms. You can do this in the "Reference" section of the "Try this API" window.

We enter the desired search term in the "query" field and it is also useful to specify the language in the "languages" field ("de" for German). With a click on "Execute" we receive the requested data in JSON format.

Figure 8: Demo of the Knowledge Graph Search API with the term "jaguar".
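The same request can also be sent outside the documentation page. The following is a minimal sketch that uses only the Python standard library and assumes you have created an API key in the Google Cloud console; `YOUR_API_KEY` is a placeholder and the helper names are made up for this example. The parameters `query`, `languages`, `limit` and `key` correspond to the fields in the "Try this API" window.

```python
import json
import urllib.parse
import urllib.request

# Endpoint of the Knowledge Graph Search API.
ENDPOINT = "https://kgsearch.googleapis.com/v1/entities:search"


def build_search_url(query, api_key, languages="de", limit=10):
    """Assemble the request URL for a search term."""
    params = {
        "query": query,
        "languages": languages,
        "limit": limit,
        "key": api_key,
    }
    return ENDPOINT + "?" + urllib.parse.urlencode(params)


def search_entities(query, api_key, languages="de", limit=10):
    """Send the request and return the parsed JSON (requires a valid key)."""
    url = build_search_url(query, api_key, languages, limit)
    with urllib.request.urlopen(url) as response:
        return json.load(response)


# Building the URL works without a key; only the actual request needs one.
url = build_search_url("jaguar", api_key="YOUR_API_KEY")
```

Calling `search_entities("jaguar", api_key=...)` with a real key should return the same kind of JSON that the "Execute" button produces.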

JSON-LD (JavaScript Object Notation for Linked Data) is a common data format, based on JSON, through which data can be exchanged between different applications in simple text form. In the world of SEO, JSON-LD plays a key role in the markup of structured data via Schema.org, as it is the format recommended by Google. One advantage: you can mark up the structured data in a separate JSON-LD script block and don't have to add it to individual HTML elements, as with the Microdata format, for example.
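For readers who have not worked with this kind of markup yet, here is a small sketch of what Schema.org structured data in JSON-LD can look like when generated with Python. The company name and URL are placeholders, and the choice of the `Organization` type is simply an assumption for this example.

```python
import json

# Structured data for a fictitious company, described with Schema.org vocabulary.
organization = {
    "@context": "https://schema.org",
    "@type": "Organization",
    "name": "Example Company",
    "url": "https://www.example.com",
}

# The markup lives in one script block, separate from the visible HTML elements.
snippet = (
    '<script type="application/ld+json">\n'
    + json.dumps(organization, indent=2)
    + "\n</script>"
)
```

The resulting `snippet` is a self-contained block that can be placed in the page source without touching the rest of the HTML.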

In the JSON output of the Knowledge Graph Search API we can then see, albeit somewhat hidden, the entries of the individual entities. Here we can also see that each entity has a unique ID (see arrows).

Figure 9: JSON output of the Knowledge Graph Search API for the term "jaguar".

Using these IDs, we can retrieve the URL of any entity, even if no knowledge panel is currently displayed for it in the search results. To do this, simply append the ID to the following URL path: https://www.google.de/search?kgmid=

Jaguar Cars: https://www.google.de/search?kgmid=/m/012x34
Jaguar Animal: https://www.google.de/search?kgmid=/m/0449p
Jaguar Band: https://www.google.de/search?kgmid=/m/01r3qt2
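Appending the ID by hand quickly becomes tedious for longer lists, so the step can be wrapped in a small helper. This is only a sketch: the function name `entity_url` is made up, and it assumes that IDs copied from the API output may carry a `kg:` prefix (as in `kg:/m/0449p`), which has to be removed first.

```python
def entity_url(entity_id, base="https://www.google.de/search?kgmid="):
    """Build the search URL for an entity ID such as '/m/0449p' or 'kg:/m/0449p'."""
    # Strip the 'kg:' prefix if the ID was copied straight from the JSON output.
    mid = entity_id[3:] if entity_id.startswith("kg:") else entity_id
    return base + mid


animal_url = entity_url("kg:/m/0449p")
band_url = entity_url("/m/01r3qt2")
```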

A closer look at the data also shows that, for the query "jaguar", the automobile manufacturer Jaguar Cars appears first (resultScore = 6677), while the entity of the jaguar as an animal follows further down (resultScore = 4569). How exactly Google calculates this score is not entirely clear. But we can assume that, for the search term "jaguar", more users expect information about the car than about the animal.
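Reading the IDs and scores out of the raw JSON by eye is error-prone, so here is a short sketch of how they could be extracted and sorted. The `sample_response` below is a hand-written, heavily abridged stand-in modeled on the API's response layout (`itemListElement`, `result`, `resultScore`). Only the two scores and the IDs are taken from this article; the names and descriptions are illustrative and not real API output.

```python
# Abridged, hand-written sample in the shape of an API response.
sample_response = {
    "@type": "ItemList",
    "itemListElement": [
        {
            "@type": "EntitySearchResult",
            "result": {"@id": "kg:/m/0449p", "name": "Jaguar", "description": "Animal"},
            "resultScore": 4569,
        },
        {
            "@type": "EntitySearchResult",
            "result": {"@id": "kg:/m/012x34", "name": "Jaguar Cars", "description": "Car manufacturer"},
            "resultScore": 6677,
        },
    ],
}


def ranked_entities(response):
    """Return (id, name, score) tuples, highest resultScore first."""
    rows = [
        (item["result"]["@id"], item["result"].get("name", ""), item["resultScore"])
        for item in response.get("itemListElement", [])
    ]
    return sorted(rows, key=lambda row: row[2], reverse=True)


ranking = ranked_entities(sample_response)
```

With a real response in place of the sample, the first tuple in `ranking` is the entity with the highest result score for the query.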

At the same time, there is no guarantee that the entry with the highest result score will appear as the knowledge panel in the search results. This applies, for example, to the German term "essen" (to eat; as a noun, food). Although the entity for food has the highest result score in the API output, the search results nevertheless show the knowledge panel for the city of Essen.

Apparently, local results are preferred when displaying the knowledge panel. The search result for "essen" at www.google.at also shows this: In Austria, the German city of Essen has less relevance, which is why Google lets the user decide which entity to display.

Figure 10: JSON output of the Knowledge Graph Search API for the term "essen".

Figure 11: German search results and knowledge panel for the term "essen".

Figure 12: Austrian search results and knowledge panel selection for the term "essen".

Do not confuse: Knowledge Panel vs. Google My Business

At this point, we should clarify one more important distinction. There is a risk of confusing the knowledge panel with the Google My Business panel. Let's take "Dept" as an example.

The knowledge panel of Dept is currently only displayed in Germany for the specific search query "dept agency", but not for the term "dept" alone. However, if the search "dept" takes place in Berlin, the locally relevant Google My Business entry is prominently placed as a panel. The difference is then also clearly visible in the Berlin search results for "dept agency". Here, both the knowledge panel and the Google My Business entry appear.

Figure 13: Search results with location Munich for the term "dept".

Figure 14: Search results with location Berlin for the term "dept".

Figure 15: Search results with location Munich for the term "dept agency".

Figure 16: Search results with location Berlin for the term "dept agency".

The knowledge panel can often be recognized by its own built-in link icon. By clicking this, you can quickly share the content and will be redirected to the entity URL already mentioned above.

Relevance of entities for our daily SEO routine

Entities are difficult to grasp, partly because the topic is rarely a priority in everyday SEO. Nevertheless, it's clear that Google is on its way to becoming an answer engine and entities are a basic building block for our future search experience. Therefore, entities are already helping us understand how Google thinks and why search results look the way they do.

Keyword research with entities

At the same time, we can also incorporate this information into everyday keyword research. In particular, we should consider whether targeting makes sense if there is a strong entity behind the keyword. Google's goal in such cases is to provide the most important information directly in the search results: that is, without requiring the user to click on a specific website.

To illustrate this, we can take an example from sports, using the search term "jaguars" (note the plural). In fact, this is assigned to the entity of the Jacksonville Jaguars football team: https://www.google.de/search?kgmid=/m/043vc

Figure 17: German search results for the term "jaguars" and the entity Jacksonville Jaguars.

The user gets important facts about the team, the last game results, the current squad and the ranking in the standings directly in the knowledge panel and an additional info box. Among the "normal" ten blue links are sports websites with the same information, but they are very unlikely to be clicked on.

By the way: the screenshot was taken before the Super Bowl on February 8, 2021 (i.e. before the end of the season). At that time, the current scores had a special relevance. Since the season ended, the prominent info box is no longer displayed.

Brand building with entities

Every brand and company is likely to have an interest in building its own entity. For many major brands, this has already been accomplished and a knowledge panel appears in the search results.

Good to know: as a person or brand owner, you can also lay claim to your own knowledge panel and edit the information yourself. Similar to Google My Business, this requires a short verification, for example via Google Search Console or a linked social media account.

Figure 18: Knowledge panel of Jaguar Cars with claim button.

If a knowledge panel is not yet displayed for a brand, entries in Wikipedia and Wikidata could help, provided of course that they are accepted by the moderators. It should be noted that entities should have social relevance from Google's perspective. Only if this criterion is met is there a chance of a separate knowledge panel in the search results. A good way to get there is through the well-known measures for building a brand, such as PR work or publications. These underpin your own expertise and authority. So the same applies here: a strong brand is the key to success.

Published on May 10, 2021 by Johanna Maier

Johanna Maier

Johanna Maier is a Junior SEO Consultant at the internationally up-and-coming digital agency Dept. During her studies, she came into contact with online marketing through internships and found her home with the SEO traineeship at Dept. Her fascination: the interdisciplinary nature of search engine optimization and the interplay of technology, creativity, and data.
