The SEO industry is so littered with myths that sometimes a little learning could be a dangerous thing.
Google recently tested a feature called Pros & Cons right below the meta description’s callout on the SERP.
Despite looking very similar to rich results, these features are a fine example of Google parsing plain text from the copy on a page. In other words, this is one of the earliest cases of unstructured data retrieved by the search engine being showcased alongside structured data on the SERP.
🤖Google New Pros & Cons are not #RichResults but #Annotations.
The search engine can extract the most positive and negative n-grams from a page.
Yet this is not the only one as it all boils down to a 2021 patent.
More in this thread🧵
Simone De Palma 🦊 (@SimoneDePalma2), July 1, 2022
So SERP #Annotations have been “sold” to structured data manual mark up.
While the search engine #ml algorithms try to automatically scrape pros and cons, you are now enabled to explicitly provide this information via schema markup.
A few considerations🧵 https://t.co/xwldr3QssM
Simone De Palma 🦊 (@SimoneDePalma2), August 5, 2022
Before going too in-depth, let’s first learn what these so-called “Annotations” are, along with their main features and the mechanics behind their organic generation on the search results page.
If you want to learn more about Annotations, I recommend you dig into the blog post by Marie Haynes on how to use annotations to create better content.
What are Annotations?
In a nutshell, annotations are HTML strings retrieved from unstructured data sources, that is, plain text from the copy of a page such as a product page or a landing page.
How are Annotations being generated?
Annotations are extracted automatically from multiple text-based sources on a webpage.
Because nobody curates them by hand, you always want to take them with a grain of salt.
In short, the query engine detects an annotation from a page and determines whether to process it in real-time or store it in the search index. Next, a supervised machine learning model scores the annotations by type and ultimately ranks them by usefulness.
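To make the scoring and ranking step more concrete, here is a minimal Python sketch. It is not Google's implementation: the type weights, the `relevance` field and the sample candidates are all made-up assumptions used only to show the idea of scoring annotations by type and then ordering them by usefulness.

```python
from dataclasses import dataclass

# Hypothetical per-type weights; the patent does not publish real values.
TYPE_WEIGHTS = {
    "editorial_review": 1.0,
    "list_includes": 0.8,
    "media": 0.6,
    "version_change": 0.4,
}


@dataclass
class Annotation:
    text: str
    kind: str         # one of the keys in TYPE_WEIGHTS
    relevance: float  # stand-in for a model's usefulness estimate, 0..1


def score(annotation: Annotation) -> float:
    """Combine the type weight with the usefulness estimate."""
    return TYPE_WEIGHTS.get(annotation.kind, 0.0) * annotation.relevance


def rank(annotations: list[Annotation]) -> list[Annotation]:
    """Return annotations sorted from most to least useful."""
    return sorted(annotations, key=score, reverse=True)


# Made-up candidates, for illustration only.
candidates = [
    Annotation("smooth and tasty bourbon", "editorial_review", 0.9),
    Annotation("includes gift box", "list_includes", 0.7),
    Annotation("updated label design", "version_change", 0.95),
]

for a in rank(candidates):
    print(f"{score(a):.2f}  {a.kind:<17} {a.text}")
```

In this toy setup the type weight can outweigh a high usefulness estimate, which is one possible reading of "scores by type, then ranks by usefulness".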
To expand a bit on the context, let’s consider the previous example about Larceny Bourbon.
Recent progress in NLP machine learning models has enabled Google to parse unstructured data, that is, human-readable text. As the screen grab below shows, the words highlighted in yellow are those transferred to the SERP as annotations.
This boils down to progress in advanced cluster analysis tasks carried out on entities and, primarily, n-grams. In fact, the screenshot makes it clear that these annotations stem from sequences of words (n-grams) that complete the definition of specific entities.
Let’s take one specific source of annotations, “smooth and tasty bourbon”.
First, let’s tokenize the n-gram:
bourbon = entity
smooth = connector type
tasty = connector type
and = conjunction
Now we can make an assumption about how Google treats the n-gram as entities.
“tasty bourbon” = Entity
“smooth bourbon” = Entity
In brief, the web page emphasizes the root entity “bourbon” with the reinforcement provided by the connectors “tasty” and “smooth”.
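The breakdown above can be sketched in a few lines of Python. This is only an illustration of the assumption, not how Google tokenizes text: the tiny `ENTITIES` and `CONJUNCTIONS` lexicons are hand-written stand-ins for a real entity recognizer, and every remaining token is simply treated as a connector.

```python
# Hand-written lexicons standing in for a real entity recognizer (assumption).
ENTITIES = {"bourbon"}
CONJUNCTIONS = {"and"}


def split_ngram(ngram: str) -> dict[str, list[str]]:
    """Label each token as entity, conjunction, or connector (modifier)."""
    labels = {"entity": [], "connector": [], "conjunction": []}
    for token in ngram.lower().split():
        if token in ENTITIES:
            labels["entity"].append(token)
        elif token in CONJUNCTIONS:
            labels["conjunction"].append(token)
        else:
            labels["connector"].append(token)
    return labels


def derived_entities(ngram: str) -> list[str]:
    """Pair every connector with every entity, e.g. 'smooth bourbon'."""
    labels = split_ngram(ngram)
    return [f"{c} {e}" for e in labels["entity"] for c in labels["connector"]]


print(split_ngram("smooth and tasty bourbon"))
print(derived_entities("smooth and tasty bourbon"))
```

Pairing each connector with the root entity is what yields “smooth bourbon” and “tasty bourbon” as candidate entities in their own right.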
Google is reinforcing relationships among entities on the Internet following advancements in unsupervised cluster analysis tasks run by machine learning models.
You can perform advanced cluster analysis yourself using Python and a few machine learning models. Learn how to run a semantic market analysis to inform your SEO strategy with this tutorial.
Annotations Main Features
These HTML strings come with a few notable traits that could reveal room for action or improvement in a page’s content.
In layman’s terms, annotations play a twofold role that benefits both the search engine and the public. While they help users by satisfying shallow search intent, they also give Google hints on how to improve its entity network.
Example of Annotations
Although the SEO industry has mostly come across this feature in the form of Pros & Cons excerpts, annotations actually come in different flavours.
To stick with the findings provided by the patent, annotations can take the form of List Includes, Version Change, Media Annotations or Editorial Reviews (see image below).
In the words of the patent, the above example of editorial reviews is:
“An annotation including a snippet of a user review that mentions running in conjunction with headphones. That users mention running in reviews for a product may result in a higher ranking for the particular product.”
However, there are a few other examples flying under the radar that Google doesn’t officially recognize. In fact, the patent may have missed out on HTML table snippets, perhaps due to an oversight or, more likely, simply because of the time that has passed since the patent’s release.
How to Optimize for Annotations
If you are looking for a handy shortcut to get these attractive short HTML strings displayed for your pages on the search results page, Google recently added dedicated schema markup and published a few guidelines about it.
The guidance to optimize for annotations points out:
- Annotations were first made eligible to appear in Search only for editorial product review pages, but on October 25th software engineers at Google agreed to make the Pros & Cons properties available to online stores as well, so that they can display product highlights.
- Make sure to include a few statements about the product (positive and negative) and enclose them into the
- Ensure the Pros&Cons are visible to users on the page where the information sits.
Here is a sample of code that you need to use to include the Pros&Cons structured data on your editorial product review page.
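As a rough illustration, the Python sketch below builds JSON-LD of this kind and wraps it in a script tag. The property names `positiveNotes` and `negativeNotes`, each holding an `ItemList` of `ListItem` entries, come from the schema.org vocabulary. The helper functions, product name, author and statements are hypothetical placeholders, so check Google's own documentation before shipping anything based on it.

```python
import json


def notes(items: list[str]) -> dict:
    """Wrap plain statements in a schema.org ItemList of ListItem entries."""
    return {
        "@type": "ItemList",
        "itemListElement": [
            {"@type": "ListItem", "position": i, "name": text}
            for i, text in enumerate(items, start=1)
        ],
    }


def pros_cons_jsonld(product: str, author: str,
                     pros: list[str], cons: list[str]) -> str:
    """Serialize a Product with one Review carrying pros and cons."""
    data = {
        "@context": "https://schema.org",
        "@type": "Product",
        "name": product,
        "review": {
            "@type": "Review",
            "author": {"@type": "Person", "name": author},
            "positiveNotes": notes(pros),
            "negativeNotes": notes(cons),
        },
    }
    return json.dumps(data, indent=2)


# Placeholder values, for illustration only.
markup = pros_cons_jsonld(
    product="Example Bourbon",
    author="Jane Doe",
    pros=["Smooth and tasty", "Good value"],
    cons=["Hard to find"],
)
print(f'<script type="application/ld+json">\n{markup}\n</script>')
```

Generating the markup from the same lists that render the visible pros and cons on the page is one way to keep the structured data and the on-page content in sync.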
According to Google, wrapping Pros & Cons in structured data is not mandatory, as the search engine will strive to surface this information on the SERP automatically where relevant.
However, the search engine prioritizes structured data over unstructured data extracted from ordinary page copy.
In the words of Google:
The search engine will prioritize supplied structured data provided by you over automatically extracted data
Going back to the Pros & Cons SERP feature, you can reasonably infer that it surfaces HTML strings retrieved from pages with multiple reviews available, likely clustered by n-grams.
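To give a feel for what grouping reviews by n-grams could look like, here is a deliberately crude Python sketch. It does not perform real clustering and is not Google's method: it only counts which two-word sequences (bigrams) recur across several made-up reviews, with the `min_reviews` threshold being an arbitrary assumption.

```python
from collections import Counter


def bigrams(text: str) -> set[str]:
    """Return the set of lowercase two-word sequences in a text."""
    words = [w.strip(".,!?").lower() for w in text.split()]
    words = [w for w in words if w]
    return {" ".join(pair) for pair in zip(words, words[1:])}


def shared_bigrams(reviews: list[str], min_reviews: int = 2) -> list[tuple[str, int]]:
    """List bigrams that appear in at least `min_reviews` different reviews."""
    counts = Counter()
    for review in reviews:
        counts.update(bigrams(review))  # a set, so each review counts once
    shared = [(gram, n) for gram, n in counts.items() if n >= min_reviews]
    # Most frequent first, alphabetical order as a tie-breaker.
    return sorted(shared, key=lambda item: (-item[1], item[0]))


# Made-up reviews, for illustration only.
reviews = [
    "A smooth bourbon with a sweet finish.",
    "Really smooth bourbon, great price.",
    "Sweet finish but a bit thin.",
]
print(shared_bigrams(reviews))
```

Phrases that several reviewers use independently are plausible candidates for a pro or a con, which is the intuition behind the inference above.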
In practice, Google is surely making progress in parsing unstructured data or plain text on a page, but it feels like there is still a long way to go.
The current machine learning models powering the core of the search engine still fall short on highly automated information retrieval tasks.
To put it plainly, Google’s bid to mark up Annotations with structured data is a strategy for feeding the search algorithms with product review data so that one day they will be up to the task of automated information retrieval.