How to Measure the Impact of Meta Descriptions on Organic Rankings


SERP features are shaped by the search intent behind a search term. Search engines generate thousands of snippets from HTML elements, including the descriptive information a page provides.

As a result, meta tags such as title links and meta descriptions, when wrapped in valid HTML markup, can be used to pull rich results on the search engine results page (SERP).

Although snippets are generated automatically, site owners have two main ways to pitch their content for a rich snippet in their vertical:

  • Adding structured data with a schema type such as Pros & Cons, which establishes a semantic connection with Google provided the information is understood.
  • Optimizing the <meta> description. This is the main meta tag leveraged to pull rich results, as long as it matches the search intent.
    When it doesn’t, Google’s machine learning may instead pull a blend of semi-structured and unstructured information from the web page content.

    In fact, this is the common thread behind a recent study reporting that Google rewrites meta descriptions 70% of the time.

What Is A Meta Description?

The meta description is an HTML element that provides search engines and searchers a summary of what a webpage is about.

Raw excerpt of a meta description
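To make the raw excerpt concrete, here is a minimal sketch that reads the meta description out of a page's HTML using only Python's standard library. The sample_html string and the MetaDescriptionParser class are hypothetical and serve as an illustration only; they are not part of the framework used later in this post.

```python
from html.parser import HTMLParser


class MetaDescriptionParser(HTMLParser):
    """Collects the content of the first <meta name="description"> tag."""

    def __init__(self):
        super().__init__()
        self.description = None

    def handle_starttag(self, tag, attrs):
        # Ignore every tag except the first matching <meta> element
        if tag != "meta" or self.description is not None:
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() == "description":
            self.description = attributes.get("content")


# Hypothetical page source, used only for illustration
sample_html = """
<html>
  <head>
    <title>Why Is Coffee Brown?</title>
    <meta name="description" content="Coffee turns brown during roasting.">
  </head>
  <body></body>
</html>
"""

parser = MetaDescriptionParser()
parser.feed(sample_html)
print(parser.description)
```

The value of the content attribute is what a search engine may choose to show as the snippet.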

In the words of Google:

A meta description tag informs and interests users with a short, relevant summary of what a particular page is about.

It is displayed on the search results pages right below the title of the page.

SERP meta description, title link and URL

Although meta descriptions are not a confirmed ranking factor, they still matter for enticing clicks and improving CTR which, in turn, may translate into a boost in organic traffic.

However, neither clicks nor CTR is the ideal SEO KPI for measuring overall organic performance, let alone SEO ROI. CTR has little impact on your site’s organic conversions because it belongs to the very first phases of the MOFU (middle-of-the-funnel) stage of the marketing funnel.

In other words, CTR is a weak KPI to rely on when assessing business-oriented objectives.

Widespread click-baiting techniques undermine the reliability of clicks and CTR. This means optimizing meta descriptions in the hope of a site-wide improvement is unlikely to be a good SMART objective.

On the other hand, this on-page SEO activity can prove beneficial at page level. Boosting organic performance and rankings involves improving the clicks and CTR of a single page or a set of page templates (e.g. /blog/).

The best way to do that is to write compelling meta tags that capture the searcher’s attention straight away.

According to the recent Zero-Click study from SEMrush, 45% of American users took less than 5 seconds to click on their desired result in the SERPs. If you want to influence CTR at page level, this underlines the need to write attention-grabbing page titles and descriptions.

This is especially relevant if you are an SEO working on the content side. How many times have you wondered how to translate your optimization efforts into traffic upticks and ranking improvements?

Perhaps your real question is more down to earth:

How can I influence organic rankings with my meta descriptions for a given search query?

There are many angles to look at, and this post provides a method to study the correlation strength between meta descriptions and search queries.

💡BONUS

Title links play a role in influencing CTR. Learn how to automate title tag rewriting with this Python framework.


What this framework does

The following Python framework compares a corpus of text (meta descriptions) to the top search queries retrieved from the Google Autosuggest database, which you can comfortably scrape using this Python framework.

The goal is to find the meta descriptions most similar to a specific search query, so that we can report on the role meta descriptions play in matching the search intent behind a query.

TL;DR
🦊 Run a similarity benchmark between Queries vs Meta Descriptions 
🦊 Find which competitor satisfies the search intent with the best Meta Descriptions
🦊 Improve the writing of your Meta Descriptions to influence CTR and overcome competitors

How Does it Work?

Before kicking off with the following framework, there are a few considerations to make that will give you an idea of how this model works.

  • Scraping top search queries with ecommercetools, a library that retrieves the top search queries by relevance from the Google Autosuggest database. The search term I will investigate is “is coffee brown?”

  • Using SerpApi to scrape a list of meta descriptions for a given term. I highly suggest using this API, as it spares you from scraping the SERPs directly and potentially having your IP address blocked by Google.

  • Preprocessing the data from both data frames before computing the meta description similarity.

  • Firing up the Sentence Transformers model and running a quick correlation analysis.

  • Merging the top search query and the scraped organic results into a single data frame.

💡BONUS💡
SerpApi is a freemium service: you get up to 100 searches per month, and then you can decide whether to upgrade your credits.

Install and Import Dependencies

To start off the model, we need to install some dependencies.

Library | Description
sentence_transformers | a Python framework for state-of-the-art sentence, text and image embeddings provided by Hugging Face
ecommercetools | a data science toolkit for those working in technical ecommerce, marketing science, and technical SEO
google-search-results | the official SerpApi library that allows you to scrape results from the Google search engine

Next, we import the libraries required for each task.

Their purpose is noted after the # in the script below.

!pip install sentence_transformers
!pip install ecommercetools
!pip install google-search-results

from sentence_transformers import SentenceTransformer, util
import torch

#scraping
import json
import urllib.parse
from urllib.parse import parse_qsl, urlsplit
import requests
from serpapi import GoogleSearch


#data manipulation
import pandas as pd
import numpy as np


#libraries for preprocessing tasks
from gensim.parsing.preprocessing import remove_stopwords
import string
from nltk.stem.snowball import SnowballStemmer
from nltk.tokenize import word_tokenize


#Download once if using NLTK for preprocessing
import nltk
nltk.download('punkt')

Scrape Top Search Queries

Once the stack is set up, we can start fetching the top search queries for a term of our choice.

I chose to investigate the search term “is coffee brown?” and pass it to ecommercetools as follows.

from ecommercetools import seo
import pandas as pd

suggestions = seo.google_autocomplete('is coffee brown?', include_expanded=True)
queries = pd.DataFrame(suggestions)

#data cleaning
results = queries.drop_duplicates('term')
#saved under its own file name so the SERP export in the next step does not overwrite it
results.to_csv('top_queries.csv', index=False)
results.head(10)

Ecommercetools search query scraping for: is coffee brown?

This data frame represents the first milestone of our journey. Keep an eye on these terms, as we’re going to use them later for the similarity calculations with Sentence Transformers.

Collect Meta Descriptions

This round is all about fetching the organic search results in order to extract the actual meta descriptions.

Again, the search targets the term chosen for the analysis.

from serpapi import GoogleSearch

serp_apikey = '####'  # replace with your own SerpApi key

params = {
    "engine": "google",
    "q": "is coffee brown?",
    "location": "United Kingdom",
    "google_domain": "google.com",
    "gl": "uk",
    "hl": "en",
    "num": 10,
    "api_key": serp_apikey
}

client = GoogleSearch(params)
data = client.get_dict()

# access "organic results"
df = pd.DataFrame(data['organic_results'])
df.to_csv('results.csv', index=False)

Once SerpApi has returned the organic results for our search query, we need to perform a bit of data cleaning before displaying the output.

#load the SERP export saved in the previous step
Data = pd.read_csv('results.csv')

#remove special characters from values
#regex=True is required from pandas 2.0, where patterns are matched literally by default
Data['about_this_result'] = Data['about_this_result'].str.replace(r".?\{'source':|\}|\{'description':|'|\[|\]", "", regex=True)
Data['snippet_highlighted_words'] = Data['snippet_highlighted_words'].str.replace(r".?'|\[|\]", "", regex=True)

#keep and rename only the columns needed for the analysis
cols = {'title':'Title','link':'Link','snippet':'Meta Description','snippet_highlighted_words':'Highlighted Result','about_this_result':'About this Result'}
Data = Data[list(cols)].rename(columns=cols)

#fill NaN value 
Data = Data.fillna(0)
Data.to_csv('organic_results.csv',index=False)
Data

Scraped search results for the query: is coffee brown?
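As an alternative to stripping brackets and quotes with regular expressions, the stringified lists and dictionaries that come back from the CSV can be parsed into real Python objects. The sketch below is only an illustration: parse_cell is a hypothetical helper, and the two cell values are made-up examples shaped like the columns above.

```python
import ast


def parse_cell(cell):
    """Turn a stringified list/dict from a CSV cell back into a Python object.

    Returns the input unchanged when it is not a string or cannot be parsed.
    """
    if not isinstance(cell, str):
        return cell
    try:
        return ast.literal_eval(cell)
    except (ValueError, SyntaxError):
        return cell


# Hypothetical cell values shaped like the stringified columns above
highlighted = parse_cell("['coffee', 'brown']")
about = parse_cell("{'source': {'description': 'An example source'}}")

print(", ".join(highlighted))
print(about["source"]["description"])
```

With pandas, a helper like this could be applied to a whole column through Series.map before the column is written to the final CSV.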

💡BONUS

Other than scraping the SERPs, you can leverage the Serp API to detect title tag rewriting.

Next, we make sure to keep only the Meta Description column.

This step is required because we will compare this new data frame against the top search queries with Sentence Transformers.

SERP_One = pd.read_csv('organic_results.csv')
Meta_description = pd.DataFrame(SERP_One, columns=['Meta Description'])
Meta_description.to_csv('meta.csv', index=False)
Meta_description

Data Preprocessing

This is the stage of the process where we normalize the text to reduce noise such as punctuation and inconsistent casing. For this purpose, we’ll use Python’s regular expression library (re) to clean up the meta descriptions.

# Load the regular expression library
import re

# Remove punctuation (str() guards against non-text cells, such as filled NaN values)
Meta_description['Meta_Description_processed'] = \
Meta_description['Meta Description'].map(lambda x: re.sub(r'[,.!?]', '', str(x)))

# Convert the meta descriptions to lowercase
Meta_description['Meta_Description_processed'] = \
Meta_description['Meta_Description_processed'].map(lambda x: x.lower())

# Print out the first rows 
Meta_description['Meta_Description_processed'].head()
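If you want a slightly broader clean-up, the same idea can be wrapped in a small function. This is a sketch under my own assumptions rather than the step used above: normalize is a hypothetical helper that removes every ASCII punctuation character (not just commas, periods, exclamation and question marks) and also collapses repeated whitespace.

```python
import string

# Translation table that deletes every ASCII punctuation character
_PUNCTUATION_TABLE = str.maketrans("", "", string.punctuation)


def normalize(text):
    """Lowercase, strip punctuation and collapse repeated whitespace."""
    cleaned = str(text).lower().translate(_PUNCTUATION_TABLE)
    return " ".join(cleaned.split())


print(normalize("Coffee Brown has the hex code #8A624A.  It's a warm shade!"))
```

Bear in mind that sentence embedding models are fairly tolerant of punctuation and casing, so heavier preprocessing is optional here.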

If you want a visual summary of the preprocessed data, you can put together a word cloud highlighting the most frequent terms in the collected meta descriptions.

# Import the wordcloud library
from wordcloud import WordCloud

# Join the processed meta descriptions together.
long_string = ','.join(list(Meta_description['Meta_Description_processed'].values))

# Create a WordCloud object
wordcloud = WordCloud(background_color="white", max_words=1000, contour_width=3, contour_color='steelblue')

# Generate a word cloud
wordcloud.generate(long_string)

# Visualize the word cloud (keep this as the last line so the notebook renders the image)
wordcloud.to_image()

At first glance, the meta descriptions appear to be on the right track to answer the search intent.

Terms in the word cloud show a clear intention to expand on the properties of brown coffee and the reasons behind it.

Strike Similarity Match with Sentence Transformers

This stage is the turning point of this tutorial.

We are going to calculate a similarity score between the top search queries and the scraped meta descriptions, keeping the 5 closest meta descriptions for each query.

To do so, copy and paste the meta description texts from the Meta_description data frame we created earlier into a list called corpus.

Sentence Transformers will run a pre-trained model called all-MiniLM-L6-v2. However, feel free to test the other models available.

embedder = SentenceTransformer('all-MiniLM-L6-v2')

corpus = [
    'Coffee is a brownish color that is a representation of the color of a roasted coffee bean. Different types of coffee beans have different colors when',
    'While it is not a chemical, but yet a chemical reaction known as the maillard that gives coffee it is brown color. The process coined by the French scientist',
    'The coffee brown spectrum ranges from light beige to black, and the color of coffee is determined by the roast level of the coffee bean.',
    'The coffee beans we are used to seeing, the brown ones with a delightful flavor, are roasted. Raw coffee beans have a different color and smell',
    'Coffee brown hair color is far from dull and surprisingly versatile. The beauty of this dark brown shade is that it offers a spectrum of ',
    'Roasted coffee beans are often a dark brown with tinges of red, orange or green. Brewed coffee is perceived as black but is actually a very dark',
    'It is roast level. Darker roasts produce a darker colored looking brew but it is much less complex in terms of flavor. Those pigments that develop',
    'Coffee Brown has the hex code #8A624A. The equivalent RGB values are (138, 98, 74), which means it is composed of 45% red, 32% green and 24% blue'
]
corpus_embeddings = embedder.encode(corpus, convert_to_tensor=True)

# Query sentences:
queries = ['what is coffee brown color', 'who is coffee brown', 'what is coffee brown', 'where is brown coffee from', 'coffee can brown bread', 'why is my coffee brown not black',
           'how is coffee brown', 'best coffee brownie recipe','cheap brown coffee table','worst coffee brand']

# Find the closest 5 sentences of the corpus for each query sentence based on cosine similarity
top_k = min(5, len(corpus))

#create an empty list
query_result = list()  
for query in queries:
    query_embedding = embedder.encode(query, convert_to_tensor=True)

    cos_scores = util.cos_sim(query_embedding, corpus_embeddings)[0]
    top_results = torch.topk(cos_scores, k=top_k)

    for score, idx in zip(top_results[0], top_results[1]):
        #store each query with its matching meta description and similarity score
        query_result.append([query, corpus[idx], score])

#dump our findings into a Pandas data frame
df2 = pd.DataFrame(query_result,columns=['Query','Meta Description','Score']) 
df2.to_csv('results.csv',index=False)
df2
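To see what the scoring step above does under the hood, here is a minimal pure-Python sketch of cosine similarity followed by a top-k selection. The three-dimensional vectors are toy values I made up for illustration (all-MiniLM-L6-v2 produces 384-dimensional embeddings), and the helper names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec, corpus_vecs, k):
    """Return the k best (score, index) pairs, highest score first."""
    scored = [(cosine_similarity(query_vec, vec), idx)
              for idx, vec in enumerate(corpus_vecs)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Toy "embeddings" (hypothetical values, not real model output)
corpus_vectors = [
    [1.0, 0.0, 0.0],
    [0.6, 0.8, 0.0],
    [0.0, 0.0, 1.0],
]
query_vector = [1.0, 0.0, 0.0]

for score, idx in top_k(query_vector, corpus_vectors, k=2):
    print(idx, round(score, 4))
```

A score close to 1 means the two texts point in nearly the same direction in the embedding space, while a score close to 0 means they are largely unrelated.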

Next, we create one data frame per top search query, each holding the batch of meta descriptions matched to it.

query_1 = df2.loc[df2['Query'] == 'what is coffee brown color']
query_2 = df2.loc[df2['Query'] == 'who is coffee brown']
query_3 = df2.loc[df2['Query'] == 'what is coffee brown']
query_4 = df2.loc[df2['Query'] == 'where is brown coffee from']
query_5 = df2.loc[df2['Query'] == 'coffee can brown bread']
query_6 = df2.loc[df2['Query'] == 'why is my coffee brown not black']
query_7 = df2.loc[df2['Query'] == 'how is coffee brown']
query_8 = df2.loc[df2['Query'] == 'best coffee brownie recipe']
query_9 = df2.loc[df2['Query'] == 'cheap brown coffee table']
query_10 = df2.loc[df2['Query'] == 'worst coffee brand']
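If the list of queries changes often, writing one variable per query gets tedious. As a sketch of the same grouping idea, the snippet below collects rows into a dictionary keyed by query. The rows and scores are hypothetical stand-ins for df2, kept as plain tuples so the example runs on its own; with pandas, df2.groupby('Query') offers the equivalent grouping.

```python
from collections import defaultdict

# Hypothetical (query, meta description, score) rows standing in for df2
rows = [
    ("what is coffee brown", "Coffee is a brownish color ...", 0.71),
    ("what is coffee brown", "The coffee brown spectrum ...", 0.64),
    ("worst coffee brand", "Roasted coffee beans are often ...", 0.22),
]

# Group every (description, score) pair under its query
by_query = defaultdict(list)
for query, description, score in rows:
    by_query[query].append((description, score))

for query, matches in by_query.items():
    print(query, len(matches))
```

Each query can then be looked up by name, for example by_query["what is coffee brown"].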

Then we pick the data frame to investigate and save it.

As an example, I am going to focus on “what is coffee brown” and perform a bit of data cleaning to improve the final visualization.

query_3.to_csv('what is coffee brown.csv', index=False)
coffee_brown = pd.read_csv('what is coffee brown.csv')
#the scores were saved as text such as "tensor(...)", so extract the number and cast it to float
coffee_brown['Score'] = coffee_brown['Score'].astype(str).str.extract(r"([-+]?\d*\.?\d+(?:[eE][-+]?\d+)?)", expand=False).astype(float)

💡BONUS💡
If you’re curious about how another search query relates to its batch of meta descriptions, you can always create a data frame for another query.

Just make sure you use one of the other data frames (e.g. query_10) and rename coffee_brown after the search query you intend to explore.

Measuring the Impact of Search Queries on Meta Descriptions

The last step is to merge the results from the scraped SERP with the Sentence Transformers scores on the Meta Descriptions corpora.

After cleaning the data frame of duplicate rows, NaN values and unwanted columns (e.g. 'Highlighted Result', 'About this Result'), we’ll be able to print the final output and draw a few conclusions from the similarity analysis.

#append title links and URL to the similarity output
#merge on the meta description text so each score sits next to the result it belongs to
#(rows only match when the corpus strings are identical to the scraped meta descriptions)
dat1 = coffee_brown.merge(SERP_One, on='Meta Description', how='left')
#data cleaning and reshuffle of the merged data frame
dat1 = dat1.drop(['Highlighted Result', 'About this Result'], axis=1)
dat1 = dat1.drop_duplicates()
#renaming columns
dat1 = dat1[['Query', 'Title', 'Meta Description', 'Link', 'Score']]
cols = ['Query','Title','Meta Description','URL','Score']
dat1.columns = cols
#remove NaN value 
day = dat1.dropna()
#save the output and print the result
day.to_csv('output.csv',index=False)
day

The correlation result from search query and meta descriptions text comparison

Using “what is coffee brown” as the guinea pig for our analysis, we observe Wikipedia topping the SERP with a meta description that correlates strongly with the search intent behind the query.

At the other end, players such as byrdie.com seem to struggle to provide a meaningful meta description that satisfies the search intent.

It’s not rocket science after all. While Wikipedia provides a straightforward explanation of brown coffee, byrdie.com appears to ramble before providing a definite answer.
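If you would rather put a number on the relationship between SERP position and similarity score than eyeball the table, a rank correlation is one option. The sketch below computes Spearman's rank correlation with the standard library only. The positions and scores are hypothetical values invented for illustration, not the results of this analysis, and the helper functions are my own.

```python
import math


def ranks(values):
    """Rank values from 1 (smallest) upward; ties share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over a run of tied values
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical data: SERP positions and similarity scores for five results
positions = [1, 2, 3, 4, 5]
scores = [0.72, 0.65, 0.66, 0.40, 0.31]

print(round(spearman(positions, scores), 2))
```

Because position 1 is the best ranking, a value close to -1 would indicate that higher similarity scores go hand in hand with better positions. With only a handful of results per query, treat any such figure as indicative rather than conclusive.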

Conclusion

With the aid of Sentence Transformers, I was able to output the top 5 most similar sentences in a corpus of meta descriptions for each query and explore how they correlate.

The correlation between the search query and the meta descriptions was unsurprisingly high for the websites ranking higher on the SERP. This suggests that the effectiveness of a meta description and its match with the related search query are positively correlated.

This analysis provides a neat method to help you refine your SEO content strategy by shining a light on the correlation strength between a search query and the related meta descriptions from a semantic perspective.

This can help improve your website’s topic modeling and inform better reviews of your site’s structure.

Further Readings

Feel free to refer to a few additional in-depth resources to learn more about the topics discussed in this post and the method used to build the Python framework.

1️⃣ Shall we consider CTR as a ranking factor or just a secondary signal?

Find out more in this post from the Search Engine Journal: the biggest mystery of Google’s algorithm: Everything ever said about clicks, CTR and bounce rate

2️⃣ What is the most valuable SEO KPI worth tracking? Ahrefs has a well-grounded answer in a blog post covering 12 SEO KPIs You Should (And Shouldn’t) Track

3️⃣ Sentence Transformers machine learning model applied to semantic search
