Keyword research is one of the most popular SEO tasks, and also the one most often handed over to new starters.
Because the gist of keyword research is easy for anyone to grasp, junior SEOs take on the honours, and the underlying burdens, of a task that most experienced SEOs wouldn't really look forward to undertaking.
Whether you are a new starter in the industry or a bit of an expert, you may be aware that there are hundreds of tools for doing decent keyword research. As a beginner especially, you can find yourself spending plenty of time figuring out the tips and tricks behind expensive third-party SEO tools.
What if there were a method to automate keyword research that would save both time and tons of money?
In this post, I am going to take you through a handy process to ease your keyword research fatigue with an automated Python framework.
The script is designed to scrape Google Autosuggest in bulk and deliver unlimited search queries, expanded with alphabet suffixes and a group of search-intent modifiers of your choice.
Requirements & Assumptions
To kick-start the following Python for SEO framework, there are a few things you should be aware of:
- Run the script on Google Colab so the workload runs in the cloud rather than overloading your own CPU, as might happen with local notebooks.
- Avoid scraping Google Autosuggest repeatedly within a single keyword research session, to prevent your IP address from being blocked by Google's firewall (a throttled variant is sketched later in this post).
Install and Import the Packages
To set up the framework you will need to install and import a few Python libraries.
First, you will need to install requests-html, a Python library that enables you to fetch and parse HTML from web pages.
!pip install requests_html
💡 Do not forget to prepend an exclamation mark to pip when running the command in a notebook.
Next, you need to import a handful of Python packages to set up the environment.
import requests
import numpy as np
import urllib
import json
import operator
import pandas as pd
from requests_html import HTML
from requests_html import HTMLSession
from urllib.parse import (parse_qsl, urlsplit)
Among the packages listed above, two of them do most of the heavy lifting in the framework.
urllib is the Python library that will help us URL-encode the search queries before submitting them to the targeted endpoint.
pandas is the Python library that we are going to leverage to ultimately shape the output into a user-friendly data frame.
Connecting to Google Autosuggest
You can now set up the technical foundations of the scraping. To do so, you need to wrap the requests-html session in a def function.
def get_source(url):
    try:
        session = HTMLSession()
        response = session.get(url)
        return response
    except requests.exceptions.RequestException as e:
        print(e)
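As a quick sanity check (assuming you are online and the endpoint is reachable), you can call the function directly and confirm it returns a valid response:

response = get_source("https://suggestqueries.google.com/complete/search?output=chrome&hl=en&q=seo")
print(response.status_code)  # 200 means the endpoint answered successfully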
This ensures any failed request is caught and printed rather than crashing the run. Next, we URL-encode the query with urllib and call the Google Autosuggest endpoint so you will be able to submit a search query.
def get_results(query):
    query = urllib.parse.quote_plus(query)
    response = get_source("https://suggestqueries.google.com/complete/search?output=chrome&hl=en&q=" + query)
    results = json.loads(response.text)
    return results
Define your Search Query
Next, you can finally populate the hand-made search bar with a query of your choice.
search_term = "Type your query"
results = get_results(search_term)
results
💡 Find out how Google may interpret your search query.
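At the time of writing, the chrome-style Autosuggest response is a JSON array: index 1 holds the suggested terms and index 4 holds metadata such as the relevance scores. An annotated sketch of its rough shape (values are illustrative):

# ['type your query',                        # results[0]: the query echoed back
#  ['type your query example', ...],         # results[1]: the suggested terms
#  ['', '', ...],                            # results[2]: descriptions (usually empty)
#  [],                                       # results[3]: usually empty
#  {'google:suggestrelevance': [601, ...],   # results[4]: metadata used below
#   'google:suggesttype': ['QUERY', ...]}]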
Formatting the Results
Once the machine has processed the bulk Autosuggest scraping and parsing, we are going to format the output so that it reads as clearly as possible.
To spice up the framework, we add a variable named 'relevance', taken from the google:suggestrelevance scores returned alongside each scraped term. Relevance is Google's own automated estimate of how strongly a given suggestion relates to the submitted query.
def format_results(results):
    suggestions = []
    for index, value in enumerate(results[1]):
        suggestion = {'term': value, 'relevance': results[4]['google:suggestrelevance'][index]}
        suggestions.append(suggestion)
    return suggestions
formatted_results = format_results(results)
formatted_results
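The result is a plain list of dictionaries, ready to be ranked and exported. With a real query, it looks roughly like this (terms and scores are illustrative):

# [{'term': 'python seo tools', 'relevance': 1250},
#  {'term': 'python seo course', 'relevance': 601},
#  ...]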
Adding Suffixes and Keyword Modifiers
Let's fill up the model with a few toppings to cook up the final output.
To make sure you don't miss out on any search query combination from Google Autosuggest for a given query, we are going to add a pack of suffixes covering all the letters of the alphabet.
def get_expanded_term_suffixes():
    expanded_term_suffixes = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z']
    return expanded_term_suffixes
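If you would rather not type the alphabet by hand, the standard library offers an equivalent shortcut, a stylistic alternative rather than part of the original script:

import string

def get_expanded_term_suffixes():
    # Same output as the hand-typed list above
    return list(string.ascii_lowercase)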
Next, we pack the settings with a bunch of keyword modifiers that act as hints of search intent.
def get_expanded_term_prefixes():
    # Note: each modifier needs its own comma-separated entry, otherwise
    # adjacent strings are silently concatenated by Python.
    expanded_term_prefixes = ['what *', 'where *', 'how to *', 'why *', 'buy*', 'how much*', 'best *', 'worst *', 'rent*', 'sale*', 'offer*', 'vs*', 'or*']
    return expanded_term_prefixes
You can toy around with the keyword modifiers at your own convenience depending on the funnel stage where your target content sits.
Expand the Search
At this point, you need to simmer the previous prompts so they merge together.
To do so, we need to call a def function to expand the terms.
def get_expanded_terms(query):
    expanded_term_prefixes = get_expanded_term_prefixes()
    expanded_term_suffixes = get_expanded_term_suffixes()
    terms = []
    terms.append(query)
    for term in expanded_term_prefixes:
        terms.append(term + ' ' + query)
    for term in expanded_term_suffixes:
        terms.append(query + ' ' + term)
    return terms
get_expanded_terms(search_term)
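For instance, assuming search_term were "python seo", the expanded list would start with the seed query, followed by every prefixed and then every suffixed combination (output illustrative):

# ['python seo',
#  'what * python seo', 'where * python seo', ...,
#  'python seo a', 'python seo b', ..., 'python seo z']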
And another one to expand search suggestions.
def get_expanded_suggestions(query):
    all_results = []
    expanded_terms = get_expanded_terms(query)
    for term in expanded_terms:
        results = get_results(term)
        results = format_results(results)
        all_results = all_results + results
    all_results = sorted(all_results, key=lambda k: k['relevance'], reverse=True)
    return all_results
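As flagged in the requirements, this loop fires a few dozen requests in quick succession. A minimal way to be gentler on Google's firewall is to pause between calls; the throttled variant below is a hypothetical tweak using the standard-library time module, not part of the original script:

import time

def get_expanded_suggestions_throttled(query, delay=1.0):
    # Same logic as get_expanded_suggestions, with a pause between
    # requests to lower the risk of being rate-limited.
    all_results = []
    for term in get_expanded_terms(query):
        all_results += format_results(get_results(term))
        time.sleep(delay)  # wait `delay` seconds before the next call
    return sorted(all_results, key=lambda k: k['relevance'], reverse=True)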
Final Output
All you have to do now is execute an additional chunk of code to print the output.
But first, let's create a data frame and rename its columns by leveraging the Pandas library, so you can easily download the full keyword report as a CSV file.
expanded_results = get_expanded_suggestions(search_term)
expanded_results_df = pd.DataFrame(expanded_results)
expanded_results_df.columns = ['Keywords', 'Relevance']
expanded_results_df.to_csv('keywords.csv')
expanded_results_df
This is roughly what you might get:
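Since the same suggestion often comes back for several expanded terms, you may also want to deduplicate the report before sharing it, a one-line pandas sketch:

expanded_results_df = expanded_results_df.drop_duplicates(subset='Keywords')
expanded_results_df.to_csv('keywords.csv')  # re-export the deduplicated report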
Adjust the Layout Style of the Dataframe
This is entirely optional but worth a try, given the mess coming from the above raw output.
First, you will need to paste the file path of the saved data frame and then reload the Pandas data frame with a few CSS styling rules.
expanded_results_df = pd.read_csv('/content/keywords.csv')
selection = ['Keywords', 'Relevance']
df = expanded_results_df[selection]
table_styles = [
    {'selector': 'th',
     'props': [('background', '#7CAE00'),
               ('color', 'white'),
               ('font-family', 'verdana')]},
    {'selector': 'td',
     'props': [('font-family', 'verdana')]},
    {'selector': 'tr:nth-of-type(odd)',
     'props': [('background', '#DCDCDC')]},
    {'selector': 'tr:nth-of-type(even)',
     'props': [('background', 'white')]},
]
df.head(20).style.set_table_styles(table_styles).hide_index()
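💡 On recent pandas releases (2.0 and later), Styler.hide_index() has been removed; the last line above becomes:

df.head(20).style.set_table_styles(table_styles).hide(axis='index')  # pandas 2.0+ equivalent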
You should now get something like this.
Conclusion
You can now tap into a virtually unlimited deck of search queries to make a good impression on your boss and to investigate further for your SEO content purposes.
As with everything, the devil is in the details, and in SEO the challenge is to spot the low-hanging fruit surfaced by automated models.
However, do take these automation models with a grain of salt, as they are usually affected by a number of unpredictable outliers which may ultimately harm your SEO decision-making.
Further Readings
This post came to light after trawling through several tutorials about web scraping for SEO purposes. One of the most meaningful and inspiring is credited to Matt Clarke and his post How to identify SEO keywords using Google Autocomplete.
If you need a further reference, please check the original walkthrough and see whether you can find in-depth gems for tweaking this framework to suit your needs.