🤓How To Automate Keyword Research with Google Autosuggest

Keyword research is one of the SEO tasks most often handed over to new starters.

Because the basics of keyword research are easy to pick up, junior SEOs tend to get the honours, along with the tedious legwork the task involves. Few experienced SEOs look forward to doing it themselves.

Whether you are new to the industry or already a bit of an expert, you probably know that hundreds of tools can help you do decent keyword research. As a beginner especially, you can spend plenty of time learning the tips and tricks of expensive third-party SEO tools.

What if there was a method to automate keyword research that would save both time and tons of money?

In this post, I am going to take you through a handy process. It will ease your keyword research fatigue with an automated Python framework.


Requirements & Assumptions

Install and Import the Packages

To set up the framework you will need to install and import a few Python libraries.

!pip install requests_html

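💡 The exclamation mark before pip is only needed when you run the command in a notebook cell (for example Jupyter or Google Colab). In a regular terminal, run pip install requests_html without it.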

Next, import the Python packages the script relies on.

import requests
import numpy as np
import urllib
import json
import operator
import pandas as pd
from requests_html import HTML
from requests_html import HTMLSession
from urllib.parse import (parse_qsl, urlsplit)

Of the packages listed above, two do most of the heavy lifting in the framework: requests_html, which sends the requests, and urllib, which encodes the search query.

Connecting with Google Autosuggest

You can now lay the technical foundations for scraping Google Autosuggest. To do so, wrap the request in a small helper function built on requests_html.

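def get_source(url):
    """Fetch a URL and return the response, or None if the request fails."""
    try:
        session = HTMLSession()
        response = session.get(url)
        return response
    except requests.exceptions.RequestException as e:
        # Report the error and signal the failure to the caller
        print(e)
        return None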

Next, Urllib encodes the search query so it can be passed safely inside a URL. With that in place, you can call the Google Autosuggest endpoint and submit a search query.

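def get_results(query):
    query = urllib.parse.quote_plus(query)
    response = get_source("https://suggestqueries.google.com/complete/search?output=chrome&hl=en&q=" + query)
    if response is None:
        # The request failed: return an empty payload with the same shape
        return [query, [], [], [], {'google:suggestrelevance': []}]
    results = json.loads(response.text)
    return results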
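To make the encoding step easier to see, here is a minimal sketch that builds the same request URL with urlencode instead of string concatenation. The helper name build_suggest_url and the language parameter are assumptions made for illustration; only the endpoint and its output, hl and q parameters come from the code above.

```python
from urllib.parse import urlencode

# Endpoint taken from get_results() above
SUGGEST_ENDPOINT = "https://suggestqueries.google.com/complete/search"


def build_suggest_url(query, language="en"):
    """Hypothetical helper: build the Autosuggest URL for a query."""
    params = {"output": "chrome", "hl": language, "q": query}
    # urlencode() escapes each value with quote_plus() by default,
    # so spaces become '+' and '&' becomes '%26'
    return SUGGEST_ENDPOINT + "?" + urlencode(params)


url = build_suggest_url("coffee & tea")
print(url)
```

Letting urlencode handle every parameter avoids a malformed URL when a query contains characters such as & or #.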

Define your Search Query

Next, you can finally populate your hand-made search bar with a query of your choice.

search_term = "Type your query"
results = get_results(search_term)
results
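If the raw output looks cryptic, the sketch below unpacks a payload with the same layout the script relies on: the query at index 0, the suggestions at index 1, and a metadata dictionary at index 4. The sample text is made up for illustration and is not real Google data.

```python
import json

# Made-up payload mimicking the layout used by the script;
# the terms and scores are illustrative, not real Google data
sample_text = (
    '["coffee", ["coffee shop", "coffee table"], ["", ""], [], '
    '{"google:suggestrelevance": [601, 600]}]'
)

payload = json.loads(sample_text)

query = payload[0]                                  # the query that was sent
suggestions = payload[1]                            # the suggested terms
relevance = payload[4]["google:suggestrelevance"]   # one score per suggestion

# Pair each suggestion with the score at the same position
pairs = list(zip(suggestions, relevance))
print(pairs)
```

The two lists are aligned by position, which is why the formatting step later looks scores up by index.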

Formatting the Results

Once the script has fetched and parsed the Autosuggest response, we are going to format it so the output reads as clearly as possible.

To spice up the framework, we add a variable named “Relevance”. As the code shows, the script does not compute this value itself (so it is not a TF-IDF score): it reads it from the google:suggestrelevance field that the Autosuggest endpoint returns alongside each suggestion. Treat it as Google's own relative ranking of the suggestions for a given query, not as a measure of search volume.

def format_results(results):
    suggestions = []
    for index, value in enumerate(results[1]):
        suggestion = {'term': value, 'relevance': results[4]['google:suggestrelevance'][index]}
        suggestions.append(suggestion)
    return suggestions
formatted_results = format_results(results)
formatted_results

Adding Suffixes and Keyword Modifiers

Let’s fill up the model with a few toppings to cook up the final output.

We are going to add a pack of suffixes that covers every letter of the alphabet, plus a list of prefix modifiers such as question words and commercial terms. This helps surface Autosuggest combinations you would otherwise miss for a given query.

def get_expanded_term_suffixes():
    expanded_term_suffixes = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm','n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z']
    return expanded_term_suffixes
def get_expanded_term_prefixes():
    expanded_term_prefixes = ['what *', 'where *', 'how to *', 'why *', 'buy*', 'how much*', 'best *', 'worse *', 'rent*', 'sale*', 'offer*', 'vs*', 'or*']
    return expanded_term_prefixes

You can toy around with the keyword modifiers at your convenience. This depends on the funnel stage where your target content sits.
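One way to organise the modifiers by funnel stage is a simple lookup table, sketched below. The stage names and the way the modifiers are assigned to them are assumptions made for illustration; adapt both to your own content strategy.

```python
# Hypothetical grouping of the modifiers by funnel stage
FUNNEL_MODIFIERS = {
    "awareness": ["what *", "why *", "how to *"],
    "consideration": ["best *", "vs*", "or*"],
    "conversion": ["buy*", "how much*", "rent*", "sale*", "offer*"],
}


def modifiers_for(stages):
    """Return the modifiers for the requested stages, in order."""
    selected = []
    for stage in stages:
        selected.extend(FUNNEL_MODIFIERS[stage])
    return selected


# Example: only research bottom-of-funnel queries
print(modifiers_for(["conversion"]))
```

The returned list could then stand in for the hard-coded list inside get_expanded_term_prefixes().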

Expand the Search

At this point, you need to simmer the previous ingredients so they merge together.

To do so, we define a function that expands the seed query with every prefix and suffix.

def get_expanded_terms(query):

    expanded_term_prefixes = get_expanded_term_prefixes()
    expanded_term_suffixes = get_expanded_term_suffixes()   

    terms = []
    terms.append(query)

    for term in expanded_term_prefixes:
        terms.append(term + ' ' + query)

    for term in expanded_term_suffixes:
        terms.append(query + ' ' + term)

    return terms
 
get_expanded_terms(search_term)
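As a quick sanity check on how many queries the expansion produces, the sketch below rebuilds the same logic in compact form. The function name expand_terms and the two sample prefixes are illustrative; string.ascii_lowercase stands in for the hand-typed alphabet list.

```python
import string


def expand_terms(query, prefixes):
    """Illustrative variant: seed query, prefixed terms, then a-z suffixes."""
    terms = [query]
    terms += [f"{prefix} {query}" for prefix in prefixes]
    terms += [f"{query} {letter}" for letter in string.ascii_lowercase]
    return terms


terms = expand_terms("coffee", ["best", "how to"])
print(len(terms))   # 1 seed + 2 prefixes + 26 letters = 29
print(terms[:3])
```

Each expanded term later triggers one request, so the length of this list is also the number of calls made to the Autosuggest endpoint.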

And another one to expand the search suggestions: it runs every expanded term through Autosuggest and collects the results.

def get_expanded_suggestions(query):

    all_results = []

    expanded_terms = get_expanded_terms(query)
    for term in expanded_terms:
        results = get_results(term)
        results = format_results(results)
        all_results = all_results + results

    # Sort once, after all suggestions are collected, by descending relevance
    all_results = sorted(all_results, key=lambda k: k['relevance'], reverse=True)
    return all_results
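Because the same suggestion can come back for several expanded terms, the combined list may contain duplicates. The sketch below is one possible way to keep a single entry per term, retaining the highest relevance seen; dedupe_suggestions is a hypothetical helper and the sample scores are made up.

```python
def dedupe_suggestions(suggestions):
    """Keep one entry per term, preferring the highest relevance."""
    best = {}
    for item in suggestions:
        term = item["term"]
        if term not in best or item["relevance"] > best[term]["relevance"]:
            best[term] = item
    # Same ordering as get_expanded_suggestions(): highest relevance first
    return sorted(best.values(), key=lambda item: item["relevance"], reverse=True)


# Made-up sample in the shape produced by format_results()
sample = [
    {"term": "coffee shop", "relevance": 600},
    {"term": "coffee table", "relevance": 850},
    {"term": "coffee shop", "relevance": 1250},
]
unique = dedupe_suggestions(sample)
print(unique)
```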

Final Output

All you have to do now is to execute an extra chunk of code to print the output.

But first, let’s load the results into a Pandas data frame and rename its columns. Saving the data frame with to_csv then lets you download the full keyword report as a CSV file.

expanded_results = get_expanded_suggestions(search_term)
expanded_results_df = pd.DataFrame(expanded_results)
expanded_results_df.columns = ['Keywords', 'Relevance']
expanded_results_df.to_csv('keywords.csv')
expanded_results_df

This is roughly what you might get:

Excerpt of a raw keyword research output from a Python script

Adjust the Layout Style of the Dataframe

This is entirely optional but worth a try, given the mess coming from the above raw output.

First, load the saved CSV back into a Pandas data frame using its file path, then style it with a few CSS rules. The /content/ path below is Google Colab's default working directory, so adjust it to your own environment.

expanded_results_df = pd.read_csv('/content/keywords.csv') 
selection = ['Keywords','Relevance']
df = expanded_results_df[selection]
df.head(20).style.set_table_styles(
[{'selector': 'th',
  'props': [('background', '#7CAE00'), 
            ('color', 'white'),
            ('font-family', 'verdana')]},
 
 {'selector': 'td',
  'props': [('font-family', 'verdana')]},

 {'selector': 'tr:nth-of-type(odd)',
  'props': [('background', '#DCDCDC')]}, 
 
 {'selector': 'tr:nth-of-type(even)',
  'props': [('background', 'white')]},
 
]
).hide(axis="index")  # replaces hide_index(), removed in pandas 2.0; needs pandas 1.4+

You should now get something like this:

Excerpt from the final keyword research output processed in Python

Unlock Unlimited Keyword Ideas with Python

As with everything, the devil is in the details, and in SEO the challenge is to spot the low-hanging fruit that automated scripts like this one bring to the surface.

However, do take these automated outputs with a grain of salt. They can be skewed by outliers, which could ultimately harm your SEO decision-making.
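As one possible sanity check, the sketch below flags suggestions whose relevance sits far from the rest, using the common interquartile range rule. The helper name, the 1.5 multiplier and the sample scores are assumptions made for illustration, not part of the original framework.

```python
import statistics


def flag_relevance_outliers(suggestions, k=1.5):
    """Return suggestions whose relevance falls outside the IQR fences."""
    scores = [item["relevance"] for item in suggestions]
    if len(scores) < 4:
        return []  # too few points for quartiles to be meaningful
    q1, _, q3 = statistics.quantiles(scores, n=4)
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [item for item in suggestions if not low <= item["relevance"] <= high]


# Made-up scores: eight similar values and one that stands apart
scores = [600, 601, 602, 603, 604, 605, 606, 607, 1300]
sample = [{"term": f"keyword {i}", "relevance": s} for i, s in enumerate(scores)]

flagged = flag_relevance_outliers(sample)
print(flagged)
```

A flagged term is not necessarily wrong; it is simply worth a manual look before it drives a decision.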

