Artificial intelligence — the new frontier.
It seems like only yesterday that A.I. surfaced as the new promise for technological advancement. Google itself reported cutting the energy used to cool its data centers by up to 40% simply by letting DeepMind’s machine learning do its thing. Everyone has heard about AlphaGo, AlphaZero, and even AlphaStar taking the mental-sporting field by storm, with ChatGPT causing its own quake in the AI scene. But few truly realize what goes into these deep learning achievements, and many feel it’s a big mystery that only Big Tech can solve. So why does it feel like AI is so out of reach for the average person or business? Well, I think it’s because there just isn’t enough accessible information out there. Just like everything else worthwhile, it takes a bit of time and effort to learn a new skill. I’m here to help you understand the basics, and to show you how to put this technology to work today to improve your Google rankings with Dataiku.
The A.I. Basics
Although they’re spoken of interchangeably, machine learning, AI, and deep learning each have their own meaning. Let me explain. Artificial Intelligence, otherwise known as A.I., refers to the broad idea that machines can be trained to perform tasks that normally require human judgment or creativity. In other words, it’s the science of creating a machine capable of mimicking human tasks or abilities. Machine learning is a subset of A.I., and deep learning is in turn a subset of machine learning. Both help recognize patterns, build correlations, and typically attempt to mirror basic ways humans learn. One great example of this is a neural network. Neural networks are a kind of machine learning model made up of units connected similarly to the neurons of a human brain. In this way, the computer can learn more abstract tasks, such as playing video games with incomplete information like StarCraft or Dota. Natural Language Processing, or NLP, is what we use to analyze and understand the complexity of languages. If you’ve ever spoken with Siri or Alexa, you have spoken with one of these NLP AIs. Understanding these terms is important in taking your first steps in using artificial intelligence.
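To make the idea of connected units a little more concrete, here is a minimal sketch of a tiny neural network in plain Python. The weights and the network shape are arbitrary values chosen purely for illustration (a real network learns its weights from data), and the function names are my own.

```python
import math

def neuron(inputs, weights, bias):
    """One unit: a weighted sum of its inputs squashed into the range 0..1."""
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1 / (1 + math.exp(-total))  # sigmoid activation

def tiny_network(inputs):
    """Two hidden units feed one output unit. Weights are arbitrary, not trained."""
    h1 = neuron(inputs, [0.5, -0.2], 0.1)
    h2 = neuron(inputs, [-0.3, 0.8], 0.0)
    return neuron([h1, h2], [1.0, -1.0], 0.0)

print(tiny_network([1.0, 2.0]))
```

Training a network means nudging those weights until the output matches known examples, which is the part tools like Dataiku handle for you.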
Using AI To Rank On Google
As you may know, Google’s search engine results pages (SERPs) are shaped in part by RankBrain, a machine learning system that is one component of Google’s ranking algorithm. By using machine learning ourselves, we can reverse-engineer parts of Google’s ranking factors by collecting key SERP data and running it through predictive modelling.
STEP 1 | Collect The Data
There are hundreds of ranking factors, and there’s no way we can collect every single one, even if we tried. However, there is a lot of on-page and off-page data we can collect in bulk using a few helpful tools. In this guide, we’ll be using Ahrefs, Screaming Frog, and BatchSpeed.
Scrape SERP URLs Using the Ahrefs Keyword Research Tool
The first thing you’ll want to do is to gather a list of URLs that are ranking for your targeted keyword. In this case, I simply used the Ahrefs keyword research tool to return the top 100 results for a given keyword. You can export just one keyword’s SERP, but the more data you have, the more reliable the predictive modelling is likely to be. So I suggest you export the SERPs for at least 5 related keywords (~500 results) to work with.
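If you end up with several exported SERP files, a short script can stack them into one list and pull out the unique URLs for crawling. This is only a sketch: the column names `Position` and `URL` are assumptions about the export layout (check the headers in your own files), and the inline strings stand in for real exported CSVs.

```python
import csv
import io

def load_serp_rows(csv_text, keyword):
    """Parse one exported SERP CSV and tag every row with its keyword."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        # "Position" and "URL" are assumed header names, not a documented format.
        {"keyword": keyword, "position": int(row["Position"]), "url": row["URL"].strip()}
        for row in reader
    ]

# Tiny stand-ins for two exported files (real exports would hold ~100 rows each).
export_a = "Position,URL\n1,https://example.com/a\n2,https://example.org/b\n"
export_b = "Position,URL\n1,https://example.org/b\n2,https://example.net/c\n"

rows = load_serp_rows(export_a, "keyword one") + load_serp_rows(export_b, "keyword two")

# The crawler only needs each URL once, so de-duplicate while keeping order.
unique_urls = list(dict.fromkeys(row["url"] for row in rows))
print(len(rows), "rows,", len(unique_urls), "unique URLs")
```

Keeping the keyword and position on every row matters later, because position is the value the model will try to predict.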
Run List of URLs Through Screaming Frog
Screaming Frog is an SEO favourite when it comes to scraping tools, as it gives you loads of useful data about any list of URLs or domains. First, you’ll need to use the “list” function on Screaming Frog to manually upload the list of URLs. Once uploaded, press start, and allow the crawl to run its course. When the crawl is complete, click on export, and save the CSV to your computer.
Run the URL List Through BatchSpeed
With Google’s recent announcement on Core Web Vitals, it is a good idea to also test our dataset URLs for those same vitals. Batchspeed.com will allow you to crawl your URL list in bulk, and receive critical webpage performance metrics. The “Core Web Vitals”, as Google calls them, are:
- Largest Contentful Paint, or LCP, is the time it takes for the largest content element on a webpage to become visible.
- First Input Delay, or FID, is the time between the user’s first interaction with the webpage and the moment the browser begins responding to that interaction.
- Cumulative Layout Shift, or CLS, measures how much the visible content shifts unexpectedly on a webpage.
Using BatchSpeed, you’ll be able to export mobile and desktop performance data that can help build correlations between Core Web Vitals and a URL’s position on Google.
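Raw metric values can be easier to reason about once they are bucketed the way Google describes them. The sketch below uses Google’s published boundaries for each metric (LCP 2.5 s and 4.0 s, FID 100 ms and 300 ms, CLS 0.1 and 0.25). The metric keys and the sample page values are made up for illustration and are not BatchSpeed’s actual export format.

```python
# (good, poor) boundaries per metric: at or below the first value is "good",
# above the second is "poor". LCP in seconds, FID in milliseconds, CLS unitless.
THRESHOLDS = {
    "lcp": (2.5, 4.0),
    "fid": (100, 300),
    "cls": (0.1, 0.25),
}

def rate(metric, value):
    """Return Google's rating bucket for one Core Web Vitals measurement."""
    good, poor = THRESHOLDS[metric]
    if value <= good:
        return "good"
    if value <= poor:
        return "needs improvement"
    return "poor"

# Hypothetical measurements for a single URL.
page = {"lcp": 3.1, "fid": 80, "cls": 0.31}
ratings = {metric: rate(metric, value) for metric, value in page.items()}
print(ratings)
```

You could add these ratings as extra columns next to the raw numbers, so the model can pick up on either form.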
Combine the Data
Once all of the data is collected, it’s time to combine the data into a single sheet, so our machine learning platform can access all of the pertinent information from one place. An easy way to port data across spreadsheet tabs is the VLOOKUP formula, which can help you match the data with the proper URL.
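If you would rather script the join than build it in a spreadsheet, the same lookup-by-URL idea can be sketched in a few lines of Python. The field names (`position`, `word_count`, `lcp`) and the sample rows are hypothetical stand-ins for whatever columns your exports actually contain.

```python
# Hypothetical sample rows standing in for the three exports.
serp_rows = [
    {"url": "https://example.com/a", "position": 1},
    {"url": "https://example.org/b", "position": 2},
    {"url": "https://example.net/c", "position": 3},
]
crawl_rows = [
    {"url": "https://example.com/a", "word_count": 1800},
    {"url": "https://example.org/b", "word_count": 950},
]
speed_rows = [
    {"url": "https://example.com/a", "lcp": 2.1},
    {"url": "https://example.net/c", "lcp": 4.4},
]

def index_by_url(rows):
    """Build a URL -> row lookup table, much like the range VLOOKUP searches."""
    return {row["url"]: row for row in rows}

def combine(base_rows, *lookups):
    """Copy every base row and attach matching columns from each lookup table."""
    indexes = [index_by_url(rows) for rows in lookups]
    combined = []
    for base in base_rows:
        merged = dict(base)
        for index in indexes:
            # A URL missing from a lookup simply contributes no extra columns.
            merged.update(index.get(base["url"], {}))
        combined.append(merged)
    return combined

dataset = combine(serp_rows, crawl_rows, speed_rows)
for row in dataset:
    print(row)
```

Every SERP row is kept even when a crawl or speed result is missing, which mirrors a VLOOKUP that returns no match. Those gaps are worth checking before you import the sheet.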
STEP 2 | Download Dataiku



STEP 3 | Create a New Project
To create a new project, simply click the “New Project” button on the top right of the dashboard. 

STEP 4 | Import Your Dataset
Once you reach your new project dashboard, click on the “+ IMPORT YOUR FIRST DATASET” button. 


STEP 5 | Create a Visual Analysis
Continue by clicking on the circular icon on the top left of the toolbar for visual analyses. 


STEP 6 | Create a Machine Learning Model
This is where things get fun. Head over to the “Models” tab on the top right of your page, and then click “create first model”. 





STEP 7 | Analyze Your Results
When the algorithms have finished processing, they’ll return a ranked list of the variables that matter most according to their calculations.
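As a quick sanity check on those results, you can compare them with a much simpler measure. The sketch below computes the Spearman rank correlation between each factor and ranking position. To be clear, this is a swapped-in, simpler technique and not how Dataiku calculates variable importance, and the feature names and numbers are made up for illustration.

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks

def pearson(xs, ys):
    """Pearson correlation. Assumes neither list is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs) ** 0.5
    spread_y = sum((y - mean_y) ** 2 for y in ys) ** 0.5
    return cov / (spread_x * spread_y)

def spearman(xs, ys):
    """Spearman correlation is the Pearson correlation of the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))

# Made-up data: six results for one keyword, with position 1 being the top spot.
positions = [1, 2, 3, 4, 5, 6]
features = {
    "referring_domains": [120, 95, 80, 40, 35, 10],
    "word_count": [900, 2100, 1500, 800, 2300, 1200],
}

scores = {name: spearman(values, positions) for name, values in features.items()}
for name, rho in sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True):
    print(f"{name}: {rho:+.2f}")
```

A strongly negative score means higher values of that factor go with better (lower-numbered) positions. Scores near zero suggest little relationship in this sample, and none of this proves cause and effect.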
STEP 8 | Put Your Data Into Action
Now that you have a clear model of which ranking factors have the most impact on your given keywords, you can put it into action for your SEO campaigns. By prioritizing your budget, time, and resources toward the metrics that matter most, you can drive organic growth at a faster pace, and at a lower cost. If your results show a high “variable importance” for do-follow backlinks and a low variable importance for on-page factors, consider allocating more of your campaign budget towards link-building, and vice versa.
Conclusion
I hope you found this guide helpful in getting started with machine learning, and in seeing how it can benefit your business or agency. Again, to make the most out of A.I., it’s important to continue learning and experimenting with more models, algorithms, and datasets to find new ways it can bring value to you or your clients. If you think of some ideas on what you can do with this technology, give them a try. You never know, you might just discover something extraordinary.



