How To Use Artificial Intelligence To Rank On Google With Dataiku

by Linden Schwark

Artificial intelligence — the new frontier.

It seems like only yesterday that A.I. surfaced as the new promise of technological advancement. Google itself has reported cutting the energy used to cool its data centers by up to 40% by letting machine learning manage the cooling systems. Everyone has heard about AlphaGo, AlphaZero, and even AlphaStar taking the mental-sporting field by storm, with ChatGPT causing its own quake in the AI scene. But few truly realize what goes into these deep learning achievements, and many feel it’s a big mystery that only Big Tech can solve. So why does AI feel so out of reach for the average person or business? I think it’s because there just isn’t enough accessible information out there. Like everything else worthwhile, learning a new skill takes a bit of time and effort. I’m here to help you understand the basics and show you how to use this technology today to improve your Google rankings with Dataiku.

The A.I. Basics

Although they’re often used interchangeably, A.I., machine learning, and deep learning each mean something different. Let me explain. Artificial intelligence, otherwise known as A.I., refers to the broad idea that machines can perform tasks that normally require human intelligence. In other words, it’s the science of creating a machine capable of mimicking human tasks or abilities. Machine learning is a subset of A.I. in which a system learns patterns from data instead of following hand-written rules, and deep learning is a subset of machine learning built on neural networks with many layers. A neural network is a model made up of units connected loosely like the neurons of a human brain. This lets the computer learn more abstract tasks, such as playing video games with incomplete information like StarCraft or Dota. Natural Language Processing (NLP) is the branch of A.I. we use to analyze and understand the complexity of human language. If you’ve ever spoken with Siri or Alexa, you have used NLP. Understanding these terms is an important first step in using artificial intelligence.

Using AI To Rank On Google

Google’s SERPs are shaped in part by machine learning. RankBrain, for example, is an A.I. system Google uses to help interpret queries and rank results. By using machine learning ourselves, we can approximate parts of Google’s ranking factors by collecting key SERP data and running it through predictive modelling.

STEP 1 | Collect The Data

There are hundreds of ranking factors, and there’s no way we can collect every single one, even if we tried. However, there is a lot of on-page and off-page data we can collect in bulk using a few helpful tools. In this guide, we’ll be using the Ahrefs keyword research tool, Screaming Frog, and BatchSpeed.

Scrape SERP URLs Using the Ahrefs Keyword Research Tool

The first thing you’ll want to do is gather a list of URLs that are ranking for your targeted keyword. In this case, I simply used the Ahrefs keyword research tool to return the top 100 results for a given keyword. You can export just one keyword SERP, but the more data you have, the more reliable the predictive modelling tends to be. So I suggest you export at least 5 related keyword SERPs (~500 results) to work with.
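
If you export several keyword SERPs, it helps to stack them into one table and pull out a deduplicated URL list for the crawling tools in the next steps. Here is a minimal sketch using only Python’s standard library. The column names `Keyword`, `Position`, and `URL` are assumptions, so check them (and the file’s delimiter and encoding) against your actual export.

```python
import csv
import io


def stack_serp_exports(files):
    """Combine several SERP exports into one list of rows.

    `files` is an iterable of open text files (or file-like objects),
    each a CSV with the assumed columns Keyword, Position, URL.
    """
    rows = []
    for f in files:
        for row in csv.DictReader(f):
            rows.append({
                "keyword": row["Keyword"].strip(),
                "rank": int(row["Position"]),
                "url": row["URL"].strip(),
            })
    return rows


def unique_urls(rows):
    """Return each URL once, in first-seen order, for a list-mode crawl."""
    return list(dict.fromkeys(row["url"] for row in rows))


# Tiny in-memory stand-ins for two exported files (made-up data).
export_a = io.StringIO(
    "Keyword,Position,URL\n"
    "seo agency,1,https://example.com/a\n"
    "seo agency,2,https://example.com/b\n"
)
export_b = io.StringIO(
    "Keyword,Position,URL\n"
    "seo company,1,https://example.com/b\n"
    "seo company,2,https://example.com/c\n"
)

serp_rows = stack_serp_exports([export_a, export_b])
crawl_list = unique_urls(serp_rows)
print(len(serp_rows), "rows,", len(crawl_list), "unique URLs")
```

With real files, you would pass objects returned by `open(path, newline="")` instead of the in-memory strings. Keeping every keyword row (rather than deduplicating the rows themselves) preserves the fact that one URL can hold different positions for different keywords.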

Run List of URLs Through Screaming Frog

Screaming Frog is an SEO favourite when it comes to scraping tools, as it gives you loads of useful data about any list of URLs or domains. First, switch Screaming Frog to “list” mode and upload your list of URLs. Once uploaded, press start, and allow the crawl to run its course. When the crawl is complete, click on export and save the CSV to your computer.

Run the URL List Through BatchSpeed

With Google’s recent announcement on Core Web Vitals, it is a good idea to also test our dataset URLs for those same vitals. BatchSpeed will allow you to crawl your URL list in bulk and receive critical webpage performance metrics. The “Core Web Vitals”, as Google calls them, are:

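  • Largest Contentful Paint, or LCP, is the time it takes for the largest content element on a webpage to become visible.
  • First Input Delay, or FID, is the time between the user’s first interaction with the website and the moment the browser can begin responding to that interaction.
  • Cumulative Layout Shift, or CLS, is a score for how much visible content shifts unexpectedly while the page is in use.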

Using BatchSpeed, you’ll be able to export mobile and desktop performance data that can help build correlations between Core Web Vitals and a URL’s position on Google.
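
Raw measurements can be hard to compare at a glance, so one option is to add a rating per metric before modelling. The sketch below assumes the thresholds Google has published for LCP and FID, plus Cumulative Layout Shift (CLS), which Google also counts among the Core Web Vitals. The dictionary and function names are illustrative, and the sample values are made up. Whether your export contains every metric, and in which units, is something to check first.

```python
# Published thresholds per metric: (upper bound for "good", upper bound
# for "needs improvement"). LCP is in seconds, FID in milliseconds,
# and CLS is a unitless score.
THRESHOLDS = {
    "lcp": (2.5, 4.0),
    "fid": (100, 300),
    "cls": (0.1, 0.25),
}


def rate_metric(metric, value):
    """Label a measurement 'good', 'needs improvement', or 'poor'."""
    good_max, needs_improvement_max = THRESHOLDS[metric]
    if value <= good_max:
        return "good"
    if value <= needs_improvement_max:
        return "needs improvement"
    return "poor"


# Made-up measurements for a single URL.
page = {"lcp": 3.1, "fid": 80, "cls": 0.3}
ratings = {metric: rate_metric(metric, value) for metric, value in page.items()}
print(ratings)
```

The ratings can sit alongside the raw numbers as extra columns, so the model can use whichever form turns out to be more informative.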

Combine the Data

Once all of the data is collected, it’s time to combine the data into a single sheet, so our machine learning platform can access all of the pertinent information from one place. An easy way to port data across spreadsheet tabs is the VLOOKUP formula, which can help you match the data with the proper URL.
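
If you would rather script this step than use VLOOKUP, the same lookup-by-URL idea can be sketched with Python’s standard library. The column names here (`URL`, `Rank`, `Address`, `Word Count`, `LCP`) are assumptions for illustration, and the data is made up. Replace them with the headers from your own exports.

```python
import csv
import io


def index_by_url(f, url_column):
    """Read a CSV and return {url: row} so rows can be looked up by URL."""
    return {row[url_column].strip(): row for row in csv.DictReader(f)}


def join_on_url(serp_rows, lookups, url_columns=("URL", "Address")):
    """Left-join each lookup table onto the SERP rows, like VLOOKUP.

    A URL missing from a lookup simply contributes no extra columns.
    """
    combined = []
    for row in serp_rows:
        merged = dict(row)
        for lookup in lookups:
            extra = lookup.get(row["URL"].strip(), {})
            # Skip the lookup's own URL column to avoid duplicating it.
            merged.update(
                {k: v for k, v in extra.items() if k not in url_columns}
            )
        combined.append(merged)
    return combined


# In-memory stand-ins for the three exports (made-up data).
serp = io.StringIO("URL,Rank\nhttps://example.com/a,1\nhttps://example.com/b,2\n")
crawl = io.StringIO("Address,Word Count\nhttps://example.com/a,1200\n")
speed = io.StringIO("URL,LCP\nhttps://example.com/a,2.1\nhttps://example.com/b,3.4\n")

combined = join_on_url(
    list(csv.DictReader(serp)),
    [index_by_url(crawl, "Address"), index_by_url(speed, "URL")],
)

# Write one sheet; cells with no match are left empty.
fieldnames = list(dict.fromkeys(key for row in combined for key in row))
out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=fieldnames, restval="")
writer.writeheader()
writer.writerows(combined)
print(out.getvalue())
```

As with VLOOKUP, the match is on the exact URL string, so differences such as a trailing slash or `http` versus `https` will cause a row to go unmatched.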

STEP 2 | Download Dataiku

Dataiku is a machine learning platform with a free edition that is beginner-friendly and offers done-for-you algorithms. There is no better platform to start with if you are looking to learn the ropes of artificial intelligence and its applications. First, visit the Dataiku website and click on the “get started” button. When you land on the pricing page, look for the “free” pricing package and click “install now”. Then follow the instructions for your operating system. It’s important to note that if you are using Windows, the installation process will be slightly more complicated. Once you have Dataiku running, you’re ready to start your machine-learning project!

STEP 3 | Create a New Project

To create a new project, simply click the “New Project” button on the top right of the dashboard. When prompted, give your project a name. I chose “Top Ranking Factors”.

STEP 4 | Import Your Dataset

Once you reach your new project dashboard, click on the “+ IMPORT YOUR FIRST DATASET” button. The button will take you to an upload page, where you’ll want to click “upload your files” under the “Files” category. You’ll be taken to an upload terminal where you can now import your dataset of ranking factors. Now that your file is uploaded, you can click “create” on the top right of the screen.

STEP 5 | Create a Visual Analysis

Continue by clicking on the circular icon on the top left of the toolbar for visual analyses. Now click on “+ New Analysis”, select your dataset with the dropdown, and click “Create Analysis”. You will now be presented with the data you’ve imported. If needed, you can make changes such as deleting irrelevant data columns.

STEP 6 | Create a Machine Learning Model

This is where things get fun. Head over to the “Models” tab on the top right of your page, and then click “create first model”. Choose a “Prediction” task and select your target variable, which in this case is “Rank”. Next, click “Automated Machine Learning”, then select “High-Performance Models” with the “In-memory” engine.

We could run this model as-is, but we still want to make a couple of changes before we pull the trigger. Head to the “Design” tab and, under “Modeling”, click “Algorithms”. Make sure only the “Random Forest” and “XGBoost” algorithms are active, and change the “Number of trees” to 1000.

That’s it, we’re ready to go. Click “Train” at the top right, name your training session, and confirm. Allow the magic of machine learning to take its course. This may take some time.

STEP 7 | Analyze Your Results

When the algorithms have finished processing, each will return a list of the most important variables according to its calculations. You can click on the title of an algorithm’s report to view more detailed information. In this case, the Random Forest model ranked “Dofollow Domains” as the number one variable related to top-ten rankings, and XGBoost agrees with that assessment. Feel free to explore the additional charts and information within each model once its report is ready.
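
A quick way to sanity-check a top variable outside Dataiku is to see whether it moves with rank at all. The sketch below computes a Spearman rank correlation with the standard library. Note that this is a different and much simpler technique than the variable importance a Random Forest or XGBoost model reports, so treat it as a cross-check rather than a reproduction of Dataiku’s numbers. The function names and sample values are made up for illustration.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 if either input is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / (var_x * var_y) ** 0.5


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(xs), average_ranks(ys))


# Made-up sample: Google position and two candidate factors per URL.
rank = [1, 2, 3, 4, 5]
dofollow_domains = [900, 400, 350, 120, 60]
word_count = [1500, 800, 2100, 900, 1700]

rho_links = spearman(dofollow_domains, rank)
rho_words = spearman(word_count, rank)
print(f"dofollow domains vs rank: {rho_links:+.2f}")
print(f"word count vs rank:       {rho_words:+.2f}")
```

Because position 1 is the best rank, a strongly negative value means that more of the factor goes with better positions. A value near zero suggests little monotonic relationship in your sample.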

STEP 8 | Put Your Data Into Action

Now that you have a clear model of which ranking factors have the most impact on your given keywords, you can put them into action in your SEO campaigns. By prioritizing your budget, time, and resources toward the metrics that matter most, you can drive organic growth at a faster pace and at a lower cost. If your results show a high “variable importance” for do-follow backlinks and a low variable importance for on-page factors, consider allocating more of your campaign budget towards link-building, and vice versa.
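
One rough way to turn variable importances into a starting point for budgeting is to group the factors into buckets and compare each bucket’s share of the total. The importances and the grouping below are hypothetical values typed in by hand, not output from Dataiku, and the function name is illustrative.

```python
# Hypothetical variable importances copied by hand from a model report.
importances = {
    "Dofollow Domains": 0.42,
    "Referring Pages": 0.18,
    "Word Count": 0.15,
    "LCP": 0.15,
    "Title Length": 0.10,
}

# Which bucket each factor belongs to is a judgment call, assumed here.
groups = {
    "Dofollow Domains": "link building",
    "Referring Pages": "link building",
    "Word Count": "on-page",
    "Title Length": "on-page",
    "LCP": "performance",
}


def importance_share(importances, groups):
    """Sum importance per bucket and normalize so the shares add up to 1."""
    totals = {}
    for factor, weight in importances.items():
        bucket = groups[factor]
        totals[bucket] = totals.get(bucket, 0.0) + weight
    grand_total = sum(totals.values())
    return {bucket: weight / grand_total for bucket, weight in totals.items()}


shares = importance_share(importances, groups)
for bucket, share in sorted(shares.items(), key=lambda item: -item[1]):
    print(f"{bucket}: {share:.0%}")
```

Variable importance describes what the model found useful for predicting rank in your sample, so shares like these are better read as a prompt for where to look first than as a fixed budget formula.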


I hope you found this guide helpful for getting started with machine learning and seeing how it can benefit your business or agency. Again, to make the most of A.I., it’s important to keep learning and experimenting with more models, algorithms, and datasets to find new ways it can bring value to you or your clients. If you think of some ideas for what you can do with this technology, give them a try. You never know, you might just discover something extraordinary.

About the author

Linden Schwark
Linden Schwark is the CEO and Lead SEO Strategist of VSA. For the last 7 years, he has helped business owners reach their goals by dramatically increasing their online presence. In that time, Linden has ranked hundreds of web pages on Google for high-competition searches, including his own website
