Featured Image

Posted by Jeff_Baker

Grab yourself a cup of coffee (or two) and buckle up, because we’re doing maths today.

Again.

Back it on up…

A quick refresher from last time: I pulled data from 50 keyword-targeted articles written on Brafton’s blog between January and June of 2018.

We used a technique of writing these articles published earlier on Moz that generates some seriously awesome results (we’re talking more than doubling our organic traffic in the last six months, but we will get to that in another publication).

We pulled this data again… Only I updated and reran all the data manually, doubling the dataset. No APIs. My brain is Swiss cheese.

We wanted to see how newly written, original content performs over time, and which factors may have impacted that performance.

Why do this the hard way, dude?

“Why not just pull hundreds (or thousands!) of data points from search results to broaden your dataset?”, you might be thinking. It’s been done successfully quite a few times!

Trust me, I was thinking the same thing while weeping tears into my keyboard.

The answer was simple: I wanted to do something different from the massive aggregate studies. I wanted a level of control over as many potentially influential variables as possible.

By using our own data, the study benefited from:

  • The same root Domain Authority across all content.
  • Similar individual URL link profiles (some laughs on that later).
  • Known original publish dates and without reoptimization efforts or tinkering.
  • Known original keyword targets for each blog (rather than guessing).
  • Known and consistent content depth/quality scores (MarketMuse).
  • Similar content writing techniques for targeting specific keywords for each blog.

You will never eliminate the possibility of misinterpreting correlation as causation. But controlling some of the variables can help.

As Rand once said in a Whiteboard Friday, “Correlation does not imply causation (but it sure is a hint).

Caveat:

What we gained in control, we lost in sample size. A sample size of 96 is much less useful than ten thousand, or a hundred thousand. So look at the data carefully and use discretion when considering the ranking factors you find most likely to be true.

This resource can help gauge the confidence you should put into each Pearson Correlation value. Generally, the stronger the relationship, the smaller sample size needed to be be confident in the results.

So what exactly have you done here?

We have generated hints at what may influence the organic performance of newly created content. No more, and no less. But they are indeed interesting hints and maybe worth further discussion or research.

What have you not done?

We have not published sweeping generalizations about Google’s algorithm. This post should not be read as a definitive guide to Google’s algorithm, nor should you assume that your site will demonstrate the same correlations.

So what should I do with this data?

The best way to read this article, is to observe the potential correlations we observed with our data and consider the possibility of how those correlations may or may not apply to your content and strategy.

I’m hoping that this study takes a new approach to studying individual URLs and stimulates constructive debate and conversation.

Your constructive criticism is welcome, and hopefully pushes these conversations forward!

The stat sheet

So quit jabbering and show me the goods, you say? Alright, let’s start with our stats sheet, formatted like a baseball card, because why not?:

*Note: Only blogs with complete ranking data were used in the study. We threw out blogs with missing data rather than adding arbitrary numbers.

And as always, here is the original data set if you care to reproduce my results.

So now the part you have been waiting for…

The analysis

To start, please use a refresher on the Pearson Correlation Coefficient from my last blog post, or Rand’s.

1. Time and performance

I started with a question: “Do blogs age like a Macallan 18 served up neat on a warm summer Friday afternoon, or like tepid milk on a hot summer Tuesday?

Does the time indexed play a role in how a piece of content performs?

Correlation 1: Time and target keyword position

First we will map the target keyword ranking positions against the number of days its corresponding blog has been indexed. Visually, if there is any correlation we will see some sort of negative or positive linear relationship.

There is a clear negative relationship between the two variables, which means the two variables may be related. But we need to go beyond visuals and use the PCC.

Days live vs. target keyword position

PCC

-.343

Relationship

Moderate

The data shows a moderate relationship between how long a blog has been indexed and the positional ranking of the target keyword.

But before getting carried away, we shouldn’t solely trust one statistical method and call it a day. Let’s take a look at things another way: Let’s compare the average age of articles whose target keywords rank in the top ten against the average age of articles whose target keywords rank outside the top ten.

Average age of articles based on position

Target KW position ≤ 10

144.8 days

Target KW position > 10

84.1 days

Now a story is starting to become clear: Our newly written content takes a significant amount of time to fully mature.

But for the sake of exhausting this hint, let’s look at the data one final way. We will group the data into buckets of target keyword positions, and days indexed, then apply them to a heatmap.

This should show us a clear visual clustering of how articles perform over time.

This chart, quite literally, paints a picture. According to the data, we shouldn’t expect a new article to realize its full potential until at least 100 days, and likely longer. As a blog post ages, it appears to gain more favorable target keyword positioning.

Correlation 2: Time and total ranking keywords on URL

You’ll find that when you write an article it will (hopefully) rank for the keyword you target. But often times it will also rank for other keywords. Some of these are variants of the target keyword, some are tangentially related, and some are purely random noise.

Instinct will tell you that you want your articles to rank for as many keywords as possible (ideally variants and tangentially related keywords).

Predictably, we have found that the relationship between the number of keywords an article ranks for and its estimated monthly organic traffic (per SEMrush) is strong (.447).

We want all of our articles to do things like this:

We want lots of variants each with significant search volume. But, does an article increase the total number of keywords it ranks for over time? Let’s take a look.

Visually this graph looks a little murky due to the existence of two clear outliers on the far right. We will first run the analysis with the outliers, and again without. With the outliers, we observe the following:

Days live vs. total keywords ranking on URL (w/outliers)

PCC

.281

Relationship

Weak/borderline moderate

There appears to be a relationship between the two variables, but it isn’t as strong. Let’s see what happens when we remove those two outliers:

Visually, the relationship looks stronger. Let’s look at the PCC:

Days live vs. total keywords ranking on URL (without outliers)

PCC

.390

Relationship

Moderate/borderline strong

The relationship appears to be much stronger with the two outliers removed.

But again, let’s look at things another way.

Let’s look at the average age of the top 25% of articles and compare them to the average age of the bottom 25% of articles:

Average …

You can read the full article at Moz Blog

http://seopti.com/wp-content/uploads/2013/07/moz-logo.pnghttp://seopti.com/wp-content/uploads/2013/07/moz-logo.pngAdminSEOMOZ
Posted by Jeff_BakerGrab yourself a cup of coffee (or two) and buckle up, because we’re doing maths today. Again. Back it on up...A quick refresher from last time: I pulled data from 50 keyword-targeted articles written on Brafton’s blog between January and June of 2018. We used a technique of writing these...

You might also like to read:

Search Engine Journal

Social Media Surpasses Traditional Newspapers as a Primary News Source [STUDY] by @MattGSouthern

Search Engine Land

SearchCap: Google Search Console report change, Google Shopping expands & SinglePlatform with Moz

Search Engine Journal

The Top 7 Things I’ve Learned About SEO This Year by @Stevenvvessum

Leave a Reply

Your email address will not be published. Required fields are marked *