Introduction

All studies can be manipulated, but SEO studies are especially susceptible to flawed logic. Anyone can look at Google search results and find some patterns and correlations, but a purely observational study can draw dangerously inaccurate conclusions. Most studies in the SEO industry are of this nature, and many are poorly disguised advertisements for AI-SEO tools.
In this article, I will critique four AI-SEO studies related to the ranking potential of AI-generated content. I’ll combine my 20+ years of SEO experience with a hefty dose of critical thinking to pick apart what’s worth relying on and what deserves no weight at all.
Conclusion (TL;DR)
The “Too-Long, Didn’t Read” version of my findings is:
- All published AI-SEO studies that I could find are flawed.
- Google tells us what’s okay and what’s not okay with regard to publishing AI-generated content.
- Forget what the studies say, just follow Google’s guidelines (and my “Level 3” framework), and you’ll be fine!
Those are the key takeaways, but read on if you want to be able to pick apart a study like a fiery cross-examiner!
High-Level Summary of the Four AI-SEO Studies I Reviewed
Here’s a quick overview of the studies that I reviewed, which will be followed by more in-depth analysis.
Study #1: SEMRush: “Can AI Content Rank on Google?” (2024)1
Their Claim: “Is it safe to use AI for SEO and content creation? The answer is yes. But it’s not that simple.”
What They Did: Analyzed 20,000 articles and surveyed 700+ marketers.
Major Flaws:
- Only analyzed top 20 ranking URLs (survivorship bias – ignored all the AI content that failed)
- Used GPTZero AI detector with no error adjustment
- Didn’t control for domain authority, backlinks, or other variables
- Survey had no credential verification or proof of results
- Published by SEMRush, a company that sells AI-SEO tools (conflict of interest)
Verdict: Deeply flawed, essentially a marketing piece disguised as research.
Study #2: Ahrefs: “AI-Generated Content Does Not Hurt Your Google Rankings” (2024)2
Their Claim: “Google neither punishes nor rewards AI content”
What They Did: Analyzed 600,000 pages from top 20 ranking URLs, surveyed 879 marketers.
Major Flaws:
- Only analyzed top 20 ranking URLs (same survivorship bias)
- Used their own AI detector tool and admitted it’s not perfect
- Misapplied linear correlation (0.011) to claim proof of no relationship
- Contradicted themselves by noting that position #1 has less AI than positions 2-20 while also claiming that Google neither punishes nor rewards it
- Published by Ahrefs to promote their own AI detection tool
Verdict: Deeply flawed, another poorly disguised advertisement.
Study #3: Graphite.io: “AI Content in Search & LLMs” (2024-2025)3
Their Claim: Despite making up over 50% of published articles, AI content represents only 14% of Google rankings.
What They Did: Analyzed 31,493 keywords across 10 categories using matched-pair comparison and Wilcoxon Signed-Rank Test.
What They Did Right:
- Avoided survivorship bias by randomly selecting 65,000 URLs from CommonCrawl (both ranking and non-ranking)
- Quantified and adjusted for AI detector error rates (4.2% false positives, 0.6% false negatives)
- Used proper statistical testing (p-value < 0.000001)
- Keyword-by-keyword comparison (apples-to-apples)
What They Still Got Wrong:
- Still used an AI detection tool (but adjusted for error rate)
- Included “moderate confidence” results, not just “high confidence”
- Didn’t control for domain authority and other variables
- Doesn’t address human-edited AI content (hybrid approach)
Verdict: Best of the four, but still imperfect. Has some scientifically solid nuggets.
Study #4: NP Digital: “Performance Engagement Analysis” (2024)4
Their Claim: Human-written content received 5.44x more traffic than AI-generated content over a 5-month period.
What They Did: Tracked 744 articles across 68 websites for 5 months using Google Analytics traffic.
What They Did Right:
- Knew with certainty which content was AI vs. human (they created it themselves – no AI detector needed)
- Measured over time (5 months), with all articles having the same publication date
- Used actual traffic data, not just rankings
What They Still Got Wrong:
- Small sample size (744 articles, 68 websites)
- Unclear if they used matched-pair comparison
- Didn’t specify verticals or domain authority comparability
- Limited methodological transparency
- Doesn’t address human-edited AI content (hybrid approach)
Verdict: Better than SEMRush/Ahrefs due to certainty about content source and time-based tracking, but lacks transparency and controls.
The 8 Deadly Flaws of AI-SEO Studies
Flaw #1: Survivorship Bias (Only Analyzing Winners)
What is Survivorship Bias?
Survivorship bias can occur when researchers focus only on successful outcomes, ignoring failures. Research studies about this type of bias have found that “when you look only at survived funds or markets, it can appear that winners repeat, even when they do not” (Brown et al., 1992)5 and that this can result in “overly optimistic interpretations” (Czeisler et al., 2021).6
How Do AI-SEO Studies Fall Victim to Survivorship Bias?
Several of these studies fell victim to exactly this. Because they only looked at top-ranked pages, they cannot answer critical questions such as:
- What percentage of AI-generated articles fail to rank at all, or rank very low?
- What if only a tiny fraction of AI-generated articles actually rise to the top?
When AI-SEO studies exclusively hand-pick success stories, they fail to show the whole picture and end up concluding that AI-generated content is “safe” simply because in some cases (unknown how frequently or infrequently), it succeeded.
That’s like studying a financial investment strategy by analyzing only the portfolios of investors who succeeded with that strategy, while completely ignoring the investors who went bankrupt using the same strategy.
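To make the sampling problem concrete, here’s a tiny simulation I threw together. Every number in it is an illustrative assumption (not a figure from any of the four studies), but it shows how measuring AI usage only among top-ranked “survivors” can completely hide a massive failure rate:

```python
import random

random.seed(42)

# Hypothetical population of 100,000 published articles.
# The 50% AI share and the 2% / 10% ranking odds are made-up assumptions
# for illustration only, not figures from any of the reviewed studies.
articles = []
for _ in range(100_000):
    is_ai = random.random() < 0.50
    ranks_top_20 = random.random() < (0.02 if is_ai else 0.10)
    articles.append((is_ai, ranks_top_20))

# What a "top 20 URLs only" study sees: just the survivors.
survivors = [a for a in articles if a[1]]
ai_share_among_survivors = sum(a[0] for a in survivors) / len(survivors)

# What it never sees: how often AI articles failed to rank at all.
ai_articles = [a for a in articles if a[0]]
ai_failure_rate = 1 - sum(a[1] for a in ai_articles) / len(ai_articles)

print(f"AI share among top-20 survivors: {ai_share_among_survivors:.0%}")
print(f"AI articles that never made the top 20: {ai_failure_rate:.0%}")
```

In this made-up world, a survivors-only analysis happily reports that plenty of ranking pages contain AI content, while never learning that roughly 98% of the AI articles went nowhere.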
Survivorship Bias in These AI-SEO Studies
SEMRush Study: Strong Survivorship Bias
This one fails the Survivorship Bias test. They only analyzed URLs that rank in the top 20 results (the first two pages of Google search results). It’s also notable that their article type filtering methodology was not disclosed, but what is clear is that they completely ignored the AI-generated content failures, focusing only on what could be edge cases of success.
Ahrefs Study: Strong Survivorship Bias
The Ahrefs study also failed this test. They state that they “took 100,000 random keywords from Ahrefs Keywords Explorer and extracted the top 20 ranking URLs.” Again, how can we know what doesn’t work when you’re only analyzing what did work? This leaves out a critical control group of AI-generated content that didn’t rank well.
Graphite.io Study: Solidly Attempted to Avoid Survivorship Bias
This study aimed to avoid the pitfall of Survivorship Bias by randomly selecting 65,000 URLs from the CommonCrawl archive. This theoretically includes both winners and losers (URLs that rank well and those that don’t); however, it’s worth noting that this archive is not a perfectly randomized sample of all web content. There may be a skew one way or another (likely towards ranking content), but at least this isn’t as blatantly flawed as the other two.
NP Digital Study: Unclear, But Unlikely to Include Strong Survivorship Bias
This study was very different in nature from the others in that it didn’t focus on selecting preexisting URLs. They created content from scratch for this experiment, so it’s unlikely that Survivorship Bias was a factor. However, it is possible that the domains they published the content on, which were not disclosed, had some previous positive ranking history that could have helped the content along.
Flaw #2: Use of Unreliable AI Content Detection Tools
What is an AI Content Detection Tool?
AI content detection tools exist to attempt to identify AI-generated text patterns in written content. They essentially reverse-engineer what an LLM (Large Language Model) might say if it had been the one to generate the text.
LLMs are, at their core, text prediction tools: when you talk to an AI chatbot, it is predicting the most statistically likely sequence of words about the topic. AI content detectors aim to recognize these predictable patterns to help determine whether a human wrote the content or an LLM did.
Unreliability of AI Detection Tools
Peer-reviewed research, as well as my own firsthand experience, has shown significant reliability problems with AI content detection tools.
- Accountability in Research found “problematic false positive and false negative rates from both free and paid AI detection tools” after testing 100 research articles from 2016-2018 plus 200 AI-generated texts (Popkov et al., 2025).7
- Stanford University research found that AI detectors misclassified over 61% of essays written by non-native English speakers as AI-generated (Liang et al., 2023).8
- A University of Pennsylvania RAID Benchmark study of 10 million AI-generated texts found that many open-source models for detecting AI content use “dangerously high default false positive rates.” Adjusting models’ false-positive rates to “reasonable” levels greatly reduced their ability to detect AI content (Callison-Burch, 2024).9
- Turnitin (a plagiarism detection platform used by universities) claims to have a false positive rate of less than 1%10, but admits to missing 15% of actual AI content (false negatives).11 Independent testing of this tool by The Washington Post found a 50% false positive rate.12
- The WalterWrites AI Detector Analysis found that AI detectors perform at anywhere from 55% to 97% accuracy, and that accuracy drops by 20%+ on paraphrased text. GPTZero was noted as having the “most true false positives” (WalterWrites AI Detector Analysis, 2025).13
My Experience
Personally, I have experienced fully inverted results using Originality.ai. The content behind this “97% original” result was actually almost entirely AI-written. And get this… the one human-written sentence was the one part that got flagged as AI!

Any study that relies on an AI content detection tool should be considered severely flawed.
Use of AI Content Detection in These AI-SEO Studies
SEMRush Study: Includes AI-Content Detection
The SEMRush study fails here by using GPTZero, an AI content detection tool which the WalterWrites AI Detector Analysis (2025) found to have the “most true false positives,” to categorize content as either “AI-generated” or “Human” with no error rate adjustment.
Ahrefs Study: Includes AI-Content Detection
This “study” is actually a poorly disguised advertisement for their own AI content detection tool. The authors even admit to these tools being imperfect, yet proceeded to base all of their “research” conclusions on it.
Graphite.io Study: Included AI-Content Detection, But Did Adjust for Error Rate
This study is innately flawed due to the use of an AI content detection tool, but they did at least attempt to measure and adjust for the inevitable error rate that such tools introduce.
This study established a baseline for human-written content by using articles published between January 2020 and November 2022 (before the launch of ChatGPT). They ran those articles through the SurferSEO AI detector and encountered a false positive rate of 4.2%. They also created a controlled dataset of 6,009 AI-generated articles using GPT-4o and found the detector correctly identified 99.4% as AI (false negative rate of 0.6%). They used a weighted chunk methodology to adjust their results based on these error margins, something that the SEMRush and Ahrefs studies most certainly didn’t do.
However, this isn’t a perfect solution in this case, since the AI-generated content on the web was produced by a wide variety of models, not just GPT-4o. I also take issue with the fact that they chose to include content that the detector rated with “moderate” confidence, instead of just “high” confidence, and it’s notable that Jasper was an AI content production tool active on the market before ChatGPT was launched. So it’s not scientifically solid to conclude that the web was AI-free prior to ChatGPT’s launch.
But again – honorable mention here for attempting to mitigate the inaccuracies of the AI detection tool and being super transparent about how they did so.
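For the curious, here’s roughly what that kind of error-rate adjustment looks like mechanically. This is a minimal sketch of the textbook misclassification correction (sometimes called the Rogan-Gladen estimator), not Graphite’s exact weighted-chunk method; the 4.2% and 0.6% error rates come from their write-up, while the 20% observed AI share is just a placeholder:

```python
def correct_for_detector_error(observed_ai_share: float,
                               false_positive_rate: float,
                               false_negative_rate: float) -> float:
    """Estimate the true AI share from a detector's raw output.

    Solves observed = true * sensitivity + (1 - true) * false_positive_rate
    for the true proportion, then clamps it to a valid range.
    """
    sensitivity = 1 - false_negative_rate  # share of AI pages correctly flagged
    corrected = (observed_ai_share - false_positive_rate) / (sensitivity - false_positive_rate)
    return min(max(corrected, 0.0), 1.0)

# Error rates reported by Graphite.io; the observed share is a placeholder.
print(correct_for_detector_error(observed_ai_share=0.20,
                                 false_positive_rate=0.042,
                                 false_negative_rate=0.006))  # ~0.166
```

The point is simply that a detector’s raw output misstates the true AI share unless you back out its error rates, which the SEMRush and Ahrefs studies never did.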
NP Digital Study: Avoided AI-Content Detection
Since the NP Digital team generated the content themselves, they had full control over, and knowledge of, which content was AI-generated and which was not. This completely avoids the issues that AI content detection tools bring to AI-SEO study results; the many flaws of those tools were not a factor in this study at all.
Flaw #3: Correlation Does Not Equal Causation
What’s the Difference Between Correlation and Causation?
The human brain is naturally wired toward finding patterns and assuming that one thing caused the other. Sometimes these observed patterns are just coincidences, and sometimes they are driven by an entirely different third factor. Sometimes there is causation involved, but it’s dangerous to assume that A caused B when it could have been B that caused A.
Correlation vs. Causation in SEO
In SEO, there are thousands and thousands of different ranking factors, so looking at two data points and concluding that one caused the other is dangerous. The reality is probably that A wasn’t just caused by B, but also by some unique combination of C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T, U, V, W, X, Y, and Z!
If anyone ever doubts you on the power of this logical fallacy, point them to this website: https://www.tylervigen.com/spurious-correlations. In addition to being thoroughly amusing, this site highlights unexpected correlations that have absolutely no causal relationship between them, such as:
Associate’s degrees awarded in Dental assisting correlate with Wins for the Baltimore Orioles:

The distance between Saturn and the Sun correlates with Baidu’s stock price (BIDU):

And my personal favorite: Per capita cheese consumption correlates with the Number of people who died by becoming tangled in their bedsheets:

Correlation vs. Causation Example
Those are quite funny, but here’s a serious one to exemplify why we should not draw conclusions based upon two factors alone:
- Ice cream sales and drowning deaths are strongly correlated. When ice cream sales go up, drowning deaths increase.
- Does ice cream cause drowning? No. Both are caused by a third factor: warm weather.
- People buy more ice cream when it’s hot, and more people swim (and some of them drown) when it’s hot.
When these studies conclude that AI-generated content will cause a rankings increase, or that it won’t cause a rankings decrease, it is the same as concluding that ice cream causes drowning!
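If you’d like to see the confounder effect with your own eyes, here’s a quick simulation of the ice cream analogy (all of the numbers are invented for illustration, nothing more):

```python
import random
from statistics import correlation  # Python 3.10+

random.seed(7)

ice_cream_sales, drownings = [], []
for _ in range(365):
    temperature = random.gauss(18, 8)  # the hidden third factor: warm weather
    ice_cream_sales.append(50 + 10 * temperature + random.gauss(0, 20))
    drownings.append(max(0.0, 0.3 * temperature + random.gauss(0, 2)))

# Neither variable causes the other, yet the correlation is strong,
# because temperature drives both of them.
print(f"correlation = {correlation(ice_cream_sales, drownings):.2f}")
```

Ice cream never touched a swimmer, yet the correlation comes out strong, and that’s exactly the trap an observational AI-SEO study falls into when it ignores confounders like domain authority.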
Correlation vs. Causation in These AI-SEO Studies
SEMRush Study: Has Correlation / Causation Issues
The SEMRush study fails here. There were no controlled variables at all, other than the fact that all of the websites were already performing well in search.
Concluding that AI is safe because these sites use it is like watching Olympic swimmers eat ice cream and concluding that ice cream makes you a fast swimmer. It ignores the real cause of the success: the years of authority-building and trust these sites already established. They didn’t win because of the AI; they won because they were already winners, and the AI was just along for the ride.
Without controlling variables such as domain authority, site age, brand reputation, site technical health, site size, etc., there’s no way to scientifically prove that A caused B. Any comparison is “apples to oranges” unless you normalize for those other factors.
Just because SOME websites with AI-generated content rank well in Google does not mean that YOUR website will. Does your website have the same domain age, popularity, topic/vertical, inbound links, etc. as the websites in these studies? If not, then ignore them!
Ahrefs Study: Has Correlation / Causation Issues
The study focuses on content type but does not assess the impact of the domain’s power. A high-authority site ranking with an AI article might be doing so because of its existing backlink profile and/or brand reputation, not the quality of the AI text itself. If a lesser-known brand created the same exact article, it may not rank as well.
Since a more popular site is likely to be able to “get away with” more transgressions than a less popular site, and since they only analyzed top-ranking URLs, it is irresponsible for this study to conclude that AI-generated content would be a “safe” move for any site that isn’t amongst the most popular sites on the web.
Graphite.io Study: Has Correlation / Causation Issues
This study noted, but didn’t adjust for, the fact that other factors like domain authority are at play. The element of comparing ranking URLs on a keyword-by-keyword basis is more logically sound than randomly evaluating the amount of AI usage in top-ranking URLs, but they still didn’t control for confounding variables.
NP Digital Study: (Presumably) Has Correlation / Causation Issues
The study did not clarify what verticals or other constants were or were not in the mix. Were the sites and articles all within one industry? Were they all similar in domain authority and age? Many questions are left unanswered in this regard.
Flaw #4: Flawed Survey Data
Relying on poorly conducted surveys to draw scientific conclusions is an irresponsible method of “research” (air quotes intentional).
Without controls such as credential verification, minimum years of experience, and so on, all you’re getting is someone’s word. And why should you take their word for it? Surveys where statements are not quantified, and where participants may be financially or otherwise motivated to exaggerate their answers, have no place in a research study.
SEMRush Study: Has Flawed Survey Data
SEMRush’s survey of 700+ marketers is indeed problematic.
- What are the credentials of these “marketers”?
- Have they been in the industry for more than a day?
- Have they worked with more than a single website?
- Do they work for AI-content-generation software companies?
- Are they compelled to claim that what they’re doing with AI-generated content is working because they really want (or need) that to be true?
- When they report that they are seeing “positive results,” what defines a “positive result”? Is a single additional organic session month-over-month considered a “positive result”?
Ahrefs Study: Has Flawed Survey Data
This study refers to a separate survey of 879 marketers regarding their use of AI in content generation. They point to the fact that 87% of surveyed marketers “use” AI, seemingly to imply that the high rate of usage makes it okay and safe. But it doesn’t clarify what “using” AI means. It’s one thing to “use” AI to check the spelling and grammar of human-written content; it’s another thing entirely to write a whole article with AI.
Graphite.io & NP Digital Studies: Avoided Survey Data
These two studies did not include survey data, which, in my opinion, adds to their credibility. Surveys are fine if they are labeled as such, but to spin a “survey” as a “study” doesn’t sit right with me.
Flaw #5: Conflicts of Interest
Wouldn’t you be skeptical of a study published by a protein powder company saying that protein powder is the best nutritional product in the world? Of course you would.
Yet, somehow, the opposite happens in our industry. SEO software providers are immediately trusted as industry experts. Yes, of course, they do know a lot about the industry; however, when they publish a “study,” their motives should be taken into consideration.
SEMRush Study: Strong Conflict of Interest
SEMRush’s inclusion of a survey of 700+ of their users could be considered a subtle marketing brag, but more problematic are the blatant advertisements for their product that are embedded inside the article. On top of that, they are also running their “study” as an ad on Google, which should set off conflict-of-interest alarm bells.
If an SEO company published a study, always ask yourself – “Is this a research study or an advertisement?”
In this case, it’s quite literally an advertisement on Google Ads.

Ahrefs Study: Strong Conflict of Interest
This one is a very poorly disguised attempt at an advertisement for their AI content detection tool. They don’t even attempt to hide it.
Graphite.io Study: Light Conflict of Interest
The fact that they are an SEO agency introduces the conflict of interest angle by default, but they did not attempt to be salesy at all in their study. They didn’t even include a call to action at the end of it. This demonstrates much more professionalism and neutrality than the SEMRush and Ahrefs studies did.
NP Digital Study: Moderate Conflict of Interest
As an SEO agency and software provider, the authorship of this study also includes a conflict of interest by default, but the alarm bells are louder on this one than the Graphite one. They use this “study” to promote work that they did and even reference how they won an award for it. Can’t blame them, but it does notch down the professionalism of this one when looking at it purely through a research lens and comparing it to how Graphite did theirs.
Flaw #6: Weak or Misapplied Statistical Methodologies
SEMRush Study: No Use of Statistical Tests At All
This study is purely observational. They quantified their observations, but made zero attempt to apply any tests for statistical significance.
Ahrefs Study: Misuse of Linear Correlation
This study applied a scientific-sounding measurement (linear correlation) to “prove” that there is no relationship between AI content and rankings. The problem is that linear correlation is a poor tool for penalty detection, because Google’s penalties are typically threshold-based and so many other factors are in the mix.
A small amount of a transgression does not immediately trigger a penalty, and a large amount of domain or brand authority can actually change the threshold. A more popular site can “get away with” more transgressions than a less popular site. The fact that they only analyzed top-ranking URLs essentially means that they only analyzed the domains that can “get away with” just about anything.
This in no way proves that AI-generated content wouldn’t be a negative factor for a site that isn’t already performing strongly. The box plot shown in this study is a view into what happens when your site is already doing great and you sprinkle some AI-generated content into the mix. It does not assess the risk of penalty or demotion for sites that aren’t already fully trusted by Google’s algorithm.
It’s troubling that they try to spin this inaccurate analysis as evidence that “Google neither significantly rewards nor penalizes pages just because they use AI.” If Google’s treatment of AI content involves threshold effects (e.g., penalties triggered only once the AI potency crosses a certain level), and especially if other factors can shift those thresholds up or down, linear correlation would fail to detect it.
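To show why a threshold-style penalty can hide behind a near-zero correlation, here’s a simple simulation. Every parameter in it is an assumption I made up for illustration; it is not a model of Google’s actual algorithm:

```python
import random
from statistics import correlation  # Python 3.10+

random.seed(1)

pages = []
for _ in range(5_000):
    authority = random.random()   # hidden confounder: domain strength
    ai_share = random.random()    # fraction of the page that is AI-written
    score = 60 * authority + random.gauss(0, 5)
    # Assumed threshold penalty: heavy AI use only hurts weak domains.
    if ai_share > 0.7 and authority < 0.3:
        score -= 40
    pages.append((ai_share, score))

# Mimic a "top-ranking URLs only" sample by keeping the strongest 10% of pages.
cutoff = sorted((s for _, s in pages), reverse=True)[len(pages) // 10]
survivors = [(a, s) for a, s in pages if s >= cutoff]

ai_values = [a for a, _ in survivors]
scores = [s for _, s in survivors]
print(f"linear correlation among survivors: {correlation(ai_values, scores):.3f}")
```

The penalty in this toy world is real and brutal for weaker sites, yet the survivors-only correlation lands right around zero, which is exactly the kind of number the Ahrefs study presents as proof of indifference.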
This study also contradicts itself. While claiming that Google is completely indifferent to AI-generated content, it also notes that position one tends to have less AI than positions two through twenty.
Graphite.io Study: Solid Application of Statistical Testing
A matched-pairs statistical analysis, the Wilcoxon Signed-Rank Test, was used to measure the difference in performance between human-written content and AI-generated content for the same keyword. They analyzed 31,493 keywords across 10 categories, and the element of comparing ranking URLs on a keyword-by-keyword basis is more logically sound than randomly evaluating the amount of AI usage in top-ranking URLs.
This is the clearest apples-to-apples methodology I’ve seen to date in an AI-SEO study. The extremely low p-value (< 0.000001) indicates that the ranking difference is not due to chance.
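If you’ve never run one yourself, here’s a minimal sketch of what a matched-pair Wilcoxon comparison looks like in Python with SciPy. The position numbers are invented placeholders (one human/AI pair per keyword), not Graphite’s actual data:

```python
from scipy.stats import wilcoxon

# Hypothetical average ranking positions for matched human vs. AI articles,
# one pair per keyword (lower = better). Placeholder values only.
human_positions = [2.0, 3.0, 1.5, 4.0, 5.5, 2.5, 6.3, 3.5, 7.0, 2.8, 4.8, 6.2]
ai_positions    = [2.5, 4.0, 3.0, 6.0, 8.0, 5.5, 6.0, 7.0, 11.0, 4.0, 7.0, 7.0]

# The Wilcoxon Signed-Rank Test works on the paired per-keyword differences
# and makes no normality assumption, which suits noisy ranking data.
stat, p_value = wilcoxon(human_positions, ai_positions)
print(f"W = {stat}, p = {p_value:.4f}")
```

Because every comparison is made within the same keyword, differences in search intent and competition largely cancel out, which is what makes this approach far more apples-to-apples than a raw top-20 scan.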
NP Digital Study: Unclear, and Therefore Unlikely That Statistical Testing Was Used
This study quantified their observations but didn’t reference any statistical tests.
Flaw #7: Lack of Time-Based Analysis
One thing we know for sure about SEO is that sometimes things work right away and for some time, and then fall off, either dramatically after an algorithm update or through slow erosion over time.
Any study that only looks at one snapshot in time is limited in its application to reality, and any ranking study that doesn’t take the age of the content into account should be scrutinized heavily as well.
It’s dangerous to rely on snapshot-in-time studies that claim that it is “safe” to use AI to generate website content. All but one (NP Digital) of these four studies failed to look beyond a single moment in time.
Potency of AI-Generated Content by URL vs Domain
Another big miss shared by all of the studies concerns the proportion of AI-generated content across the domain as a whole. They all ran URL-specific tests, but many of Google’s ranking factors are domain-wide.
Many of the sites that got hit by the March 2024 “unoriginal content” update had a high potency of AI-generated content across their domains. Multiple case studies have documented this, including:
AI-Generated Content Traffic Loss Over Time – Case Studies
- Bonsai Mary Experiment: 95% Traffic Loss After Algorithm Update
In this experiment, SEO expert Jesse Cunningham created a website filled entirely with AI-generated content about houseplants with no human oversight. The website experienced rapid growth starting in December 2023, but saw a 95% decrease in organic traffic after Google’s March 2024 core update (WebFX, 2025).14
- Casual.app Experiment: 99.3% Traffic Loss After Algorithm Update
This experiment used an AI-powered tool (Byword) to mass-produce thousands of AI-generated pages in just six months. Initially, the site saw traffic surges, but lost 99.3% of its traffic after Google’s November 2023 core update focused on E-E-A-T signals (TripleDart, 2025).15
- SE Ranking Experiment: Near-100% Loss Over Three Months
This experiment published 2,000 AI-generated articles across 20 brand-new domains in November 2024. Initial results showed a 70.95% indexation rate and 122,000 impressions by December 2024. However, by February 2025, all articles had completely disappeared from search results (SE Ranking, 2025).16
- Lawn Care Website Case Study: 100% Traffic Decrease After Algorithm Update
In this case, nearly all articles were created with AI tools with no human oversight. After Google’s September 2023 core update addressing content helpfulness, the website saw a continual drop in performance, ultimately experiencing a 100% decrease in organic traffic (WebFX, 2025).17
Not Worth the Risk
Google is consistently tweaking its algorithms to continue emphasizing the importance of originality in content. They’ve demonstrated this already, and it’s entirely possible that they will continue this pattern, becoming more and more strict about AI-generated content over time.
Case-in-Point: The March 2024 Core Update
The March 2024 core update, which aimed to reduce “unoriginal content” by 40%, actually ended up doing so by 45% and absolutely decimated some sites that heavily relied upon AI to generate a majority of their content.
The update introduced new spam policies specifically addressing “scaled content abuse,” defined as creating content “with little effort or originality with no editing or manual curation,” with generative AI mentioned as one example of automated tools used for this purpose (Google Search Central, 2024).18
Independent analysis by Ian Nuttall found that 1.7% of tracked websites (837 sites) were completely deindexed from Google’s search results after this update, representing over 20 million monthly organic visits (Search Engine Journal, 2024).19 Many of these sites heavily relied upon AI to generate a majority of their sites’ content.
For all of my 20+ years in SEO, I’ve been preaching that some risks just aren’t worth taking. What works now may not work tomorrow, so it’s safer to place your investment in “by the books” white hat SEO tactics that will stand the test of time.
Flaw #8: Lack of Clarity On Hybrid Content
The Graphite and NP Digital Studies effectively prove that “raw AI slop” loses to human-generated content, but what about hybrid content?
What Amount of Human Editing Makes AI-Generated Content Acceptable?
Unfortunately, the very nature of hybrid content makes it difficult to study. What amount of human editing tips the scales from “raw AI slop” to “human-edited”? Does inserting one sentence do the trick? Or do you have to edit 80 to 90% of it? I’ve been unable to find any studies to date that effectively provide insight into this.
Some studies choose arbitrary ranges of the AI-generated portion (such as 0-30%, 30-70%, 70-100%), but those studies were on a URL-by-URL basis and involved error-prone AI content detectors. We as marketers need to know a “safe” percentage of AI-generated versus human copy for both a single URL and a whole site, but that information simply isn’t available to us at this time.
I doubt we ever will get clarity on this, either. The very nature of Google’s algorithms is that there are thousands of factors that all impact each other. Some sites are likely to be granted a different tolerance level for the use of AI because of their strength of domain metrics, or their age, or their reputation, or their vertical/topic. Others will be treated with much more scrutiny.
However, we do already have clarity on how Google DOESN’T want you to use AI and how they want you to self-assess your content.
What Google Actually Says About AI-Generated Content
Google tells us quite clearly what’s NOT okay to do with AI-generated content in their Quality Rater Guidelines.
What Are Google’s Quality Rater Guidelines (QRG)?
Google’s Quality Rater Guidelines20 are what human testers use to assess the effectiveness of its algorithm. Google constantly tweaks that algorithm to try to satisfy these guidelines. Therefore, common sense says that even if the algorithm doesn’t work precisely in line with these guidelines as of today, this is what Google is aiming for, and this is what we should do.
In January of 2025, Google updated the Quality Rater Guidelines to address AI-generated content.
Relevant Excerpts from Google’s Updated-for-AI Quality Rater Guidelines
Section: 3.2 Quality of the Main Content
“The quality of the Main Content (MC) is one of the most important considerations for PQ rating. The MC plays a major role in determining how well a page achieves its purpose.
The unifying theme for evaluating the quality of the MC is the extent to which the MC allows the page to achieve its purpose and offers a satisfying user experience. For most pages, the quality of the MC can be determined by the amount of effort, originality, and talent or skill that went into the creation of the content.
- Originality: Consider the extent to which the content offers unique, original content that is not available on other websites. If other websites have similar content, consider whether the page is the original source.
- Talent or Skill: Consider the extent to which the content is created with enough talent and skill to provide a satisfying experience for people who visit the page.
- Accuracy: For informational pages, consider the extent to which the content is factually accurate. For pages on YMYL topics, consider the extent to which the content is accurate and consistent with well-established expert consensus.”
Section 4.6.5: Scaled Content Abuse
“Creating an abundance of content with little effort or originality with no editing or manual curation is often the defining attribute of spammy websites. Scaled content abuse is a spam practice described in the Google Search Web Spam Policies. Scaled content abuse occurs when many pages are generated for the purpose of primarily benefiting the website owner and not helping users. This practice is typically focused on creating large amounts of unoriginal content that provides little to no value for website visitors compared to other similar pages on the web, no matter how it’s created.
Examples of scaled content abuse include:
- Using automated tools (generative AI or otherwise) as a low-effort way to produce many pages that add little-to-no value for website visitors as compared to other pages on the web on the same topic.
Pages and websites made up of content created at scale with no original content or added value for users, should be rated Lowest, no matter how they are created.”
Section 4.6.6: MC Created with Little to No Effort, Little to No Originality, and Little to No Added Value for Website Visitors
“The Lowest rating applies if all or almost all of the MC on the page (including text, images, audio, videos, etc) is copied, paraphrased, embedded, auto or AI generated, or reposted from other sources with little to no effort, little to no originality, and little to no added value for visitors to the website. Such pages should be rated Lowest, even if the page assigns credit for the content to another source.”
Google’s Self-Assessment Questions
The following questions are provided by Google21 to help you assess if what you’re doing with AI is “okay” or not. Well, technically, they are for evaluating whether your content is “helpful” or not, but same difference.
Always evaluate your content drafts according to these questions:
- Does the content provide original information, reporting, research, or analysis?
- Does the content provide insightful analysis or interesting information that is beyond the obvious?
- Does the content provide substantial value when compared to other pages in search results?
- Is the content produced well, or does it appear sloppy or hastily produced?
- Would you expect to see this content in or referenced by a printed magazine, encyclopedia, or book?
- Does the content present information in a way that makes you want to trust it?
- Is this content written or reviewed by an expert or enthusiast who demonstrably knows the topic well?
A Simple and Scientific Method for Measuring Content Originality
At this point, you’re probably thinking, “Okay, I get it – Google wants me to demonstrate originality in my content. But how exactly do I do that? How do I know when I’ve added enough of it?”
Introducing “The Level 3 Content Originality Framework”

As a complete and total nerd, I was craving a scientific way to measure content originality in much the same way a search algorithm would. I spent some time studying cognitive linguistics and found the answer. With that, I developed a framework to help our SEO clients clearly understand what is and isn’t appropriate in their use of AI for copywriting.
This greatly simplifies the practice of self-assessing whether or not your content contains enough human input to succeed in SEO, and you can read more about it in this article.
Conclusion: Rely On Common Sense, Not Flawed Studies
AI-SEO studies are innately flawed, to varying degrees. The good ones that use solid scientific methods and avoid bias are worth a read. Apply a healthy dose of skepticism and only hang onto the truly scientifically solid nuggets.
But above all else, just use common sense. Google says not to use AI to write “all or almost all” of your content, to avoid “little to no” originality, and to make an effort to demonstrate your real-world human experience.
It’s really quite simple. So stop reading AI-SEO studies and just do what Google says!
And if you want a scientific way to measure the originality of your content, simply refer to our science-based Level 3 Content Originality Framework.
Bonus Tip! Always Disclose AI Usage
If you use AI to help write portions of your content, you should disclose how and why and for what parts. This is an emphasis in Google’s Search Quality Rater Guidelines.
The idea here is to make it clear what AI did (estimate, summarize, format, suggest), clarify what the human did (authoring, reviewing, analyzing, providing opinions), and add disclaimers about accuracy or intent where needed (e.g., medical, legal, nutritional).
For example, here is my own actual AI Usage Disclaimer for this very article!
AI Usage Disclaimer for this Article
The majority of this article was typed with Pam Aungst Cronin’s own human fingers; however, she did lightly incorporate AI to enhance the article in the following ways:
- To perform spelling and grammar checks
- To establish a logical heading/subheading structure
- To find reputable sources to cite in footnotes
- To generate the included analogies and examples
Reference Material for This Article
1. https://www.semrush.com/content-hub/can-ai-content-rank-on-google/
2. https://ahrefs.com/blog/ai-generated-content-does-not-hurt-your-google-rankings/
3. https://graphite.io/five-percent/ai-content-in-search-and-llms
4. https://neilpatel.com/blog/ai-create-content/
5. https://academic.oup.com/book/52483/chapter/421376585
6. https://pmc.ncbi.nlm.nih.gov/articles/PMC8207539/
7. https://pubmed.ncbi.nlm.nih.gov/38516933/
8. https://www.cell.com/patterns/fulltext/S2666-3899(23)00130-7
9. https://edscoop.com/ai-detectors-are-easily-fooled-researchers-find/
10. https://www.turnitin.com/blog/ai-writing-detection-update-from-turnitins-chief-product-officer
11. https://gradpilot.com/news/turnitin-failed-admissions-pangram-replacement
12. https://www.washingtonpost.com/technology/2023/04/01/chatgpt-cheating-detection-turnitin/
13. https://walterwrites.ai/are-ai-detectors-accurate/
14. https://www.webfx.com/blog/ai/dangers-ai-content/
15. https://www.tripledart.com/ai-seo/is-ai-content-bad-for-seo
16. https://seranking.com/blog/ai-content-experiment/
17. https://www.webfx.com/blog/ai/dangers-ai-content/
18. https://blog.google/products-and-platforms/products/search/google-search-update-march-2024/
19. https://www.searchenginejournal.com/googles-march-2024-core-update-impact-hundreds-of-websites-deindexed/510981/
20. https://static.googleusercontent.com/media/guidelines.raterhub.com/en//searchqualityevaluatorguidelines.pdf
21. https://developers.google.com/search/docs/fundamentals/creating-helpful-content



