Stanford Study: 35% of New Websites Created by AI by Mid-2025

April 30, 2026

Researchers at Stanford University found that approximately 35% of new websites were fully or partially generated by AI by mid-2025. The study also found that AI-generated pages were 33% more similar to one another and 107% more positive in tone than human-written content.

CoinJP Editorial

Over a Third of New Websites Now AI-Generated

Researchers at Stanford University have determined that roughly 35% of new websites were created entirely or substantially with the help of artificial intelligence by mid-2025. Before the public launch of OpenAI's ChatGPT in November 2022, the figure hovered near zero. In just under three years, AI-generated content has grown to account for more than a third of fresh web publications.

Figure: Share of websites fully generated by AI (red) and created with neural network assistance (purple) over time. Source: GitHub

The team analyzed 33 months of archived web snapshots from the Wayback Machine using the Pangram v3 detector. Their goal was to assess how the surge in AI-written text is reshaping the structure of the World Wide Web.
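The aggregation behind a headline share like this can be sketched in a few lines. This is a minimal illustration, not the study's pipeline: it assumes each archived page already carries a boolean verdict from some detector, the `monthly_ai_share` helper is hypothetical, and the sample labels are made up.

```python
from collections import defaultdict

def monthly_ai_share(records):
    """Return {month: fraction of pages labeled AI-generated}.

    `records` is an iterable of (month, is_ai) pairs, where `is_ai` is the
    boolean verdict some detector assigned to one archived page.
    """
    totals = defaultdict(int)
    flagged = defaultdict(int)
    for month, is_ai in records:
        totals[month] += 1
        if is_ai:
            flagged[month] += 1
    return {month: flagged[month] / totals[month] for month in sorted(totals)}

# Made-up labels for illustration only; not data from the study.
sample = [
    ("2022-10", False), ("2022-10", False), ("2022-10", False), ("2022-10", False),
    ("2025-06", True), ("2025-06", False), ("2025-06", False), ("2025-06", True),
]
shares = monthly_ai_share(sample)
print(shares)
```

Any real measurement also depends on the detector's error rates and on how the snapshots are sampled, which this sketch ignores.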

Why This Matters

AI content at this scale affects the whole information ecosystem. When more than a third of new publications are machine-generated, the mix of knowledge and opinion that readers encounter shifts with it. This carries particular weight for the crypto market and the broader tech industry: a growing share of project analyses, token reviews, and market commentary may already be machine-generated, which raises questions about the quality of the information driving investment decisions.

Content Is Becoming More Homogeneous and Cheerful

One of the study's central findings is a measurable decline in semantic diversity: AI-generated pages are 33% more similar to one another than human-written pages are. Different websites increasingly restate the same ideas in nearly identical phrasing.
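One way to picture "more similar to one another" is the average similarity across all pairs of texts in a collection. The sketch below uses plain word-count vectors and cosine similarity as a stand-in. The article does not describe the study's actual metric, which may well rely on learned embeddings, so the function names and the snippets here are illustrative assumptions.

```python
import math
from collections import Counter
from itertools import combinations

def cosine_similarity(a, b):
    """Cosine similarity between two word-count vectors (Counters)."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def mean_pairwise_similarity(texts):
    """Average cosine similarity over every pair of texts (needs at least two)."""
    vectors = [Counter(t.lower().split()) for t in texts]
    pairs = list(combinations(vectors, 2))
    return sum(cosine_similarity(a, b) for a, b in pairs) / len(pairs)

# Made-up snippets for illustration only.
templated = [
    "this token offers exciting growth potential",
    "this token offers exciting market potential",
    "this coin offers exciting growth potential",
]
varied = [
    "fees spiked after the fork",
    "the audit found a reentrancy bug",
    "liquidity dried up over the weekend",
]
print(mean_pairwise_similarity(templated), mean_pairwise_similarity(varied))
```

A higher mean for one collection than for another indicates lower diversity in that collection, which is the kind of comparison the 33% figure summarizes.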

According to the authors, the root cause lies in the architecture of large language models (LLMs) themselves. These systems inherently gravitate toward statistically probable — and therefore "averaged" — responses. The result is templated discourse where the range of unique formulations and unconventional ideas steadily narrows.

The study also documented a shift in emotional tone: AI-generated content scored 107% more positive than human writing, or slightly more than twice as positive. The Stanford team attributed this to the well-documented tendency of LLMs toward sycophancy. During training, developers optimize models to produce pleasant, safe, and socially approved outputs. As a result, a significant share of new websites forms a "sterile-friendly" information environment, one with fewer sharp opinions and less genuine debate.

Which Fears Were Not Confirmed

Several widespread concerns failed to find statistical support. The researchers found no significant correlation between the growth of AI content and:

a decline in factual accuracy;
an increase in outright errors;
stylistic flattening of all content into a single template.

Figure: Left, correlation between AI content volume and the tested hypotheses. Right, share of American adults who agree with each hypothesis. Source: GitHub

The Model Collapse Threat

The researchers specifically highlighted the effect known as model collapse — a phenomenon that until recently remained largely theoretical. The mechanism works as follows: when new generations of neural networks are trained on data saturated with AI content, they begin recycling their own averaged outputs. This erodes variability and quality, and threatens a future where LLMs learn not from humans but from the "synthetic echo" of their predecessors.
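The recycling mechanism can be illustrated with a deliberately simple toy that has nothing to do with real LLM training: a one-dimensional Gaussian "model" that is refit, generation after generation, only to samples drawn from its predecessor. The function name, sample size, and generation count are assumptions chosen for illustration.

```python
import random
import statistics

def simulate_collapse(generations=200, sample_size=10, seed=0):
    """Track the spread of a toy 'model' retrained on its own samples.

    Each generation draws `sample_size` values from the current Gaussian,
    then refits the mean and standard deviation to those samples only.
    Returns the list of standard deviations, starting with the original 1.0.
    """
    rng = random.Random(seed)
    mean, std = 0.0, 1.0  # generation 0 stands in for human-written data
    history = [std]
    for _ in range(generations):
        samples = [rng.gauss(mean, std) for _ in range(sample_size)]
        mean = statistics.fmean(samples)
        std = statistics.pstdev(samples)  # maximum-likelihood estimate
        history.append(std)
    return history

history = simulate_collapse()
print(f"std at generation 0: {history[0]:.3f}")
print(f"std at generation {len(history) - 1}: {history[-1]:.3e}")
```

Because each refit sees only a small sample of the previous generation's output, the estimated spread tends to shrink over many generations. That is a rough analogue of the loss of variability described above, under far simpler assumptions than any production training setup.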

The research team, in collaboration with the Internet Archive, plans to convert the study into a continuous monitoring system tracking the share of AI-generated content across the web.

In mid-April, the same Stanford group reported that AI development was outpacing expectations, with neural networks nearly matching human performance on computer-based tasks.

Frequently Asked Questions

What percentage of new websites are created by AI?

According to Stanford University researchers, approximately 35% of new websites were fully or partially created using AI by mid-2025. This figure was near zero before ChatGPT's public launch in November 2022.

How does AI-generated content differ from human-written content?

The Stanford study found that AI-generated pages are 33% more similar to each other than human-written texts, indicating reduced semantic diversity. AI content was also 107% more positive in tone, attributed to LLMs being optimized for pleasant and socially approved responses.

What is model collapse in AI?

Model collapse occurs when new AI models are trained on data that contains a high proportion of AI-generated content. The system begins recycling its own averaged outputs, leading to reduced variability and quality — effectively learning from 'synthetic echoes' rather than human-created content.

Does AI content lead to more factual errors on the web?

The Stanford researchers found no significant correlation between the growth of AI content and a decline in factual accuracy or an increase in errors. These commonly held fears were not supported by statistical evidence in the study.

How was the Stanford AI content study conducted?

The research team analyzed 33 months of archived website snapshots from the Wayback Machine using the Pangram v3 AI detector. The study aimed to measure how the proliferation of AI-written text is transforming the structure of the web.
