The Wikimedia Foundation, the nonprofit behind Wikipedia, issued a stark warning on Monday to artificial intelligence companies: without the continued contributions of human editors, the very systems that rely on the encyclopedia’s knowledge could face “model collapse.” The foundation said AI firms must stop scraping its content without attribution and should instead support its paid API, which provides sustainable access to the encyclopedia’s vast dataset.
Key Takeaways
- Generative AI systems heavily rely on Wikipedia’s human-curated content.
- Human page views on Wikipedia have declined 8% year-over-year while automated scraping is rising.
- Without fresh human input, AI systems may degrade in accuracy.
- Wikimedia urges AI developers to use its Enterprise API to ensure sustainability and reduce server strain.
Generative AI systems are trained on massive amounts of text, much of it sourced from Wikipedia. Unlike static training corpora, Wikipedia is continuously updated by volunteers who debate, verify, and document information across more than 300 languages.
This human oversight is what keeps the encyclopedia accurate and relevant. Without it, AI models risk recycling outdated or incorrect information, leading to degraded performance over time.
The Wikimedia Foundation reports that human page views have dropped by 8% year-over-year, while automated scraping by AI bots has surged. Many of these bots disguise themselves as human users, straining Wikipedia’s servers and bypassing its licensing framework.
The organization argues that this trend undermines the sustainability of the open web: platforms built on volunteer labor go uncompensated while commercial AI firms profit from their work.
Researchers have warned of “model collapse,” a scenario where AI systems trained primarily on synthetic data begin to lose accuracy and coherence. Wikipedia’s leadership points to this risk as evidence that human editors remain indispensable.
Machines can summarize and synthesize, but they cannot discover new facts or resolve disputes. Without fresh human input, AI risks becoming a closed loop of self-referential content.
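The closed-loop dynamic is easy to see in a toy simulation. The sketch below is a minimal illustration, not a claim about any production system: it fits a Gaussian to data, samples from the fit, drops the rarest samples to mimic a model under-representing low-probability events, and refits. The Gaussian setup and the trimming rate are illustrative assumptions.

```python
import random
import statistics

def toy_model_collapse(generations: int = 10, n: int = 1000, trim: float = 0.05) -> None:
    """Toy illustration of model collapse: each generation is fit only to
    synthetic samples from the previous generation's model, after the rarest
    tail events are dropped (a crude stand-in for a model under-sampling
    low-probability data). The estimated spread shrinks steadily."""
    mu, sigma = 0.0, 1.0  # generation 0 stands in for human-written data
    for gen in range(1, generations + 1):
        samples = sorted(random.gauss(mu, sigma) for _ in range(n))
        k = int(n * trim)
        kept = samples[k:n - k]        # tail information is lost each round
        mu = statistics.fmean(kept)    # refit the mean on synthetic data
        sigma = statistics.stdev(kept) # refit the spread on synthetic data
        print(f"generation {gen:2d}: mean={mu:+.3f}, stdev={sigma:.3f}")

toy_model_collapse()
```

In this setup each generation’s spread shrinks by a roughly constant factor, so after ten rounds the “model” retains only a narrow sliver of the original distribution, which is the intuition behind the warning.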
To address this, the foundation launched the Wikimedia Enterprise API, a paid service designed for large-scale access to content. The API provides structured, reliable data while reducing server strain. The foundation is urging AI companies to adopt this service, attribute Wikipedia content in their outputs, and contribute financially to the platform’s upkeep. So far, major firms like Google have signed agreements, but others continue to rely on scraping.
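For developers weighing a switch away from scraping, access works roughly like any authenticated REST service. The sketch below is illustrative only: the endpoint path, response shape, and the WME_ACCESS_TOKEN variable are assumptions for the sake of the example, not Wikimedia Enterprise’s documented contract, so consult the official documentation before relying on them.

```python
import os
import requests

# Assumed base URL and endpoint path; verify against the Wikimedia
# Enterprise docs, as the real contract may differ.
API_BASE = "https://api.enterprise.wikimedia.com/v2"
# Hypothetical env variable holding a bearer token from an Enterprise account.
TOKEN = os.environ["WME_ACCESS_TOKEN"]

def fetch_article(title: str) -> dict:
    """Fetch structured data for one article (illustrative sketch)."""
    resp = requests.get(
        f"{API_BASE}/articles/{title}",
        headers={"Authorization": f"Bearer {TOKEN}"},
        timeout=30,
    )
    resp.raise_for_status()  # surface auth or quota errors early
    return resp.json()

if __name__ == "__main__":
    print(fetch_article("Wikipedia"))
```

The practical difference from scraping is that requests are authenticated and metered, so the foundation can attribute load to paying customers instead of absorbing anonymous bot traffic.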

