LLMS Central - The Robots.txt for AI
Web Crawling

Publishers say no to AI scrapers, block bots at server level

Theregister.com · 2 min read

Original Article Summary

The open web is closing down for unwanted automated traffic. A growing number of websites are taking steps to ban AI bot traffic so that their work isn't used as training data and their servers aren't overwhelmed by non-human users. However, some companies are…

Read full article at Theregister.com

Our Analysis

Publishers' decision to block AI scrapers at the server level, both to keep their work out of training datasets and to spare their servers the crawler load, marks a significant shift in how websites manage automated traffic. Anyone who relies on open web data to build AI models will feel the squeeze: as more publishers block by default, they will need alternative sources of training data or risk being cut off. Website owners who depend on scraped data should reassess their collection strategies now. Three steps help, illustrated in the sketches after this list:

- Review llms.txt files to make sure they do not inadvertently block legitimate traffic.
- Implement more selective bot-blocking that admits desired AI traffic while keeping out unwanted scrapers.
- Explore alternative data sources, such as partnerships with publishers or paid data providers, to reduce reliance on scraped data.
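For the selective-blocking step, a robots.txt can name individual crawlers instead of turning everything away. A minimal sketch using the published user-agent tokens of several well-known AI crawlers; which bots to allow or refuse is an assumed policy choice, not guidance from the article:

```
# Keep conventional search indexing (assumed to be desired traffic)
User-agent: Googlebot
Allow: /

# Opt out of Google's AI training without affecting search above
User-agent: Google-Extended
Disallow: /

# Refuse crawlers whose stated purpose is gathering training data
User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: CCBot
Disallow: /
```

Note that robots.txt is purely advisory: compliant crawlers honor it, but it cannot stop a bot that ignores it, which is exactly why the publishers in the article are blocking at the server level instead.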
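Blocking at the server level, as the headline describes, means refusing the request outright rather than asking nicely. A minimal nginx sketch, assuming the same user-agent tokens as above; the server name and paths are placeholders, and because User-Agent headers are trivially spoofed, real deployments typically layer this with IP-range or behavioral checks:

```
# In the http{} context: flag requests whose User-Agent matches a known AI crawler.
map $http_user_agent $is_ai_bot {
    default         0;
    ~*GPTBot        1;
    ~*ClaudeBot     1;
    ~*CCBot        1;
    ~*PerplexityBot 1;
}

server {
    listen 80;
    server_name example.com;   # hypothetical site

    # Refuse flagged crawlers before serving any content.
    if ($is_ai_bot) {
        return 403;
    }

    location / {
        root /var/www/html;    # assumed document root
    }
}
```

Returning 403 (or nginx's non-standard 444, which drops the connection) keeps crawler traffic from ever reaching the application, addressing both of the article's concerns: training-data reuse and server load.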

Related Topics

Bots
