Publishers say no to AI scrapers, block bots at server level

Original Article Summary
The open web is closing down to unwanted automated traffic. A growing number of websites are taking steps to ban AI bot traffic so that their work isn't used as training data and their servers aren't overwhelmed by non-human users. However, some companies are…
Read full article at Theregister.com

Our Analysis
Publishers' decision to block AI scrapers at the server level, both to keep their work out of training datasets and to protect their servers from overload, marks a significant shift in how websites manage automated traffic. Companies that rely on open-web data for their AI models will need to find alternative sources of training data or risk being cut off by a growing number of publishers, and operators of AI scrapers should reassess their data-collection strategies accordingly. To adapt, site operators on both sides of this shift can take several steps:

- Publishers should review their robots.txt directives (and any llms.txt policy files) to confirm they block the crawlers they intend to without inadvertently shutting out legitimate traffic.
- Publishers can implement more selective, server-level bot blocking that admits desired AI traffic while keeping out unwanted scrapers, as sketched below.
- Scraper operators should explore alternative data sources, such as licensing partnerships with publishers or paid data providers, to reduce their reliance on scraped content.
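For readers weighing the second step, here is a minimal sketch of what layered blocking can look like. The robots.txt directives opt out of training crawlers that honor them; the nginx snippet (assuming an nginx-based stack) enforces the policy against bots that ignore robots.txt. The user-agent tokens shown (GPTBot, ClaudeBot, CCBot, Google-Extended) are in real use today, but the list changes often, so treat it as illustrative rather than exhaustive.

```
# robots.txt — opt out of AI training crawlers
# (honored only by compliant bots)
User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: Google-Extended
Disallow: /
```

```
# nginx (http context) — hard-block AI crawlers that ignore robots.txt
# The user-agent list is illustrative; maintain your own.
map $http_user_agent $is_ai_bot {
    default      0;
    ~*GPTBot     1;
    ~*ClaudeBot  1;
    ~*CCBot      1;
}

server {
    listen 80;
    server_name example.com;

    # Refuse matched bots before they reach any content
    if ($is_ai_bot) {
        return 403;
    }
}
```

The distinction between the two layers is the point of the article: robots.txt is a request that well-behaved crawlers respect, while a server-level 403 is an enforcement that applies regardless of whether the bot chooses to comply.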