The Catch-22 of AI Scraper Protection
If you are reading this, it is likely because the administrator of this website has taken steps to protect it against AI companies aggressively scraping its content. This is no trivial concern: such scraping can and does cause downtime, making a site's resources inaccessible to everyone.
The defense, dubbed Anubis, uses a proof-of-work scheme in the vein of Hashcash, a system once proposed for curbing email spam. For an individual visitor the extra computation is barely noticeable, but at the scale of a mass scraper the cost adds up and makes scraping far more expensive.
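To make the economics concrete, here is a minimal sketch of a Hashcash-style proof of work in TypeScript. It is a generic illustration, not Anubis's actual protocol: the challenge string, the nonce encoding, and the difficulty value are assumptions made for the example. The client grinds through nonces until the SHA-256 digest of the challenge plus the nonce starts with enough zero digits, while the server verifies the answer with a single hash.

```typescript
// Generic Hashcash-style proof-of-work sketch (illustrative only; the
// challenge format and difficulty below are assumptions, not Anubis's
// real protocol). Works in modern browsers, Deno, or Node 19+.

async function sha256Hex(input: string): Promise<string> {
  const data = new TextEncoder().encode(input);
  const digest = await crypto.subtle.digest("SHA-256", data);
  return Array.from(new Uint8Array(digest))
    .map((b) => b.toString(16).padStart(2, "0"))
    .join("");
}

// Client side: search for a nonce whose hash has `difficulty` leading
// zero hex digits (each digit is 4 zero bits of work).
async function solveChallenge(challenge: string, difficulty: number): Promise<number> {
  const prefix = "0".repeat(difficulty);
  for (let nonce = 0; ; nonce++) {
    const hash = await sha256Hex(`${challenge}:${nonce}`);
    if (hash.startsWith(prefix)) {
      return nonce;
    }
  }
}

// Server side: verifying a submitted nonce costs exactly one hash.
async function verify(challenge: string, difficulty: number, nonce: number): Promise<boolean> {
  const hash = await sha256Hex(`${challenge}:${nonce}`);
  return hash.startsWith("0".repeat(difficulty));
}

(async () => {
  const challenge = "example-challenge"; // placeholder; a real server would issue a random value
  const difficulty = 4;                  // 4 leading zero hex digits, roughly 16 bits of work
  const nonce = await solveChallenge(challenge, difficulty);
  console.log("nonce:", nonce, "valid:", await verify(challenge, difficulty, nonce));
})();
```

The asymmetry is the point: one visitor pays a fraction of a second of CPU time, a scraper issuing millions of requests pays millions of times that, and verification on the server stays essentially free.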
The real intention behind Anubis is not to stop scraping outright but to serve as a stopgap: it buys developers time to refine the challenge and to develop fingerprinting techniques reliable enough to tell automated clients apart from likely-legitimate visitors. The ultimate goal is to stop presenting the challenge to real users at all, restoring a seamless experience.
However, Anubis comes with a caveat: it depends on modern JavaScript features that privacy plugins such as JShelter disable. To pass the challenge, users must enable JavaScript and exempt the domain from those plugins. That is the catch-22: the very tools legitimate users run to protect themselves from hostile scripts are the ones that lock them out of the content.
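The dependency can at least be made visible up front. The sketch below assumes the challenge needs Web Crypto and Web Workers, which is a plausible but unconfirmed feature set; a page could probe for them and tell the visitor what is blocked instead of failing silently.

```typescript
// Illustrative feature check (the APIs listed are assumptions about what
// a proof-of-work challenge page typically needs, not a definitive list
// of what Anubis requires).
function missingChallengeFeatures(): string[] {
  const missing: string[] = [];
  if (typeof crypto === "undefined" || !crypto.subtle) {
    missing.push("Web Crypto (crypto.subtle)");
  }
  if (typeof Worker === "undefined") {
    missing.push("Web Workers");
  }
  return missing;
}

const missing = missingChallengeFeatures();
if (missing.length > 0) {
  // A blocked or stubbed API is the catch-22 described above: the privacy
  // plugin protects the user, but also locks them out of the site.
  console.warn(
    "Cannot run the anti-scraper challenge; please allow these features " +
      "for this domain: " + missing.join(", ")
  );
}
```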
The situation highlights how the social contract around website hosting has shifted with the rise of AI-driven scrapers. Developers are left balancing protection against usability, and a stopgap like Anubis is a reminder that this trade-off is far from settled.