Ben Welsh has a running list of the news organizations blocking OpenAI crawlers:
In total, 532 of 1,147 news publishers surveyed by the homepages.news archive have instructed OpenAI, Google AI or the non-profit Common Crawl to stop scanning their sites, which amounts to 46.4% of the sample.
The three organizations systematically crawl web sites to gather the information that fuels generative chatbots like OpenAI’s ChatGPT and Google’s Bard. Publishers can request that their content be excluded by opting out via the robots.txt convention.
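As a rough illustration of how that opt-out works, here is a minimal sketch that checks which AI crawlers a robots.txt file turns away, using Python's standard `urllib.robotparser`. The user-agent tokens `GPTBot`, `Google-Extended` and `CCBot` are the ones these three organizations publish for their crawlers; the quoted survey does not name them, so treat the list as an assumption. The sample file and the `blocked_agents` helper are hypothetical and only there for demonstration.

```python
from urllib.robotparser import RobotFileParser

# Assumed user-agent tokens for OpenAI, Google AI and Common Crawl.
AI_AGENTS = ["GPTBot", "Google-Extended", "CCBot"]

# Hypothetical robots.txt: blocks two AI crawlers, allows everyone else.
SAMPLE_ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: *
Allow: /
"""


def blocked_agents(robots_txt: str, url: str = "https://example.com/") -> list[str]:
    """Return the AI crawler tokens that may not fetch `url`."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return [agent for agent in AI_AGENTS if not parser.can_fetch(agent, url)]


print(blocked_agents(SAMPLE_ROBOTS_TXT))  # → ['GPTBot', 'CCBot']
```

Note that robots.txt is only a request: it works as long as the crawler chooses to honour it.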
This reduces the value of AIs. The web used to be open to all, with information you could use as you liked. News organisations often fail to see the value in AI and fear that it will take their jobs rather than enhance them. So they try to wreck the AIs, a bit like saboteurs and Luddites. A real impediment to growth.