Major Brands Enact Blocking Measures Against OpenAI’s GPTBot Web Crawler
On 7 August, OpenAI introduced a new web crawler, GPTBot, which gathers web content to feed its AI products, for example ChatGPT, which gives AI-generated responses to prompts (or queries). However, many major brands are moving to block GPTBot.
Robots.txt can be used to prevent GPTBot from accessing all or a portion of your website.
To block GPTBot from your entire site, add the following to your site’s robots.txt:
User-agent: GPTBot
Disallow: /
You can also use the GPTBot token in your robots.txt file to give GPTBot access to only specific parts of your site:
User-agent: GPTBot
Allow: /directory-1/
Disallow: /directory-2/
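Before publishing either rule set, it can help to confirm that it behaves as intended. The sketch below feeds the two robots.txt examples above to Python's standard urllib.robotparser module. The helper name can_crawl and the example.com URLs are placeholders for illustration, and Python's parser is only one interpretation of the rules, so a real crawler may resolve overlapping rules differently.

```python
from urllib.robotparser import RobotFileParser


def can_crawl(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Rule set that blocks GPTBot from the whole site.
BLOCK_ALL = """\
User-agent: GPTBot
Disallow: /
"""

# Rule set that opens one directory and closes another.
PARTIAL = """\
User-agent: GPTBot
Allow: /directory-1/
Disallow: /directory-2/
"""

# example.com is a placeholder domain, not a real target.
print(can_crawl(BLOCK_ALL, "GPTBot", "https://example.com/any-page"))          # False
print(can_crawl(BLOCK_ALL, "Googlebot", "https://example.com/any-page"))       # True
print(can_crawl(PARTIAL, "GPTBot", "https://example.com/directory-1/a.html"))  # True
print(can_crawl(PARTIAL, "GPTBot", "https://example.com/directory-2/a.html"))  # False
```

Note that the first rule set only affects GPTBot: a crawler with no matching User-agent group, such as Googlebot here, is still allowed.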
According to a recent analysis, GPTBot is being blocked by at least 15% of the top 100 websites and 7% of the top 1,000 websites, and the share is rising week by week.
To Block or Not to Block ChatGPT
Many SEOs are unsure whether they should block ChatGPT’s crawler. A number of well-known websites have already blocked GPTBot, most likely because they don’t want OpenAI to scrape their data to train its models, at least not without payment. Furthermore, ChatGPT does not cite its sources.
Amazon, Quora, NYTimes, Shutterstock, CNN, and WikiHow are among the highly popular websites that have blocked GPTBot from accessing their data.
Many of the websites that block GPTBot do not block CCBot, Common Crawl’s web crawler, although some sites do.
Common Crawl provides some of the training data utilized by OpenAI, Google, and other companies.
The New York Times, which evidently does not want its content used to train AI systems, is one of a few notable exceptions that block both bots. Other popular websites that block both GPTBot and CCBot include shutterstock.com, reuters.com, and goodhousekeeping.com.
According to one report, CCBot is blocked by at least 62 of the top 1,000 websites.
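To see which of the two crawlers a given robots.txt file shuts out, a small check along the following lines may be useful. It is a rough sketch, not the method behind the figures above: the function name blocked_crawlers and the sample file are made up for illustration, and a bot counts as "blocked" here only if it may not fetch the site root.

```python
from urllib.robotparser import RobotFileParser

# Crawler tokens discussed in this article.
AI_CRAWLERS = ("GPTBot", "CCBot")


def blocked_crawlers(robots_txt: str, crawlers=AI_CRAWLERS) -> list:
    """Return the crawlers that the robots.txt text bars from the site root."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [bot for bot in crawlers if not parser.can_fetch(bot, "/")]


# Hypothetical robots.txt for a site that blocks both bots.
BLOCKS_BOTH = """\
User-agent: GPTBot
Disallow: /

User-agent: CCBot
Disallow: /
"""

# Hypothetical robots.txt for a site that blocks only GPTBot.
BLOCKS_GPTBOT_ONLY = """\
User-agent: GPTBot
Disallow: /
"""

print(blocked_crawlers(BLOCKS_BOTH))         # ['GPTBot', 'CCBot']
print(blocked_crawlers(BLOCKS_GPTBOT_ONLY))  # ['GPTBot']
```

To check a live site rather than pasted text, RobotFileParser can also download the file itself through its set_url() and read() methods.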