Baidu Baike Takes Action Against Googlebot and Bingbot in AI Data Scraping

Friday, 23 August 2024, 23:00

Baidu Baike has blocked Googlebot and Bingbot from indexing its content, a notable move amid surging demand for data to train generative AI. The block reflects Baidu's efforts to protect its online assets and follows earlier restrictions on search engine indexing by platforms such as Reddit.
South China Morning Post
Baidu Baike's Block on Google and Bing

Chinese internet search giant Baidu has updated the robots.txt file for its Baidu Baike service so that it now blocks Googlebot and Bingbot from indexing the service's content. The decision reflects Baidu's aim of tightening control over its valuable data assets as demand for AI datasets has surged.

Details of the Implementation

  • The change occurred on August 8, according to Wayback Machine records.
  • Previously, Baidu Baike allowed partial indexing; it has now imposed broader restrictions.
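
As a rough illustration of how a robots.txt rule set of this kind is interpreted, the sketch below uses Python's standard urllib.robotparser module. The robots.txt text, the ExampleBot name, and the sample URL path are hypothetical and chosen only for demonstration; they are not copied from Baidu Baike's actual file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content for illustration only.
# This is NOT Baidu Baike's actual file.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /

User-agent: Bingbot
Disallow: /

User-agent: *
Allow: /
"""


def crawl_permissions(robots_txt, agents, url):
    """Return a dict mapping each user agent to whether it may fetch url."""
    parser = RobotFileParser()
    # parse() accepts an iterable of lines, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}


# The path below is a made-up example entry.
permissions = crawl_permissions(
    ROBOTS_TXT,
    ["Googlebot", "Bingbot", "ExampleBot"],
    "https://baike.baidu.com/item/example",
)

for agent, allowed in permissions.items():
    print(f"{agent}: {'allowed' if allowed else 'blocked'}")
```

Note that robots.txt is a request that well-behaved crawlers honor voluntarily; it does not technically prevent access.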

Implications for AI and Data Sharing

The shift follows a similar move by Reddit, which restricted search engine indexing of its content, with an exception for Google under their partnership. Microsoft has also been vocal about safeguarding its data, threatening to withdraw access if competitors misuse it.

Comparisons with Other Platforms

By contrast, the Chinese-language version of Wikipedia keeps its entries open to search crawlers, underscoring the difference in strategy. Despite the block, older Baidu Baike entries may still be found through cached versions on US search engines.

Market Consequences

The current landscape points to a push by digital platforms to secure their data as generative AI continues to attract interest and investment. These shifts may have lasting effects on how AI developers access reliable content.



