Google has quietly updated its privacy policy to explain how it will use public data to help train its AI products. And it makes it clear that it will scrape data from any public-facing website to improve its AI.
The change is buried deep in the privacy policy.
“Google uses information to improve our services and to develop new products, features, and technologies that benefit our users and the public,” the first relevant passage notes. “For example, we use publicly available information to help train Google’s AI models and build products and features like Google Translate, Bard, and Cloud AI capabilities.”
That doesn’t sound like a privacy invasion to me, but in a later clause, the section on “publicly accessible sources” has been modified to account for AI data scraping as well.
“We may collect information that’s publicly available online or from other public sources to help train Google’s AI models and build products and features like Google Translate, Bard, and Cloud AI capabilities,” it reads. “Or, if your business’s information appears on a website, we may index and display it on Google services.”
I don’t want to be too alarmist about this. And to give Google some credit, it maintains a version of the privacy policy that calls out the changes made in the most recent revision, and most of the changes in the current version, dated July 1, are unrelated to AI. But it’s reasonable to view these changes in the context of Google’s business practices: this is a company that still makes almost 80 percent of its revenue by harvesting user data and selling it to advertisers. Unattributed data scraping is inarguably core to that business.