I’ve been thinking about something that’s been bothering me. Google’s AI models seem to have improved rapidly, and I’m curious about where they get their training data from.
I noticed that Google provides a lot more storage in their AI subscription than other companies. They offer 2TB of Drive space for $20 a month with Gemini, which seems far more generous than what others bundle with their plans.
This makes me wonder if they are using our personal files to enhance their AI. I checked their privacy policy and saw that they process our content for things like spam filtering and searching for files.
But I’m confused: does this content processing mean they use our documents for training their AI models? The language is quite vague and doesn’t clearly say whether they do or not.
Has anyone else considered this? Should I reach out to Google directly to inquire? I’m not sure if I’d get a straightforward answer.
I totally get your concern! It does raise eyebrows, right? That extra storage could be a strategy to gather data subtly. Checking your settings is a good call. Just be cautious with what you share online!
From what I understand about Google’s current policies, they explicitly state that personal files in Google Drive are not used for training their AI models. The content processing mentioned in their privacy policy refers to operational functions like organizing your files, providing search functionality, and security scanning, not AI training data collection. The generous storage offering with Gemini is more likely a competitive strategy to attract users to their AI ecosystem than a data harvesting scheme. Other cloud providers have faced similar scrutiny and generally keep personal storage separate from training datasets due to legal and privacy regulations. That said, privacy policies can change, so it’s worth periodically reviewing Google’s terms of service. If you’re still concerned, you could consider using local storage for highly sensitive documents while keeping less critical files in the cloud.
This question comes up frequently in tech circles, and honestly the storage comparison isn’t necessarily indicative of data harvesting. Google’s business model has always been about scale and ecosystem lock-in rather than direct monetization of personal files. The 2TB offering is competitive positioning against Microsoft and others who are also pushing AI-bundled subscriptions. What’s worth noting is that most major tech companies face significant regulatory pressure around data usage transparency, especially after GDPR and similar legislation. The vague language you mentioned is typical legal speak to cover operational necessities, but actual AI training on personal content would likely require explicit consent due to current privacy laws. That said, I’d recommend checking your Google account’s data and privacy settings periodically, as these policies do evolve. The distinction between content processing for service functionality versus training data is crucial, and reputable companies generally maintain that separation to avoid legal complications.