I’m building a Java desktop app that needs to pull down public files from Google Drive programmatically. I discovered that I can use the webContentLink property to grab public files without making users log in.
This approach works great for smaller files:
String downloadUrl = publicFile.getWebContentLink();
InputStream fileStream = new URL(downloadUrl).openStream();
However, I’m running into issues with larger files. When the file size is big, Google Drive shows a virus scan confirmation page instead of serving the file directly through the webContentLink. This breaks my automated download process.
Is there a way to programmatically download large public files from Google Drive without requiring user interaction or authentication? I need this to work completely in the background without any manual steps.
Same headache here! Wrote a backup utility last year and webContentLink crapped out on files around 40MB - Google’s virus scan page kills it every time. Switched to the uc download endpoint and it’s been solid since. Use https://drive.google.com/uc?id=FILE_ID&export=download instead. Small files get a 302 redirect straight to the download. Large files return an HTML page with a confirmation link - just parse the confirm code out of that page and request the URL again with confirm=CODE added. You’ll usually find the code in a form element or a download link on the page. It has worked for me on files up to several gigs.
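A minimal sketch of that two-request flow, in case it helps. The class name, the method names, and both regular expressions are hypothetical, and the layout of the confirmation page is an assumption (Google has changed it before), so treat the patterns as a starting point. It also assumes a real download response carries a Content-Disposition header while the scan page does not. Needs Java 11+ for java.net.http.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.CookieManager;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DriveConfirmDownloader {
    // Assumption: the token shows up either as confirm=TOKEN inside a link
    // (possibly HTML-escaped as &amp;confirm=TOKEN) or as a hidden form input.
    private static final Pattern LINK_TOKEN =
            Pattern.compile("[?&;]confirm=([0-9A-Za-z_-]+)");
    private static final Pattern FORM_TOKEN =
            Pattern.compile("name=\"confirm\"\\s+value=\"([0-9A-Za-z_-]+)\"");

    /** Returns the confirm token found in the page, if any. */
    static Optional<String> extractConfirmToken(String html) {
        for (Pattern pattern : new Pattern[] {LINK_TOKEN, FORM_TOKEN}) {
            Matcher matcher = pattern.matcher(html);
            if (matcher.find()) {
                return Optional.of(matcher.group(1));
            }
        }
        return Optional.empty();
    }

    /** Opens the file stream, retrying once with a confirm token if needed. */
    static InputStream open(String fileId) throws IOException, InterruptedException {
        HttpClient client = HttpClient.newBuilder()
                .followRedirects(HttpClient.Redirect.NORMAL) // follow the 302 for small files
                .cookieHandler(new CookieManager())          // assumption: cookies may matter on the retry
                .build();
        String baseUrl = "https://drive.google.com/uc?id=" + fileId + "&export=download";

        HttpResponse<InputStream> first = client.send(
                HttpRequest.newBuilder(URI.create(baseUrl)).build(),
                HttpResponse.BodyHandlers.ofInputStream());

        // Assumption: only the actual file response has Content-Disposition.
        if (first.headers().firstValue("Content-Disposition").isPresent()) {
            return first.body();
        }

        // Otherwise treat the body as the confirmation page and parse it.
        String html;
        try (InputStream in = first.body()) {
            html = new String(in.readAllBytes(), StandardCharsets.UTF_8);
        }
        String token = extractConfirmToken(html)
                .orElseThrow(() -> new IOException("No confirm token found in response"));

        HttpResponse<InputStream> second = client.send(
                HttpRequest.newBuilder(URI.create(baseUrl + "&confirm=" + token)).build(),
                HttpResponse.BodyHandlers.ofInputStream());
        return second.body();
    }
}
```

Call DriveConfirmDownloader.open(fileId) and copy the returned stream wherever you need it. Keeping the parsing in extractConfirmToken means you only have to touch the two patterns if the page layout changes.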
You can skip the virus scan warning by adding a confirm parameter to the download URL. The quick version: stick &confirm=t onto your original webContentLink. Here’s what works better though - build the direct download URL with the file ID: https://drive.google.com/uc?export=download&id=FILE_ID&confirm=t. Just swap FILE_ID with your actual file ID from the API response. I’ve used this for 500MB+ files in production without issues. If confirm=t alone still lands you on the scan page, the trick is catching that response and pulling out any confirmation token Google sends in the headers, then repeating the request with confirm set to that token. So you might need two requests - the first one gets the token, the second one uses it.
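Here is a rough sketch of the single-request variant with confirm=t baked into the URL. DriveDirectUrl, buildDownloadUrl and downloadTo are hypothetical names. Whether confirm=t is accepted for every file is an assumption based on the experience described above, not something Google documents. Needs Java 11+.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

class DriveDirectUrl {
    /** Builds the direct download URL with the confirm=t parameter appended. */
    static String buildDownloadUrl(String fileId) {
        return "https://drive.google.com/uc?export=download&id="
                + URLEncoder.encode(fileId, StandardCharsets.UTF_8)
                + "&confirm=t";
    }

    /** Streams the file to disk and returns the number of bytes written. */
    static long downloadTo(String fileId, Path target)
            throws IOException, InterruptedException {
        HttpClient client = HttpClient.newBuilder()
                .followRedirects(HttpClient.Redirect.NORMAL)
                .build();
        HttpResponse<InputStream> response = client.send(
                HttpRequest.newBuilder(URI.create(buildDownloadUrl(fileId))).GET().build(),
                HttpResponse.BodyHandlers.ofInputStream());

        try (InputStream in = response.body()) {
            if (response.statusCode() != 200) {
                throw new IOException("Unexpected HTTP status " + response.statusCode());
            }
            // Stream straight to disk so large files never sit in memory.
            return Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }
}
```

For example, DriveDirectUrl.downloadTo(fileId, Path.of("backup.zip")). It is worth checking the size or the first bytes of the result, since a scan page saved to disk also comes back with status 200.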
Had the same issue with my sync tool. The webContentLink throws you to that annoying scan page for files over ~25MB. Fixed it by using the files.get endpoint with alt=media, which works on public files without user auth. Just build the URL like this: https://www.googleapis.com/drive/v3/files/FILE_ID?alt=media&key=YOUR_API_KEY. You’ll need a free API key, but you can skip all the OAuth stuff.
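A small sketch of the API-key route, assuming the file is shared publicly and the key has the Drive API enabled. DriveApiKeyDownload, buildMediaUrl and download are made-up names for illustration. Needs Java 11+.

```java
import java.io.IOException;
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class DriveApiKeyDownload {
    /** Builds the files.get URL with alt=media and the API key. */
    static String buildMediaUrl(String fileId, String apiKey) {
        // Drive file IDs are normally URL-safe; encoding is a precaution only.
        return "https://www.googleapis.com/drive/v3/files/"
                + URLEncoder.encode(fileId, StandardCharsets.UTF_8)
                + "?alt=media&key="
                + URLEncoder.encode(apiKey, StandardCharsets.UTF_8);
    }

    /** Downloads the file content to the target path. */
    static Path download(String fileId, String apiKey, Path target)
            throws IOException, InterruptedException {
        HttpClient client = HttpClient.newBuilder()
                .followRedirects(HttpClient.Redirect.NORMAL)
                .build();
        HttpResponse<Path> response = client.send(
                HttpRequest.newBuilder(URI.create(buildMediaUrl(fileId, apiKey))).GET().build(),
                HttpResponse.BodyHandlers.ofFile(target));

        if (response.statusCode() != 200) {
            // On failure the body is an error payload, not the file, so discard it.
            Files.deleteIfExists(target);
            throw new IOException("Drive API returned HTTP " + response.statusCode());
        }
        return response.body();
    }
}
```

Something like DriveApiKeyDownload.download(fileId, apiKey, Path.of("data.bin")) should do it. A 403 or 404 here usually points at the sharing settings or the key rather than the file size.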