I’m trying to use the Python GData API to upload files to Google Docs. I’ve got OAuth working and can upload most file types like DOC and XLS. But I’m having trouble with PDFs. Even though they’re supposed to be supported, I can’t get them to upload. I’m using the latest GData version but no luck.
I also want to upload PPTX files, but I know they’re not officially supported.
Has anyone managed to upload PDFs to Google Docs with the Python GData API? Any tips or tricks would be really helpful. I’ve been stuck on this for a while and can’t figure out what I’m doing wrong.
Here’s a simple example of what I’ve tried:
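import os
import gdata.data
import gdata.docs.data

# `client` is an already-authorized Docs client, created elsewhere.
def upload_file(file_path):
    # Describe the new document; the title is taken from the file name.
    doc = gdata.docs.data.Resource(type='file', title=os.path.basename(file_path))
    media = gdata.data.MediaSource()
    media.set_file_handle(file_path, 'application/pdf')
    try:
        uploaded = client.Upload(doc, media)
        print('Uploaded: {}'.format(uploaded.title.text))
    except Exception as e:
        print('Upload failed: {}'.format(e))

upload_file('my_document.pdf')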
Any ideas what could be going wrong?
I’ve been down this road before, and it can be frustrating. One thing that worked for me with PDFs was adjusting the chunk size when uploading. Sometimes larger files need to be sent in smaller pieces. Try setting a smaller chunk size, like 262144 bytes, and see if that helps.
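To make the chunking idea concrete, here is a minimal sketch of splitting a file into 262144-byte pieces. It is plain Python and does not call GData at all; `iter_chunks` and `CHUNK_SIZE` are names I made up for illustration, and how you hand the chunk size to your uploader depends on the GData version you have.

```python
import io

CHUNK_SIZE = 262144  # 256 KiB, the size suggested above


def iter_chunks(stream, chunk_size=CHUNK_SIZE):
    """Yield successive byte chunks read from a binary stream."""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        yield chunk


# Example with an in-memory stream standing in for an open PDF file.
sample = io.BytesIO(b"x" * (CHUNK_SIZE * 2 + 10))
chunk_sizes = [len(chunk) for chunk in iter_chunks(sample)]
```

With a real file you would pass `open(path, 'rb')` instead of the `BytesIO` object.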
For PPTX files, I ended up using a hybrid approach. I’d upload them as-is to Google Drive first using their API, then use GData to import them into Docs. It’s not elegant, but it got the job done.
Also, double-check your OAuth scopes. Make sure you have the correct permissions for file creation and upload. I once spent hours debugging only to realize I was missing a crucial scope.
Lastly, consider logging the full response from the API when uploads fail. Sometimes there’s valuable info in there that doesn’t make it to the exception message.
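A rough sketch of what that logging could look like. It assumes the exception may carry `status`, `reason`, `body` and `headers` attributes, which is what I remember GData's request errors exposing; the helper reads them defensively so it still works if they are missing. `describe_error` and `log_upload_failure` are hypothetical names.

```python
import logging

logger = logging.getLogger("docs_upload")


def describe_error(exc):
    """Collect whatever HTTP details the exception exposes."""
    details = {"type": type(exc).__name__, "message": str(exc)}
    # These attribute names are an assumption; missing ones are skipped.
    for attr in ("status", "reason", "body", "headers"):
        value = getattr(exc, attr, None)
        if value is not None:
            details[attr] = value
    return details


def log_upload_failure(exc):
    """Log the full error details instead of only the message."""
    logger.error("Upload failed: %r", describe_error(exc))
```

You would call `log_upload_failure(e)` inside the `except` block of the upload function.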
I’ve encountered similar issues with PDF uploads using the GData API. One workaround I found effective was to use the ‘application/octet-stream’ MIME type instead of ‘application/pdf’. This seems to bypass some of the API’s file type restrictions.
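If you want to switch between the two MIME types without editing the upload code each time, something like the following might help. It only uses the standard `mimetypes` module; `pick_content_type` and its `force_generic` flag are names I invented for this sketch.

```python
import mimetypes

GENERIC_TYPE = "application/octet-stream"


def pick_content_type(file_path, force_generic=False):
    """Return the guessed MIME type, or the generic one as a fallback."""
    guessed, _encoding = mimetypes.guess_type(file_path)
    if force_generic or guessed is None:
        return GENERIC_TYPE
    return guessed
```

The result can then be passed as the second argument to `media.set_file_handle(...)` in place of the hard-coded `'application/pdf'`.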
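For PPTX files, you might try converting them to PDF first, then uploading the resulting PDF. Note that python-pptx can read and edit PPTX files but does not render them to PDF, so you would need a separate converter such as LibreOffice in headless mode. It’s not ideal, but it can work in a pinch.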
Another option worth exploring is switching to the newer Google Drive API. It offers more robust file handling and supports a wider range of formats. The migration might take some effort, but it could solve your upload issues in the long run.
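For reference, a minimal sketch of an upload with the Drive API v3 through the google-api-python-client package. It assumes you already have a `service` object built with `googleapiclient.discovery.build('drive', 'v3', credentials=...)`; the function names `build_file_metadata` and `upload_to_drive` are mine, and I have not tested this against your account setup.

```python
import os


def build_file_metadata(file_path):
    """Drive v3 file metadata; 'name' is the title shown in Drive."""
    return {"name": os.path.basename(file_path)}


def upload_to_drive(service, file_path, mime_type="application/pdf"):
    """Upload a local file and return the new file's Drive ID."""
    # Imported here so the sketch loads even without the package installed.
    from googleapiclient.http import MediaFileUpload

    media = MediaFileUpload(file_path, mimetype=mime_type, resumable=True)
    request = service.files().create(
        body=build_file_metadata(file_path),
        media_body=media,
        fields="id",
    )
    return request.execute()["id"]
```

Calling `upload_to_drive(service, 'my_document.pdf')` would be the rough equivalent of the GData example in the question.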
Remember to check your file sizes too. There are upload limits that can cause silent failures if exceeded.
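A quick pre-flight check along these lines can turn a silent failure into an explicit one. The 10 MB value below is only a placeholder I picked for the sketch, not a documented limit, so substitute whatever limit applies to your file type; `check_size` is a hypothetical helper.

```python
import os

# Placeholder value for illustration only; look up the real limit.
MAX_UPLOAD_BYTES = 10 * 1024 * 1024


def check_size(file_path, limit=MAX_UPLOAD_BYTES):
    """Return the file size in bytes, or raise if it exceeds the limit."""
    size = os.path.getsize(file_path)
    if size > limit:
        raise ValueError(
            "{} is {} bytes, over the {}-byte limit".format(file_path, size, limit)
        )
    return size
```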
Hey there, I’ve had some luck with PDFs using the Drive API instead of GData. It’s a bit more modern and handles different file types better. For PPTX, you could try zipping them first before uploading. That sometimes works for unsupported formats. Just my 2 cents!
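In case it helps, zipping a single file takes only a few lines with the standard `zipfile` module. `zip_single_file` is just a name for this sketch, and whether the zipped file is accepted by the upload is something you would have to try.

```python
import os
import zipfile


def zip_single_file(file_path, zip_path=None):
    """Wrap one file in a .zip archive and return the archive's path."""
    if zip_path is None:
        zip_path = file_path + ".zip"
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        # Store only the file name, not the full directory path.
        archive.write(file_path, arcname=os.path.basename(file_path))
    return zip_path
```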