Hi everyone, I’m trying to update Google Docs using the Drive API v2. Previously, I used the outdated Documents List API to export a document as HTML, edit it, and then reupload it as a new or modified document. This approach was great for creating PDFs from a template, but I haven’t been able to replicate it with the Drive API.
I have some code that fetches the HTML content of a document, makes some changes, and then reuploads it. However, it simply adds an HTML file to Drive rather than converting it into a proper Google Doc.
from io import StringIO
from googleapiclient.http import MediaIoBaseUpload

# create_drive_service() and download_content() are helpers defined elsewhere (not shown).
drive_service = create_drive_service()

# Fetch the file's metadata and read the HTML export link from it.
file_info = drive_service.files().get(fileId='YOUR_FILE_ID').execute()
html_url = file_info['exportLinks']['text/html']
response, html_data = download_content(html_url)

# Edit the exported HTML.
modified_html = html_data.replace('example_text', 'new_text')

# Upload the edited HTML as a new file. This is the step that leaves a plain HTML file in Drive.
new_file_metadata = {
    'title': 'Updated Document',
    'mimeType': 'text/html'
}
new_media = MediaIoBaseUpload(StringIO(modified_html), mimetype='text/html', resumable=False)
drive_service.files().insert(body=new_file_metadata, media_body=new_media).execute()
How can I ensure that the file is rendered as a Google Doc rather than being stored as an HTML file? Also, any advice on handling resumable uploads on App Engine would be much appreciated as I’m running into errors.
I’ve run into similar problems with the Drive API. What has worked best for me is combining it with the Docs API: the Drive API manages the document’s metadata, and the Docs API makes the content changes, which preserves the document’s formatting and structure. I couldn’t find a way to turn an HTML file directly into a Google Doc with the Drive API alone, so I settled on this two-step process. On App Engine, moving large uploads into task queues helped me avoid the file size and timeout limits.
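To make the Docs API half of this concrete, here is a minimal sketch of replacing template text in place with a `replaceAllText` request sent through `documents().batchUpdate`. It assumes `docs_service` was built elsewhere with the Docs API v1 client (for example `build('docs', 'v1', credentials=...)`); the helper names `build_replace_requests` and `apply_replacements` are made up for illustration and are not part of any library.

```python
def build_replace_requests(replacements):
    """Turn {'old text': 'new text'} pairs into replaceAllText requests."""
    return [
        {
            'replaceAllText': {
                'containsText': {'text': old, 'matchCase': True},
                'replaceText': new,
            }
        }
        for old, new in replacements.items()
    ]


def apply_replacements(docs_service, document_id, replacements):
    """Send every replacement to the document in one batchUpdate call.

    docs_service is assumed to be a Docs API v1 client created by the caller.
    """
    body = {'requests': build_replace_requests(replacements)}
    return docs_service.documents().batchUpdate(
        documentId=document_id, body=body
    ).execute()
```

Calling `apply_replacements(docs_service, 'YOUR_FILE_ID', {'example_text': 'new_text'})` would then do the same substitution as the HTML `replace` in the question, but on the document itself.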
Working with Google’s APIs, I’ve found that pairing the Drive API with the Docs API simplifies the process and gives more control over document updates. I use the Drive API to locate the document and then apply the modifications through the Docs API. This preserves the document’s formatting without a round trip through HTML. On App Engine, uploading to Google Cloud Storage first let me work around the size and timeout limits. This has been reliable in my projects.
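For the "locate the document" step, a rough sketch against Drive API v2 might look like the following. It assumes `drive_service` is a v2 client like the one in the question; `build_title_query`, `find_doc_id`, and `copy_template` are hypothetical helper names, and the copy step is only there because the original goal was working from a template.

```python
GOOGLE_DOC_MIME = 'application/vnd.google-apps.document'


def build_title_query(title):
    """Build a Drive v2 search query for a non-trashed Google Doc by title."""
    # Backslashes and single quotes must be escaped inside the query string.
    escaped = title.replace('\\', '\\\\').replace("'", "\\'")
    return "title = '%s' and mimeType = '%s' and trashed = false" % (
        escaped, GOOGLE_DOC_MIME)


def find_doc_id(drive_service, title):
    """Return the ID of the first matching Google Doc, or None if there is none."""
    result = drive_service.files().list(
        q=build_title_query(title), maxResults=1
    ).execute()
    items = result.get('items', [])
    return items[0]['id'] if items else None


def copy_template(drive_service, template_id, new_title):
    """Copy a template document and return the ID of the new copy."""
    copied = drive_service.files().copy(
        fileId=template_id, body={'title': new_title}
    ).execute()
    return copied['id']
```

The ID returned by `copy_template` can then be handed to the Docs API for the content changes, so the template itself is never modified.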
hey isaac, i’ve dealt with this before. the drive API doesn’t directly convert HTML to google docs. what worked for me was uploading the HTML file first, then using files.update to change the mimetype to ‘application/vnd.google-apps.document’. it’s not perfect but gets the job done. for app engine, try using task queues to handle big uploads - that solved my timeout issues.
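One variant that may save the second step: the Drive API v2 reference lists a `convert` query parameter on `files.insert`, so it is worth trying the upload with `convert=True` before falling back to a separate update. The sketch below assumes that flag behaves as documented for HTML uploads and that `drive_service` is a v2 client; `encode_html`, `drain_resumable`, and `upload_html_as_doc` are illustrative names, not library functions. The `next_chunk()` loop is the client library's standard way to drive a resumable upload, and it is the part you would run inside a task queue worker.

```python
import io


def encode_html(html):
    """Wrap HTML text in a bytes stream, which is what the upload class reads."""
    return io.BytesIO(html.encode('utf-8'))


def drain_resumable(request):
    """Run a resumable upload to completion, one chunk per loop iteration."""
    response = None
    while response is None:
        # next_chunk() returns (status, response); response stays None
        # until the final chunk has been accepted.
        status, response = request.next_chunk()
    return response


def upload_html_as_doc(drive_service, html, title, resumable=False):
    """Upload HTML and ask Drive to convert it to a Google Doc (assumed v2 behavior)."""
    # Imported here so the helpers above can be used without the client library.
    from googleapiclient.http import MediaIoBaseUpload

    media = MediaIoBaseUpload(
        encode_html(html), mimetype='text/html', resumable=resumable)
    request = drive_service.files().insert(
        body={'title': title},
        media_body=media,
        convert=True,  # v2 query parameter requesting conversion to a Docs format
    )
    return drain_resumable(request) if resumable else request.execute()
```

If the conversion works for your account, the returned file resource should report the Google Docs MIME type rather than `text/html`; if it does not, the two-step update above is still available as a fallback.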