Converting various Google Drive file formats to HTML using Apps Script

I’m working on a project to change different file types in my Google Drive folder into HTML format. I want to use Google Apps Script to do this. The files I’m dealing with are Word docs, Excel sheets, PDFs, and Google’s own Docs and Sheets.

My end goal is to put these HTML files into a Salesforce Knowledge Base. But I’m running into a problem. When I try to run my script, I get an error message instead of the HTML files I want. The error says something about a missing ‘mimeType’ parameter.

Here’s a simplified version of what I’m trying to do:

function convertToHtml() {
  var sourceBox = DriveApp.getFolderById('source123');
  var targetBox = DriveApp.getFolderById('target456');
  var fileList = sourceBox.getFiles();
  
  while(fileList.hasNext()) {
    var currentFile = fileList.next();
    var htmlVersion = makeHtmlCopy(currentFile.getId());
    Logger.log(htmlVersion);
  }
}

function makeHtmlCopy(fileId) {
  var apiEndpoint = 'https://drive.google.com/uc?export=download&id=' + fileId;
  var options = {
    method: 'get',
    headers: {'Authorization': 'Bearer ' + ScriptApp.getOAuthToken()},
    contentType: 'text/html',
    muteHttpExceptions: true
  };
  var htmlContent = UrlFetchApp.fetch(apiEndpoint, options).getContentText();
  var newFile = DriveApp.createFile('converted_file.html', htmlContent, 'text/html');
  return newFile.getUrl();
}

Can anyone help me figure out what I’m doing wrong? How can I fix this ‘mimeType’ issue?

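I’ve encountered similar challenges when working with Google Drive file conversions. The ‘mimeType’ error you’re seeing likely stems from the endpoint you’re using: the drive.google.com/uc download link is meant for fetching a file as it is stored, not for converting it, and as far as I know it doesn’t accept a target format. Instead of relying on the download URL, I’d recommend using the Google Drive API v3, which handles exports explicitly.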

Here’s a suggestion: modify your ‘makeHtmlCopy’ function to call the Drive API’s files.export method. It takes the output format as a required ‘mimeType’ query parameter (for example, text/html), so you state the desired format explicitly. You’ll need to enable the Drive API in your project and adjust your code accordingly.
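As a rough sketch of that suggestion, the export call could look something like the code below. The function names buildExportUrl and exportAsHtml are made up for illustration, and the code assumes it runs inside Apps Script with a Drive scope already granted to the script.

```javascript
// Sketch only: builds a Drive API v3 files.export URL.
// buildExportUrl and exportAsHtml are illustrative names, not from the original post.
function buildExportUrl(fileId, exportMimeType) {
  return 'https://www.googleapis.com/drive/v3/files/' +
      encodeURIComponent(fileId) +
      '/export?mimeType=' + encodeURIComponent(exportMimeType);
}

// Fetches the exported HTML as text. Assumes the Apps Script globals
// UrlFetchApp and ScriptApp are available and the token carries a Drive scope.
function exportAsHtml(fileId) {
  var response = UrlFetchApp.fetch(buildExportUrl(fileId, 'text/html'), {
    method: 'get',
    headers: {Authorization: 'Bearer ' + ScriptApp.getOAuthToken()},
    muteHttpExceptions: true
  });
  // With muteHttpExceptions on, failures must be detected by status code.
  if (response.getResponseCode() !== 200) {
    throw new Error('Export failed: ' + response.getContentText());
  }
  return response.getContentText();
}
```

The result of exportAsHtml could then be passed to DriveApp.createFile in place of the htmlContent variable in the original function. This only applies to Google-native files such as Docs.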

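Additionally, make sure you handle each file type appropriately. The export method only works on Google-native files such as Docs and Sheets; uploaded Word, Excel, and PDF files are stored as-is and generally have to be converted to a Google format first (or processed some other way) before they can be exported as HTML. Consider implementing separate conversion logic for each file type to maximize compatibility and reliability.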

I’ve tackled similar conversion tasks before, and your approach is on the right track. However, the ‘mimeType’ error suggests a hiccup in how you’re handling file types. Instead of using a generic download URL, try the Google Drive API’s export method for Google Docs, Sheets, and Slides. For other file types like PDFs, you might need to use the Files: get method with the alt=media parameter. Keep in mind that alt=media returns the file’s original bytes, so a PDF comes back as a PDF, not as HTML.
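To illustrate the split between the two methods, here is a small sketch that picks the request URL from the file’s MIME type. The helper name buildDownloadUrl and the prefix check are my own assumptions about how you might structure it, not anything prescribed by the API.

```javascript
// Google-native types (Docs, Sheets, Slides) share this MIME prefix.
// Caveat: folders and shortcuts use the same prefix and cannot be exported.
var GOOGLE_NATIVE_PREFIX = 'application/vnd.google-apps.';

// Hypothetical helper: returns a files.export URL for Google-native files
// and a files.get URL with alt=media for everything else.
function buildDownloadUrl(fileId, sourceMimeType, exportMimeType) {
  var base = 'https://www.googleapis.com/drive/v3/files/' +
      encodeURIComponent(fileId);
  if (sourceMimeType.indexOf(GOOGLE_NATIVE_PREFIX) === 0) {
    return base + '/export?mimeType=' + encodeURIComponent(exportMimeType);
  }
  // alt=media downloads the stored bytes unchanged; no conversion happens.
  return base + '?alt=media';
}
```

In the original loop, currentFile.getMimeType() would supply the sourceMimeType argument.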

Here’s a tip from my experience: create a mapping object for different file types to their corresponding MIME types for export. This will allow you to dynamically set the correct export format based on the file’s original type. Also, make sure you account for API quotas and file size limits as these have tripped me up before.
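A mapping like that might be sketched as follows. The table names and the planConversion helper are hypothetical, and the specific export formats listed are my assumptions: check them against the export formats the Drive API reports for your account before relying on them (Sheets in particular, as far as I know, exports HTML as a zip archive, not as a single text/html file).

```javascript
// Assumed export formats for Google-native files; verify before use.
var EXPORT_MIME_BY_SOURCE = {
  'application/vnd.google-apps.document': 'text/html',
  'application/vnd.google-apps.spreadsheet': 'application/zip'
};

// Uploaded files that would first need converting to a Google format.
var CONVERT_TARGET_BY_SOURCE = {
  'application/vnd.openxmlformats-officedocument.wordprocessingml.document':
      'application/vnd.google-apps.document',
  'application/vnd.openxmlformats-officedocument.spreadsheetml.sheet':
      'application/vnd.google-apps.spreadsheet',
  'application/pdf': 'application/vnd.google-apps.document'
};

// Hypothetical helper: decides what to do with a file based on its MIME type.
function planConversion(sourceMimeType) {
  if (EXPORT_MIME_BY_SOURCE.hasOwnProperty(sourceMimeType)) {
    return {action: 'export',
            exportMimeType: EXPORT_MIME_BY_SOURCE[sourceMimeType]};
  }
  if (CONVERT_TARGET_BY_SOURCE.hasOwnProperty(sourceMimeType)) {
    var target = CONVERT_TARGET_BY_SOURCE[sourceMimeType];
    return {action: 'convertThenExport',
            convertTo: target,
            exportMimeType: EXPORT_MIME_BY_SOURCE[target]};
  }
  // Unknown types are skipped so one odd file does not stop the batch.
  return {action: 'skip'};
}
```

Keeping the formats in data tables means supporting another file type is a one-line change, with no new branch in the loop.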

Lastly, consider implementing proper error handling and retries in your script. Dealing with multiple files can lead to occasional network issues or API glitches, and a retry mechanism can help avoid disruptions.
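One way to add retries is a small wrapper with exponential backoff, sketched below. The name withRetries and its parameters are illustrative. The sleep argument is optional and only exists so the wrapper can be tried outside Apps Script; inside Apps Script it falls back to Utilities.sleep.

```javascript
// Hypothetical retry wrapper: calls operation() up to maxAttempts times,
// doubling the wait between attempts (baseDelayMs, 2x, 4x, ...).
function withRetries(operation, maxAttempts, baseDelayMs, sleep) {
  // Default to the Apps Script sleep; only referenced when a retry happens.
  sleep = sleep || function (ms) { Utilities.sleep(ms); };
  var lastError;
  for (var attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return operation();
    } catch (err) {
      lastError = err;
      if (attempt < maxAttempts) {
        sleep(baseDelayMs * Math.pow(2, attempt - 1));
      }
    }
  }
  // All attempts failed (or maxAttempts was less than 1).
  throw lastError || new Error('maxAttempts must be at least 1');
}
```

Note that the original script sets muteHttpExceptions to true, so UrlFetchApp.fetch will not throw on HTTP errors. For the wrapper to retry those, the wrapped operation has to check the response code and throw itself.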

Hey Mike, I’ve dealt with similar stuff. Your code looks OK, but the issue might be with the API endpoint. Try using the Google Drive API v3 instead; it’s more reliable for file conversions. Also, check that you have the necessary scopes enabled in your script. Good luck!