Facing 500 Internal Server Error in Google Cloud Function that retrieves Excel files from SFTP and updates records

I’m encountering a 500 internal server error with my Google Cloud Function. The function is intended to download Excel files from a specified SFTP server, convert those files into JSON format, and then use that data to update records in a database through API requests.

import os
import pysftp
import pandas as pd
import requests
import json

def download_excel_from_sftp(directory, file_list):
    '''
    Establishes a connection to the SFTP server to download Excel files
    '''
    SFTP_HOST = os.environ.get('SFTP_HOST')
    SFTP_USER = os.environ.get('SFTP_USER')
    SFTP_PASS = os.environ.get('SFTP_PASS')
    SFTP_PORT = int(os.environ.get('SFTP_PORT'))
    
    fetched_data = []
    options = pysftp.CnOpts()
    options.hostkeys = None
    
    with pysftp.Connection(SFTP_HOST, username=SFTP_USER, password=SFTP_PASS, port=SFTP_PORT, cnopts=options) as sftp:
        print('Connected to SFTP server')
        
        for excel in file_list:
            file_handle = sftp.open(directory + excel)
            data_frame = pd.read_excel(file_handle)
            fetched_data.append(data_frame)
    
    sftp.close()
    return fetched_data

def transform_data(df):
    '''
    Creates a list of records from the given DataFrame
    '''
    records = [
        {
            'id': 'recXYZ123',
            'fields': {
                'vote_count': int(df.iloc[1,2] + df.iloc[10,2])
            }
        },
        {
            'id': 'recXYZ456',
            'fields': {
                'vote_count': int(df.iloc[3,2] + df.iloc[12,2])
            }
        }
    ]
    return records

def update_records_in_database(records_to_update):
    '''
    Sends a request to update records in the database
    '''
    DATABASE_API_URL = os.environ.get('DATABASE_API_URL')
    DATABASE_API_TOKEN = os.environ.get('DATABASE_API_TOKEN')
    
    headers = {
        "Authorization": f"Bearer {DATABASE_API_TOKEN}",
        "Content-Type": "application/json"
    }
    
    response = requests.patch(DATABASE_API_URL + '/records', headers=headers, data=json.dumps({'records': records_to_update}))
    
    if response.status_code == 200:
        print("Records successfully updated")
    else:
        print("Failed to update records: " + str(response.status_code))

def main(request):
    folder_path = '/data/'
    files_to_process = ['fileA.xlsx', 'fileB.xlsx']
    
    excel_files = download_excel_from_sftp(folder_path, files_to_process)
    transformed_records = transform_data(excel_files[0])
    update_records_in_database(transformed_records)
    
    return 'Processing finished successfully'

The 500 shows up in the Flask (functions framework) logs, but I'm struggling to identify its source. The function works fine in my local environment but fails once deployed to Google Cloud. Any suggestions on what the issue could be?

Check your cloud function logs - that 500 error’s likely from a missing dependency or timeout. pysftp probably isn’t installing right in the cloud environment. Switch to paramiko instead. Also bump up your function timeout since SFTP + Excel processing takes forever.
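Concretely, make sure every package is declared in a requirements.txt deployed next to main.py - Cloud Functions only installs what's listed there, which is the usual reason something "works locally" and 500s in the cloud. A classic gotcha with this exact code: pandas.read_excel needs openpyxl for .xlsx files, which you probably have installed locally but never declared. Something like this (version pins are illustrative, not verified):

```
# requirements.txt — deployed alongside main.py
pysftp==0.2.9        # pulls in paramiko and cryptography
pandas==2.1.4
openpyxl==3.1.2      # pandas needs this engine to read .xlsx
requests==2.31.0
```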

Your error handling’s the problem. There are no try/except blocks anywhere, so any SFTP connection failure, file read issue, or failed API call bubbles up as an unhandled exception, which the functions framework turns into a generic 500.

I’ve hit this exact issue before. Cloud Functions are way stricter on networking and memory than local dev. Wrap each major operation in try-catch and actually log what’s breaking.
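A cheap way to do that without scattering try/except through every function is a small wrapper that logs which step blew up before re-raising (run_step is a made-up helper, not part of any framework):

```python
def run_step(name, fn, *args, **kwargs):
    """Run one pipeline step; log which step failed before re-raising."""
    try:
        return fn(*args, **kwargs)
    except Exception as exc:
        # This line is what you'll actually see in Cloud Logging.
        print(f'Step "{name}" failed: {exc!r}')
        raise

# Usage inside main(), e.g.:
#   excel_files = run_step('sftp download', download_excel_from_sftp,
#                          folder_path, files_to_process)
```

The re-raise keeps the 500 behavior, but now the log tells you which step to look at.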

That sftp.close() call is pointless too - the context manager already closed the connection when the with block exited. Ditch it; at best it’s a no-op.

Also check your environment variables are actually deployed. Can’t tell you how many times I’ve debugged functions where the env vars just weren’t there in production.
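You can make that failure mode obvious by validating the environment up front, instead of letting int(None) or a None host blow up mid-connection. check_required_env here is just a sketch:

```python
import os

REQUIRED_VARS = ['SFTP_HOST', 'SFTP_USER', 'SFTP_PASS', 'SFTP_PORT',
                 'DATABASE_API_URL', 'DATABASE_API_TOKEN']

def check_required_env(names):
    """Return the subset of names that are missing or empty."""
    return [n for n in names if not os.environ.get(n)]

# At the top of main():
#   missing = check_required_env(REQUIRED_VARS)
#   if missing:
#       return 'Missing env vars: ' + ', '.join(missing), 500
```

A "Missing env vars: SFTP_PORT" in the logs beats a bare 500 every time.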

Wrap your main function in try-except first and return the actual error message. You’ll know exactly what’s failing instead of guessing.
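Something like this - run_pipeline here is a stand-in for your existing download/transform/update calls:

```python
import traceback

def run_pipeline():
    # Stand-in for: download_excel_from_sftp -> transform_data
    #               -> update_records_in_database
    raise RuntimeError('simulated failure')

def main(request):
    try:
        run_pipeline()
        return 'Processing finished successfully'
    except Exception as exc:
        # Full traceback goes to Cloud Logging; the caller still gets a 500,
        # but now it carries the actual error message.
        print(traceback.format_exc())
        return f'Internal error: {exc}', 500
```

Returning a (body, status) tuple works because the Python functions framework is Flask under the hood.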