I'm having trouble building a DataFrame from JSON responses returned by the OpenAI API, which I query for CEO details of prominent companies. I start from this DataFrame:
import pandas as pd

company_data = pd.DataFrame({
    'Position': [1, 2, 3],
    'Corporation': ['Facebook', 'Tesla', 'Netflix'],
    'Sales': ['$117,929', '$40,540', '$29,699'],
    'Margin': ['30%', '11.3%', '19.2%'],
    'Holdings': ['$42,171', '$24,524', '$15,039'],
    'Valuation': ['22%', '14%', '18%']
})
company_data.head()
However, when I loop over this data and request CEO information from the API, some rows come back with empty fields while others are complete. If I send the same requests one at a time, outside the loop, each one returns complete information.
import json

from openai import OpenAI

# Assumed setup (not shown in my original snippet): reads OPENAI_API_KEY from the environment.
openai_client = OpenAI()

data_fields = ["Corporation", "Country", "Field", "CEO Name", "Undergraduate Degree", "Institution", "MBA", "Graduate School"]
rows = []

for idx, record in company_data.iterrows():
    corp_name = record['Corporation']
    query_text = f"""Fetch details for {corp_name} following this JSON schema: {data_fields}"""
    response = openai_client.chat.completions.create(
        model="gpt-3.5-turbo-0125",
        response_format={"type": "json_object"},
        messages=[
            {"role": "system", "content": "You are an assistant that returns JSON formatted data."},
            {"role": "user", "content": query_text}
        ]
    )
    api_response = json.loads(response.choices[0].message.content)
    # Any key the reply does not contain under exactly this name becomes "".
    rows.append({field: api_response.get(field, "") for field in data_fields})

# Build the result once instead of calling pd.concat on every iteration.
result_df = pd.DataFrame(rows, columns=data_fields)
What might be causing this inconsistency while looping through the records?
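One way to narrow this down might be to check which keys each raw reply actually contains, since `.get(field, "")` silently hides any key that is named differently or nested. Below is a minimal diagnostic sketch, not a confirmed explanation. It assumes the model may sometimes rename keys or wrap the fields in a single outer object; neither has been verified against real replies. The helpers `normalize_payload` and `missing_fields` are hypothetical names, and the sample replies are hand-written placeholders rather than API output.

```python
import json

EXPECTED_FIELDS = ["Corporation", "Country", "Field", "CEO Name",
                   "Undergraduate Degree", "Institution", "MBA", "Graduate School"]


def normalize_payload(payload):
    """Unwrap a single outer object, e.g. {"company": {...}} -> {...}."""
    if isinstance(payload, dict) and len(payload) == 1:
        (inner,) = payload.values()
        if isinstance(inner, dict):
            return inner
    return payload


def missing_fields(raw_json, expected=EXPECTED_FIELDS):
    """Return the expected keys that are absent from a raw JSON reply."""
    payload = normalize_payload(json.loads(raw_json))
    return [field for field in expected if field not in payload]


# Hand-written placeholder replies (assumptions, not real API output).
flat_reply = json.dumps({field: "x" for field in EXPECTED_FIELDS})
renamed_reply = json.dumps({"Corporation": "Tesla", "CEO": "x"})
nested_reply = json.dumps({"company": {field: "x" for field in EXPECTED_FIELDS}})

for label, reply in [("flat", flat_reply), ("renamed", renamed_reply), ("nested", nested_reply)]:
    print(label, missing_fields(reply))
```

Calling `missing_fields(response.choices[0].message.content)` inside the loop and printing the result per company should show whether the empty cells come from keys that differ from `data_fields` rather than from the loop itself.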