I’m confused about how OpenAI counts tokens in my API calls. When I send a basic message like “hello” to the GPT-4 model, I expected it to use just 1 token, based on what token counting tools show me. Since GPT-4 has a context limit of 8192 tokens, I figured I could set my max_tokens parameter to 8191 to leave room for that single input token.
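For reference, this is roughly how I’m counting tokens on my end (a sketch using tiktoken, which I’m assuming matches whatever tokenizer the API uses for GPT-4):

import tiktoken

# Get the tokenizer tiktoken associates with gpt-4 (cl100k_base)
enc = tiktoken.encoding_for_model("gpt-4")

# Count only the raw text of the message content
token_ids = enc.encode("hello")
print(len(token_ids))  # prints 1 - "hello" is a single token on its own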
But the API keeps telling me my message uses 8 tokens instead of 1. This doesn’t make sense to me. Here’s what I’m sending:
import requests

api_key = "your-api-key-here"

headers = {
    "Content-Type": "application/json",
    "Authorization": f"Bearer {api_key}"
}

# max_tokens is set to 8191: the 8192 context limit minus the 1 token
# I expected "hello" to take up
payload = {
    "model": "gpt-4",
    "max_tokens": 8191,
    "messages": [
        {
            "role": "user",
            "content": "hello"
        }
    ]
}

response = requests.post(
    "https://api.openai.com/v1/chat/completions",
    headers=headers,
    json=payload
)
print(response.json())
The error I get back says my request needs 8199 tokens total (8 for the messages plus 8191 for the completion), but the limit is 8192. Why does “hello” count as 8 tokens when it should be far fewer? Am I missing something about how the API counts tokens?
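In case it matters, this is how I was planning to double-check the count the API itself reports: lower max_tokens so the request actually succeeds, then read the usage object from the response (reusing the headers and payload from the code above, and assuming I’m reading the response format correctly):

# Same request as above, but with a small max_tokens so it goes through,
# then print the token counts the API reports back
payload["max_tokens"] = 50
resp = requests.post(
    "https://api.openai.com/v1/chat/completions",
    headers=headers,
    json=payload
)
usage = resp.json().get("usage", {})
print(usage.get("prompt_tokens"))      # the API's count for my "hello" message
print(usage.get("completion_tokens"))  # tokens generated in the reply
print(usage.get("total_tokens"))       # sum of the two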