I’m looking for clarification on how token consumption works with the OpenAI Assistants API as opposed to the Chat Completions API.
From what I understand, the Chat Completions API requires me to resend the entire conversation history with each request, so I am charged for all of those tokens again on every call.
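To make the concern concrete, here is a rough sketch of how I picture input tokens adding up when the full history is resent on every turn. It makes no API calls. The `count_tokens` helper is a hypothetical stand-in that just counts whitespace-separated words, so the numbers only illustrate the growth pattern and are not real token counts or prices.

```python
from typing import List


def count_tokens(text: str) -> int:
    # Hypothetical stand-in for a real tokenizer: counts whitespace-separated
    # words. Real token counts will differ.
    return len(text.split())


def billed_input_tokens_per_turn(
    user_messages: List[str], assistant_replies: List[str]
) -> List[int]:
    """Input tokens sent on each turn when the whole history is resent."""
    history: List[str] = []
    billed: List[int] = []
    for user_msg, reply in zip(user_messages, assistant_replies):
        history.append(user_msg)
        # Every request carries all earlier messages plus the new one.
        billed.append(sum(count_tokens(m) for m in history))
        # The reply becomes part of the history for the next request.
        history.append(reply)
    return billed


# Made-up conversation, for illustration only.
user_messages = ["hello there", "tell me more", "thanks a lot"]
assistant_replies = [
    "hi how can I help",
    "sure here is more detail",
    "you are welcome",
]

per_turn = billed_input_tokens_per_turn(user_messages, assistant_replies)
only_new = sum(count_tokens(m) for m in user_messages)

print(per_turn)       # [2, 10, 18]
print(sum(per_turn))  # 30 input "tokens" in total with resending
print(only_new)       # 8 if only the newest message counted each turn
```

The gap between those two totals (30 versus 8 in this toy case) widens as the conversation gets longer, which is why I want to know which model applies.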
The Assistants API, on the other hand, keeps the conversation history for me, which is handy. What I’m unsure about is how that affects billing.
When I send a new message, does the billed token count for the Assistants API include all earlier messages in the conversation, or am I charged only for the latest message? I want to understand the billing model before I start running longer conversations.