How to compute OpenAI API costs when dealing with cached tokens

I’m working with OpenAI’s API and getting confused about how to handle cached tokens in my cost calculations. When I get the usage data back from the API, I see separate values for regular input tokens and cached input tokens.

Here’s what I’m dealing with:

  • Regular input tokens: 1204
  • Cached input tokens: 1024
  • Output tokens: 12

The pricing structure is:

  • Regular input: $0.150 per 1M tokens
  • Cached input: $0.075 per 1M tokens
  • Output: $0.600 per 1M tokens

My question is whether the input token count already excludes the cached ones, or if I need to manually subtract the cached amount from the total input count before doing my cost math. I thought maybe the input value would automatically be reduced to 180 (1204 minus 1024), but I’m not sure if that’s how it works.

Has anyone figured out the right way to calculate costs when you have both regular and cached input tokens?

Based on my experience with OpenAI API billing, those two counts are not independent buckets: `prompt_tokens` (1204 in your case) is the total, and `prompt_tokens_details.cached_tokens` (1024) tells you how many of those were served from the cache. The cached count is a subset of the total, so you do subtract before applying the regular rate. Your calculation would be: ((1204 − 1024) × $0.150/1M) + (1024 × $0.075/1M) + (12 × $0.600/1M), i.e. 180 uncached input tokens at the full rate and 1024 cached tokens at the discounted rate. I made the same mistake initially, treating the two numbers as separate charges, but that double-bills the cached tokens: each input token is billed exactly once, at the cached rate if it was a cache hit and at the regular rate otherwise.
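Here's a minimal sketch of that math as a helper function. The rates are hard-coded to the pricing in the question; `openai_cost_usd` is a hypothetical name, not anything from the OpenAI SDK, and it assumes the cached count is a subset of the input total as described above.

```python
def openai_cost_usd(input_tokens: int, cached_tokens: int, output_tokens: int) -> float:
    """Cost of one request in USD. Rates are per 1M tokens (from the question)."""
    INPUT_RATE = 0.150 / 1_000_000   # regular (uncached) input
    CACHED_RATE = 0.075 / 1_000_000  # cached input (discounted)
    OUTPUT_RATE = 0.600 / 1_000_000  # output

    # cached_tokens is included in input_tokens, so bill only the remainder
    # at the regular rate.
    uncached = input_tokens - cached_tokens
    return (uncached * INPUT_RATE
            + cached_tokens * CACHED_RATE
            + output_tokens * OUTPUT_RATE)

print(f"${openai_cost_usd(1204, 1024, 12):.6f}")  # → $0.000111
```

For your numbers that works out to 180 × $0.150/1M + 1024 × $0.075/1M + 12 × $0.600/1M ≈ $0.000111 per request.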

yep, the cached tokens are included in the input total, so you do need to subtract: (1204 − 1024) = 180 tokens at $0.150/M, 1024 at $0.075/M, and 12 at $0.600/M. openai reports the cached portion under prompt_tokens_details, so just pull it out, subtract it from the total, and sum the three costs.
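To make the field layout concrete, here's a sketch of pulling the counts out of a Chat Completions `usage` payload. The sample dict below is made up to mirror the numbers in the question; the field names assume the Chat Completions response shape (`prompt_tokens`, `completion_tokens`, `prompt_tokens_details.cached_tokens`).

```python
# Made-up usage payload mirroring the question's numbers.
usage = {
    "prompt_tokens": 1204,      # total input, cached portion included
    "completion_tokens": 12,
    "prompt_tokens_details": {"cached_tokens": 1024},
}

cached = usage["prompt_tokens_details"]["cached_tokens"]
uncached = usage["prompt_tokens"] - cached  # 180: only this part gets the full rate
print(uncached, cached, usage["completion_tokens"])
```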