Why not just count characters or words?
Tokenizers split text differently per model family, so neither count maps reliably onto what the model processes. English averages roughly four characters per token, but code, Chinese, or long URLs can yield noticeably more or fewer tokens for the same number of characters. APIs standardize on tokens so limits and invoices line up with what the model actually processes.
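To make the gap between words, characters, and tokens concrete, here is a minimal sketch that applies the "roughly four characters per token" rule of thumb. The function name `estimate_tokens` and the `chars_per_token` parameter are made up for illustration. This is not a real tokenizer, and the estimate can be well off for code, Chinese, or URLs.

```python
import math


def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Rough token estimate from character count.

    Assumption: about 4 characters per token, which only holds on
    average for English prose. A real tokenizer will give different
    numbers, and those numbers vary by model family.
    """
    if not text:
        return 0
    return math.ceil(len(text) / chars_per_token)


sample = "Tokens are the unit that limits and invoices are measured in."

# Three different ways to "size" the same text.
print("words:", len(sample.split()))
print("characters:", len(sample))
print("estimated tokens:", estimate_tokens(sample))
```

For anything that affects a limit or a bill, the count reported by the provider's own tokenizer or API response is the number to trust. A heuristic like this is only useful for quick sanity checks.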
What shows up on your invoice
Most bills list input tokens and output tokens separately. Hidden context, such as long system prompts or retrieved documents, is part of the input: it counts toward the context window limit and is typically billed as input tokens. That is why "we only sent a short question" can still produce a large, expensive prompt when RAG (retrieval-augmented generation) is enabled. For a detailed walkthrough of how providers price these tokens, see the AI token pricing guide.
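The arithmetic behind that "short question, large prompt" effect can be sketched as follows. All names (`Request`, `cost_usd`) and all token counts and prices are hypothetical placeholders, not any provider's real rates. The sketch also assumes the provider bills the system prompt and retrieved documents as ordinary input tokens.

```python
from dataclasses import dataclass


@dataclass
class Request:
    """Token counts for one API call (all values are illustrative)."""
    system_prompt_tokens: int
    retrieved_doc_tokens: int
    question_tokens: int
    output_tokens: int

    @property
    def input_tokens(self) -> int:
        # Assumption: everything sent to the model counts as input,
        # including context the end user never sees.
        return (
            self.system_prompt_tokens
            + self.retrieved_doc_tokens
            + self.question_tokens
        )


def cost_usd(req: Request,
             input_price_per_million: float,
             output_price_per_million: float) -> float:
    """Price input and output separately, as most invoices do."""
    return (
        req.input_tokens * input_price_per_million
        + req.output_tokens * output_price_per_million
    ) / 1_000_000


# A 20-token question wrapped in a system prompt and RAG documents.
req = Request(
    system_prompt_tokens=800,
    retrieved_doc_tokens=6000,
    question_tokens=20,
    output_tokens=300,
)

# Placeholder prices in USD per million tokens, not real rates.
total = cost_usd(req, input_price_per_million=1.00,
                 output_price_per_million=2.00)

print("input tokens:", req.input_tokens)
print("question share of input:", req.question_tokens / req.input_tokens)
print("cost (USD):", total)
```

In this made-up example the visible question is a tiny fraction of the input. Nearly all of the input tokens come from the system prompt and the retrieved documents.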
Blended products and honesty
Some vendors blend premium and economy capacity behind one meter. If that applies to you, say so in plain language—see hybrid token lanes and our disclosure.