TokenLanding

Input vs output tokens: two meters on the same call

LLM APIs usually meter prompt tokens and completion tokens at different prices. Learn what counts as each, why output can dominate cost, and how to forecast usage.

2026-04

TL;DR

LLM APIs bill input (prompt) and output (completion) tokens at different rates. Output tokens usually cost 3–5× more and dominate your bill.
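Because the two meters carry different rates, per-call cost is just a weighted sum of the two counts. A minimal sketch, using illustrative prices (not any provider's real rates) with output priced at 5× input:

```python
# Illustrative, assumed prices -- not any provider's actual rates.
INPUT_PRICE_PER_1K = 0.003   # $ per 1K prompt tokens (assumed)
OUTPUT_PRICE_PER_1K = 0.015  # $ per 1K completion tokens (assumed, 5x input)

def call_cost(input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one call under the assumed prices."""
    return (input_tokens / 1000) * INPUT_PRICE_PER_1K \
         + (output_tokens / 1000) * OUTPUT_PRICE_PER_1K

# A short question with a long answer: the output meter dominates.
cost = call_cost(200, 1500)
print(round(cost, 4))  # the 1,500 output tokens account for ~97% of spend
```

Even with a tiny prompt, the completion side carries nearly the whole bill, which is why output tokens are the first place to look when costs spike.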

Why completions swing costs

A verbose assistant can double spend even when the question was tiny. Capping completion length, adopting structured outputs, and trimming reasoning traces in production are the levers teams combine with routing and caching.
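Capping completion length is the simplest of those levers to reason about: it turns an open-ended bill into a bounded one. A sketch under assumed rates (the cap values and prices are illustrative):

```python
# Worst-case spend per call when completion length is capped.
# Rates and caps are illustrative assumptions, not real prices.
def worst_case_cost(input_tokens: int, max_output_tokens: int,
                    in_rate: float = 0.003, out_rate: float = 0.015) -> float:
    """Upper bound: assume the model emits exactly max_output_tokens."""
    return (input_tokens / 1000) * in_rate \
         + (max_output_tokens / 1000) * out_rate

loose = worst_case_cost(500, 4096)  # generous default cap
tight = worst_case_cost(500, 512)   # tighter cap for short-form answers
print(loose, tight)  # the tighter cap bounds worst-case spend far lower
```

The same arithmetic works per-route: a summarization endpoint might justify a large cap while a classification endpoint never needs more than a few dozen output tokens.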

Tool calls and hidden bytes

Function schemas and intermediate tool results usually bill as input on the next turn—or inline if your SDK bundles them. Surface that in customer-facing docs so no one is surprised at month-end.
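The surprise usually comes from schemas and tool results being resent as prompt context on every subsequent turn, so input spend grows with conversation length. A toy accounting sketch (all token counts are assumed for illustration):

```python
# Toy model of how tool schemas and tool results re-enter the input
# meter each turn. All token counts here are illustrative assumptions.
SCHEMA_TOKENS = 300  # tool/function schemas, resent with every request

def turn_input_tokens(history_tokens: int, new_user_tokens: int) -> int:
    """Next turn's prompt = schemas + accumulated history + new message."""
    return SCHEMA_TOKENS + history_tokens + new_user_tokens

history = 0
billed_input = []
# (user message tokens, tool result tokens) per turn
for user_toks, tool_result_toks in [(50, 400), (30, 250)]:
    billed_input.append(turn_input_tokens(history, user_toks))
    # The tool result joins the history and bills as input next turn.
    history += user_toks + tool_result_toks

print(billed_input)  # [350, 780] -- input grows even though users typed little
```

Note how the second turn's input more than doubles despite a shorter user message; the first turn's 400-token tool result is now paying rent on the input meter.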

Hybrid and blended meters

If your product blends premium-path and economy models under one price list, explain which lane receives which traffic—see hybrid tokens and the disclosure pattern.
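When two lanes sit behind one price list, the number customers actually experience is the traffic-weighted blend. A sketch with assumed per-lane rates and traffic share:

```python
# Effective blended rate across a premium and an economy lane.
# Rates and the traffic split are illustrative assumptions.
def blended_rate(premium_rate: float, economy_rate: float,
                 premium_share: float) -> float:
    """Traffic-weighted average rate; premium_share is in [0, 1]."""
    return premium_rate * premium_share + economy_rate * (1 - premium_share)

# 20% of traffic routed to the premium lane under assumed rates.
print(blended_rate(0.015, 0.003, 0.2))
```

Publishing the split alongside the blended number is the disclosure pattern: customers can verify the blend instead of discovering the routing policy from their invoice.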

FAQ

What is the difference between input and output tokens?
Input tokens are what you send to the model (prompts, system instructions, context). Output tokens are what the model generates back. Output tokens typically cost 3–5× more than input tokens.
Why do output tokens cost more than input tokens?
Output tokens require the model to generate new text step by step, which is more computationally expensive than processing input. This is why output tokens typically cost 3–5× more.

Ready to cut your token bill?

Token Landing — hybrid AI tokens, Claude-class UX, saner spend
