Per-Request Cost Breakdown
LLM pricing is typically quoted per 1 million tokens, which makes it hard to intuit what a single API call actually costs. This table breaks it down for a typical request: 1,000 input tokens (a moderate system prompt + user message) and 500 output tokens (a paragraph-length response).
| Model | Input Cost | Output Cost | Total Per Request |
|---|---|---|---|
| Mistral Nemo | $0.00002 | $0.00002 | $0.00004 |
| GPT-4o-mini | $0.00015 | $0.00030 | $0.00045 |
| Gemini 2.5 Flash | $0.00015 | $0.00030 | $0.00045 |
| DeepSeek V3 | $0.00028 | $0.00021 | $0.00049 |
| GPT-5-mini | $0.00025 | $0.00100 | $0.00125 |
| Claude Haiku 3.5 | $0.00080 | $0.00200 | $0.00280 |
| Gemini 2.5 Pro | $0.00125 | $0.00500 | $0.00625 |
| GPT-4o | $0.00250 | $0.00500 | $0.00750 |
| Mistral Large | $0.00200 | $0.00300 | $0.00500 |
| Claude Sonnet 4 | $0.00300 | $0.00750 | $0.01050 |
| Claude Opus 4.6 | $0.00500 | $0.01250 | $0.01750 |
| GPT-5 | $0.01000 | $0.01500 | $0.02500 |
| Token Landing Hybrid | ~$0.00080 – $0.00150 | ~$0.00150 – $0.00300 | ~$0.00230 – $0.00450 |
Based on 1,000 input + 500 output tokens per request. Actual costs vary with prompt length. Prices approximate, April 2026.
What These Numbers Mean in Practice
At the budget end, a single GPT-4o-mini call costs less than a twentieth of a cent. At the premium end, a GPT-5 call costs two and a half cents. These sound trivially small, but they add up fast at scale:
- 1,000 requests/day: GPT-4o costs $7.50/day ($225/month). DeepSeek V3 costs $0.49/day ($15/month).
- 10,000 requests/day: GPT-4o costs $75/day ($2,250/month). Claude Sonnet 4 costs $105/day ($3,150/month).
- 100,000 requests/day: Even GPT-4o-mini costs $45/day ($1,350/month). GPT-5 would cost $2,500/day ($75,000/month).
Why Per-Request Thinking Matters
Thinking about cost per request (rather than per million tokens) helps you make better architectural decisions:
- Is this request worth a premium model? A simple classification does not need a $0.025 GPT-5 call when a $0.00045 GPT-4o-mini call works.
- What is the cost of a retry? If a cheap model fails and requires a retry on a premium model, the total cost might exceed just using the premium model once.
- Where are the cost hotspots? That one endpoint doing 50,000 requests/day at $0.0105 each is costing $15,750/month. Route it through hybrid routing and save $6,000-10,000/month.
Optimizing Per-Request Costs
The most effective way to reduce per-request cost is routing each request to the right model tier. Token Landing does this automatically. Simple requests go to budget models ($0.0005/request), medium complexity goes to mid-tier ($0.003-0.006/request), and only complex tasks use premium models ($0.01-0.025/request).
Combined with prompt caching and batch processing, most teams can achieve an effective per-request cost of $0.002-0.005 — premium-quality results at budget prices.