Complete LLM API Pricing Ranking (April 2026)
LLM API prices dropped roughly 80% across the board since 2024. Here is every major model ranked from cheapest to most expensive as of April 2026.
| Rank | Model | Input (per 1M) | Output (per 1M) | Provider |
|---|---|---|---|---|
| 1 | GPT-5 Nano | $0.05 | $0.40 | OpenAI |
| 2 | Gemini 2.5 Flash-Lite | $0.10 | $0.40 | |
| 3 | Gemini 2.5 Flash | $0.15 | $0.60 | |
| 4 | DeepSeek V3.2 | $0.28 | $0.42 | DeepSeek |
| 5 | Grok 3 Mini | $0.30 | $0.50 | xAI |
| 6 | DeepSeek R1 | $0.55 | $2.19 | DeepSeek |
| 7 | Claude Haiku 4.5 | $0.80 | $4.00 | Anthropic |
| 8 | Gemini 2.5 Pro | $1.25 | $10.00 | |
| 9 | Mistral Large | $2.00 | $6.00 | Mistral |
| 10 | GPT-5.4 | $2.50 | $10.00 | OpenAI |
| 11 | Claude Sonnet 4.6 / Grok 3 | $3.00 | $15.00 | Anthropic / xAI |
| 12 | Claude Opus 4.6 | $5.00 | $25.00 | Anthropic |
| 13 | GPT-5 | $10.00 | $30.00 | OpenAI |
| 14 | O3 Pro | $150.00 | $600.00 | OpenAI |
Prices as of April 2026. DeepSeek V3.2 offers an additional 90% cache discount on repeated contexts.
Budget Tier: Under $1.00 per 1M Input Tokens
GPT-5 Nano at $0.05/$0.40 is the new cheapest model from a major provider — 3x cheaper than GPT-5 Nano was. It handles classification, entity extraction, sentiment analysis, and simple Q&A at near-zero cost.
Gemini 2.5 Flash-Lite at $0.10/$0.40 is Google's budget entry, comparable to Nano in pricing with strong multimodal support.
DeepSeek V3.2 at $0.28/$0.42 remains the best quality-per-dollar in the budget tier, with a 90% cache discount making repeated-context workloads almost free. Grok 3 Mini at $0.30/$0.50 is xAI's budget contender with real-time data access.
Claude Haiku 4.5 at $0.80/$4.00 bridges budget and mid-tier — pricier but delivers noticeably better writing and nuanced reasoning quality.
Mid Tier: $1.00 - $5.00 per 1M Input Tokens
This is where most production workloads live. Gemini 2.5 Pro at $1.25/$10.00 leads on value with a 1M context window. GPT-5.4 at $2.50/$10.00 is the new OpenAI workhorse replacing GPT-5.4. Claude Sonnet 4.6 and Grok 3 both sit at $3.00/$15.00 — Claude leads on code quality (top SWE-bench), Grok on real-time knowledge.
Claude Opus 4.6 at $5.00/$25.00 features a 1M context window and 128K output limit — the strongest option for large code generation in a single response.
Premium Tier: $10.00+ per 1M Input Tokens
GPT-5 at $10.00/$30.00 excels at complex multi-step agents and tool orchestration. O3 Pro at $150/$600 is the frontier reasoning model for research, math proofs, and problems requiring deep multi-step thinking — not for general production use.
How Token Landing Optimizes Across Tiers
Rather than picking a single model, Token Landing's hybrid routing blends tiers automatically. Route 70-80% of traffic through budget models (GPT-5 Nano, DeepSeek V3.2) and reserve premium models for complex requests. Effective rate: $0.50-2.00 input / $2.00-8.00 output per 1M tokens — premium quality where it matters, at a fraction of premium-only pricing.
Zero markup, zero fees (unlike OpenRouter's 5.5% surcharge). One API endpoint, routing handles the rest.