GPT-5 Model Lineup & Pricing (April 2026)
OpenAI's GPT-5 family spans five tiers, from the ultra-cheap Nano to the reasoning-heavy O3 Pro. Prices dropped roughly 80% across the industry since 2024, but GPT-5 flagship remains premium.
| Model | Input (per 1M) | Output (per 1M) | Context | Best For |
|---|---|---|---|---|
| GPT-5 Nano | $0.05 | $0.40 | 128K | High-volume classification, extraction, simple Q&A |
| GPT-5.4 | $2.50 | $10.00 | 128K | General production use, balanced cost/quality |
| GPT-5.1 | $5.00 | $15.00 | 400K | Long-context tasks, large document analysis |
| GPT-5 | $10.00 | $30.00 | 256K | Complex reasoning, multi-step agents |
| O3 Pro | $150.00 | $600.00 | 200K | Frontier reasoning, research, math proofs |
Prices as of April 2026. Check OpenAI's pricing page for current rates.
GPT-5 vs Competition — Price Comparison
| Model | Input/1M | Output/1M | Context | Strength |
|---|---|---|---|---|
| GPT-5 | $10.00 | $30.00 | 256K | Agents, planning, tool use |
| Claude Opus 4.6 | $5.00 | $25.00 | 1M | Code, long context, 128K output |
| Gemini 2.5 Pro | $1.25 | $10.00 | 1M | Multimodal, large context |
| DeepSeek V3.2 | $0.28 | $0.42 | 128K | Budget bulk, 90% cache discount |
| Grok 3 | $3.00 | $15.00 | 128K | Real-time data, cost-efficient |
GPT-5 is 2x pricier than Claude Opus 4.6 on input and 20% more on output. For pure cost, DeepSeek V3.2 is 36x cheaper on input — but quality and reliability differ significantly.
Monthly Cost Estimates
Assuming 1,000 input tokens + 500 output tokens per request:
| Daily Requests | GPT-5 Nano | GPT-5.4 | GPT-5 | O3 Pro |
|---|---|---|---|---|
| 1,000 | $7.50 | $225 | $750 | $13,500 |
| 10,000 | $75 | $2,250 | $7,500 | $135,000 |
| 100,000 | $750 | $22,500 | $75,000 | $1,350,000 |
How to Cut GPT-5 Costs
1. Prompt Caching (50% off)
OpenAI automatically caches repeated prompt prefixes. If your system prompt stays the same across requests, cached input tokens cost 50% less. For a 2,000-token system prompt called 10,000 times/day, that saves ~$300/month on GPT-5.
2. Batch API (50% off)
Non-real-time workloads (content generation, data extraction, evaluation) can use OpenAI's Batch API for a flat 50% discount on all models. GPT-5 drops from $10/$30 to $5/$15 — matching Claude Opus 4.6's standard pricing.
3. Model Routing
The biggest savings come from not using GPT-5 for everything. Route simple requests to GPT-5 Nano ($0.05) and only escalate to GPT-5 when needed. A typical workload where 80% of requests are simple can save 60-80% of total API spend.
4. Hybrid Cross-Provider Routing
Token Landing takes routing further — blend GPT-5.4 with Claude Sonnet 4.6 or DeepSeek V3.2 through a single OpenAI-compatible endpoint. You keep the same SDK code, but your effective cost drops to $0.80-2.00/M input depending on your quality tier configuration.
GPT-5 Nano — The Budget Breakthrough
At $0.05/M input tokens, GPT-5 Nano is the cheapest model from a major provider (tied with Gemini 2.5 Flash-Lite at $0.10). It handles:
- Text classification and sentiment analysis
- Entity extraction and tagging
- Simple Q&A and FAQ bots
- Content moderation and filtering
- Lightweight summarization
For high-volume pipelines processing millions of requests/day, Nano makes GPT-class quality accessible at costs that were unimaginable in 2024.
When to Use GPT-5 vs Alternatives
- Use GPT-5 for: complex multi-step agents, tool orchestration, tasks where GPT-5.4 measurably underperforms
- Use GPT-5.4 for: general production workloads, chat, content generation — best price/performance ratio in the GPT family
- Use Claude Opus 4.6 for: code generation (top SWE-bench scores), very long documents (1M context), large output generation (128K output)
- Use DeepSeek V3.2 for: budget bulk processing, repeated-context workloads (90% cache discount), cost-sensitive applications
- Use GPT-5 Nano for: classification, extraction, moderation — anything high-volume and low-complexity