Updated model price estimator

API Cost Calculator

Compare LLM API pricing before a prototype becomes a surprise invoice. Estimate per-request cost, monthly spend, cached prompt savings, and batch pricing across OpenAI, Claude, Gemini, DeepSeek, Groq, xAI, Mistral, and more.

Providers

Models

109

Latest

Model

Compare with

Input tokens

Prompt, context, retrieved docs

Output tokens

Visible answer plus reasoning

Requests/day

Expected production volume

Estimated cost: $0.1620 per request

Input: $0.0360

Output: $0.1260

Month: $12,150

Input mix: 63%

vs GPT-5.4: $11,137.50 more per month

Cheapest blended model: ERNIE Speed (Baidu)

Longest context window: GPT-5.5 Pro (OpenAI)

Selected model: GPT-5.5 Pro (OpenAI)

Compare model pricing

Search the rate card, filter by provider, then click a row to load it into the calculator.

Columns: Model, Provider, Input, Output, Context

Best Models by Use Case

Hand-picked recommendations for common workloads. Click any model name to load it into the calculator.

Best for Coding

Models that excel at code generation, debugging, and refactoring across multiple languages.

Claude Opus 4.8: Top-tier code understanding and generation with extended thinking for complex refactors.

GPT-5.4: Strong general-purpose coding with excellent tool use and instruction following.

Claude Sonnet 4.6: Best price-to-performance ratio for daily coding tasks and agentic workflows.

Budget Pick: DeepSeek V4 Flash

Impressive coding ability at a fraction of the cost of frontier models. ($0.14/$0.28 per 1M tokens)

How to read the bill

API pricing is simple until tools, caching, and reasoning tokens join the party. Input tokens are everything you send: system prompt, chat history, files, retrieved chunks, and tool schemas. Output tokens are what the model generates, including invisible thinking tokens when a provider bills for them.

For production estimates, test with real prompts and real documents. A pricing calculator gives you the shape of the spend; provider usage logs give you the final truth.
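As a rough sketch of the token math described above (not the calculator's actual code), per-request cost can be computed from token counts and per-1M-token rates. The `Rate` class and the function names are made up for illustration. The rates are the DeepSeek V4 Flash figures quoted on this page; the token counts, request volume, and 30-day month are assumptions.

```python
from dataclasses import dataclass

TOKENS_PER_MILLION = 1_000_000


@dataclass(frozen=True)
class Rate:
    """Hypothetical rate-card entry: USD per 1M tokens."""
    input_per_m: float
    output_per_m: float


def request_cost(rate: Rate, input_tokens: int, output_tokens: int) -> tuple[float, float]:
    """Return (input_cost, output_cost) in USD for a single request."""
    input_cost = input_tokens * rate.input_per_m / TOKENS_PER_MILLION
    output_cost = output_tokens * rate.output_per_m / TOKENS_PER_MILLION
    return input_cost, output_cost


def monthly_cost(per_request: float, requests_per_day: int, days: int = 30) -> float:
    """Scale a per-request cost to a month; the 30-day month is an assumption."""
    return per_request * requests_per_day * days


# Rates quoted on this page for DeepSeek V4 Flash; token counts and volume are invented.
flash = Rate(input_per_m=0.14, output_per_m=0.28)
input_cost, output_cost = request_cost(flash, input_tokens=4_000, output_tokens=1_000)
per_request = input_cost + output_cost

print(f"per request: ${per_request:.6f}")
print(f"per month:   ${monthly_cost(per_request, requests_per_day=2_500):,.2f}")
```

Remember that `output_tokens` should include reasoning tokens whenever the provider bills them as output.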

Check official pricing for OpenAI

Practical model picks

Cheap classification

Use the lowest blended price that still passes evals. Start with small Flash, Nano, Haiku, or OSS models.
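One way to sketch "lowest blended price that still passes evals" is shown below. The page does not define how it blends prices, so this assumes a token-weighted average of input and output rates. The candidate names, rates, and eval scores are all invented.

```python
from typing import NamedTuple, Optional, Sequence


class Candidate(NamedTuple):
    name: str            # hypothetical model label
    input_per_m: float   # USD per 1M input tokens
    output_per_m: float  # USD per 1M output tokens
    eval_score: float    # score on your own eval set, 0..1


def blended_price(c: Candidate, input_mix: float) -> float:
    """Token-weighted rate per 1M tokens (an assumed definition of 'blended')."""
    return input_mix * c.input_per_m + (1 - input_mix) * c.output_per_m


def cheapest_passing(
    candidates: Sequence[Candidate], min_score: float, input_mix: float
) -> Optional[Candidate]:
    """Return the cheapest candidate whose eval score meets the bar, or None."""
    passing = [c for c in candidates if c.eval_score >= min_score]
    return min(passing, key=lambda c: blended_price(c, input_mix), default=None)


# Invented numbers for illustration only.
candidates = [
    Candidate("small-a", 0.10, 0.40, 0.81),
    Candidate("small-b", 0.14, 0.28, 0.93),
    Candidate("large-c", 3.00, 15.00, 0.97),
]

pick = cheapest_passing(candidates, min_score=0.90, input_mix=0.63)
print(pick.name if pick else "no model passes the eval bar")
```

Filtering on the eval score first keeps a cheap model that fails the task from winning on price alone.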

Coding agents

Budget for tool calls and retries. A stronger model can be cheaper if it avoids failed loops.
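To make the retry point concrete, here is a minimal sketch that assumes each attempt succeeds independently with a fixed probability and is retried until it succeeds, so the expected number of attempts is 1 / success rate. The per-attempt costs and success rates are invented.

```python
def expected_cost_per_success(cost_per_attempt: float, success_rate: float) -> float:
    """Expected spend to get one successful task under independent retries."""
    if not 0 < success_rate <= 1:
        raise ValueError("success_rate must be in (0, 1]")
    return cost_per_attempt / success_rate


# Invented numbers: a pricier model that rarely loops vs a cheaper one that often fails.
strong = expected_cost_per_success(cost_per_attempt=0.05, success_rate=0.9)
cheap = expected_cost_per_success(cost_per_attempt=0.02, success_rate=0.3)

print(f"strong model: ${strong:.4f} per completed task")
print(f"cheap model:  ${cheap:.4f} per completed task")
```

Real agents rarely fail independently, so treat this as a lower bound on how much retries can cost and measure actual attempt counts from logs.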

Long documents

Context window matters more than sticker price once retrieval chunks get large.

Customer-facing chat

Compare monthly spend, not only per-token rates. Output length quietly dominates support bots.
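The sample estimate shown at the top of this page illustrates how output can dominate. The sketch below reuses those displayed figures and assumes that "input mix" means the input share of total tokens, which the page does not state explicitly.

```python
# Figures from the sample estimate displayed on this page.
input_cost = 0.0360      # USD per request
output_cost = 0.1260     # USD per request
input_token_mix = 0.63   # assumed to be input tokens / total tokens

output_token_share = 1 - input_token_mix
output_cost_share = output_cost / (input_cost + output_cost)

print(f"output share of tokens: {output_token_share:.0%}")
print(f"output share of cost:   {output_cost_share:.0%}")
```

Under that reading, output is a minority of the tokens but roughly 78% of the per-request cost, so trimming response length moves the bill more than trimming the prompt.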

Built With Care

“The fastest way to waste AI budget is to benchmark with tiny prompts and then deploy agents that carry twenty turns of history, tool schemas, and retrieved documents. Run your real prompt shape here before you choose a default model.”

API Cost Calculator & LLM Pricing Comparison

Use this API cost calculator to estimate LLM spend from real token counts, daily request volume, cached prompt discounts, and batch pricing. It is built for developers comparing OpenAI API pricing, Claude pricing, Gemini API costs, DeepSeek pricing, Groq inference, xAI Grok rates, Mistral models, and other language model APIs before shipping to production.

No signup needed
Runs offline
Client-side processing

How to Use API Cost Calculator & LLM Pricing Comparison

1. Pick the model you plan to use, or search the pricing table and click any model row.
2. Enter input tokens for prompts, chat history, retrieved context, files, and tool schemas.
3. Enter output tokens for the model response, including reasoning tokens when the provider bills them as output.
4. Add daily request volume to estimate monthly spend.
5. Use cached input and batch pricing toggles only when your provider and workflow actually support those discounts.

Key Features

Per-request and monthly LLM API cost estimates
Searchable model pricing table with provider filters
Comparison model selector with monthly savings or premium
Cached input and batch pricing toggles
Workload presets for chat, agents, RAG, batch jobs, and classification
Official pricing links for provider verification

Real Ways People Use This

Estimate LLM API costs before launch

Enter realistic input tokens, output tokens, and requests per day to see per-request and monthly API spend before production traffic arrives.

Compare OpenAI, Claude, Gemini, DeepSeek, and Groq

Filter the model pricing table by provider and compare current input/output token rates across major commercial and open-weight inference APIs.

Plan cached prompt and batch savings

Toggle cached input or batch pricing when your workload qualifies, then compare the monthly delta against your standard real-time estimate.
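A sketch of how the cached-input and batch toggles might change the estimate follows. The multipliers are placeholders, not any provider's real discounts, and the `estimate` function with its parameters is hypothetical. Only the base rates come from this page (the DeepSeek V4 Flash figures).

```python
def estimate(
    input_tokens: int,
    output_tokens: int,
    input_per_m: float,
    output_per_m: float,
    *,
    cached_fraction: float = 0.0,          # share of input tokens served from cache
    cached_input_multiplier: float = 1.0,  # placeholder: price factor for cached input
    batch_multiplier: float = 1.0,         # placeholder: price factor for batch jobs
) -> float:
    """Per-request cost in USD with optional cache and batch discounts."""
    cached = input_tokens * cached_fraction
    fresh = input_tokens - cached
    billable_input = fresh + cached * cached_input_multiplier
    input_cost = billable_input * input_per_m / 1_000_000
    output_cost = output_tokens * output_per_m / 1_000_000
    return (input_cost + output_cost) * batch_multiplier


# Base rates quoted on this page; token counts, volume, and multipliers are assumptions.
base = dict(input_tokens=8_000, output_tokens=500, input_per_m=0.14, output_per_m=0.28)
standard = estimate(**base)
with_cache = estimate(**base, cached_fraction=0.75, cached_input_multiplier=0.5)
with_batch = estimate(**base, batch_multiplier=0.5)

requests_per_month = 1_000 * 30
print(f"monthly delta with caching: ${(standard - with_cache) * requests_per_month:,.2f}")
print(f"monthly delta with batch:   ${(standard - with_batch) * requests_per_month:,.2f}")
```

Replace the multipliers with the values from the provider's official pricing page before trusting the delta.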

Choose a model for agents and RAG

Use the workload presets to model chat, agent, RAG, batch, and high-volume classification patterns with less hand-waving.

Important Notes

Provider prices change often. Use this as a planning tool, then verify the final number on the official provider pricing page.
Reasoning, tools, web search, image/video, and audio features may add charges that simple text token math does not fully capture.
Batch pricing usually means asynchronous processing. It is useful for offline jobs, not interactive user flows.

Quick Checklist

1. Use real prompt and output samples, not guesses.
2. Include chat history, retrieval chunks, and tool schemas in input tokens.
3. Estimate monthly volume before picking a default model.
4. After launch, compare a small sample of production logs against the provider's usage dashboard.
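For the first and last checklist items, a small script can turn logged usage into calculator inputs. The record format below is hypothetical; adapt the field names to whatever your provider's usage logs actually contain. The sample values are invented.

```python
from statistics import mean

# Hypothetical usage records; real logs will have different field names.
usage_sample = [
    {"input_tokens": 1200, "output_tokens": 700},
    {"input_tokens": 1500, "output_tokens": 650},
    {"input_tokens": 900, "output_tokens": 750},
]


def summarize(records: list[dict]) -> dict:
    """Average token counts and the input share of total tokens."""
    avg_in = mean(r["input_tokens"] for r in records)
    avg_out = mean(r["output_tokens"] for r in records)
    return {
        "avg_input_tokens": avg_in,
        "avg_output_tokens": avg_out,
        "input_mix": avg_in / (avg_in + avg_out),
    }


summary = summarize(usage_sample)
print(summary)
```

Averages hide long-tail requests, so it is worth also looking at the largest few records before settling on a default model.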

