Free LLM Monthly Cost Estimator — Three Numbers, Any Model

If you're sizing API spend for a product, three numbers matter: requests per month, average input prompt size in tokens, and average output response size in tokens. freeaicostcalculator.app takes those three numbers, runs them against any model in its curated catalog or the full OpenRouter catalog of 370+ models, and shows monthly and yearly spend per model. Results are sortable, copyable as Markdown, and exportable as CSV.

How the estimation works

The math is simple: per-request cost = (input tokens × input rate + output tokens × output rate) / 1,000,000, where both rates are prices per million tokens. Monthly cost = per-request cost × requests per month. The calculator applies this to every selected model and renders the results as a sorted horizontal bar chart, so you can immediately see the cheapest and most expensive options for your workload.
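As a rough illustration, the formula above can be written as a small Python function. The function name and the sample rates ($3 input and $15 output per million tokens) are hypothetical, chosen only to make the arithmetic visible; they are not taken from the calculator's catalog.

```python
def monthly_cost(requests_per_month, input_tokens, output_tokens,
                 input_rate, output_rate):
    """Estimate monthly spend in USD.

    Rates are assumed to be USD per 1,000,000 tokens.
    """
    per_request = (input_tokens * input_rate
                   + output_tokens * output_rate) / 1_000_000
    return per_request * requests_per_month


# Hypothetical model: $3 input / $15 output per million tokens,
# serving 50,000 requests of 800 input and 300 output tokens each.
monthly = monthly_cost(50_000, 800, 300, input_rate=3.0, output_rate=15.0)
yearly = monthly * 12
print(f"monthly: ${monthly:,.2f}  yearly: ${yearly:,.2f}")
```

With these assumed rates the workload comes to about $345 a month, and the output side accounts for roughly two thirds of it despite being the smaller token count.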

Getting realistic input numbers

Most cost estimates fail because the input numbers were guesses. Use freetokencounter.app on a representative prompt to count actual tokens. Then use freeprompttester.app to run the prompt against a few models and see what the typical output length looks like. Plug those measured numbers, not your guesses, into freeaicostcalculator.app and the monthly forecast should land within ~10% of actual.
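If you measure more than one prompt, the averages are what go into the calculator. The sketch below assumes you have already collected token counts for a handful of representative requests; the sample values are made up for illustration.

```python
from statistics import mean

# Hypothetical token counts measured from five representative requests.
input_counts = [720, 810, 790, 860, 820]
output_counts = [250, 340, 310, 280, 320]

# Round to whole tokens, since the calculator takes integer averages here.
avg_input = round(mean(input_counts))
avg_output = round(mean(output_counts))

print(f"average input tokens:  {avg_input}")
print(f"average output tokens: {avg_output}")
```

A larger and more varied sample gives a steadier average; five requests is only enough to show the idea.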

Common workload presets

Side project / prototype: 1,000 req/mo, 300 input tokens, 150 output tokens. Most models cost under $5/mo at this volume.
Indie SaaS: 50,000 req/mo, 800 input tokens, 300 output tokens. Cheapest models around $20-50/mo; frontier models $200-500/mo.
Production: 1,000,000 req/mo, 1,500 input tokens, 500 output tokens. Cheapest models $200-500/mo; frontier models $5,000+/mo.
RAG / long-context: 10,000 req/mo, 5,000 input tokens, 800 output tokens. Input-heavy — prompt caching matters most here.

The calculator has these as one-click preset buttons in the workload panel.
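To see why the RAG preset is described as input-heavy, it helps to turn each preset into monthly token volumes. The sketch below uses only the preset numbers listed above; the function and variable names are hypothetical, and no pricing is assumed.

```python
# Preset values copied from the list above:
# (requests per month, input tokens, output tokens).
PRESETS = {
    "Side project / prototype": (1_000, 300, 150),
    "Indie SaaS": (50_000, 800, 300),
    "Production": (1_000_000, 1_500, 500),
    "RAG / long-context": (10_000, 5_000, 800),
}


def monthly_volume(requests, input_tokens, output_tokens):
    """Return (input tokens/month, output tokens/month, input share of tokens)."""
    total_in = requests * input_tokens
    total_out = requests * output_tokens
    return total_in, total_out, total_in / (total_in + total_out)


for name, workload in PRESETS.items():
    total_in, total_out, share = monthly_volume(*workload)
    print(f"{name}: {total_in:,} in, {total_out:,} out, {share:.0%} input")
```

On these numbers the RAG preset is about 86% input tokens by volume, against 67% to 75% for the other three.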

Try freeaicostcalculator.app — Free, No Sign-Up

Workload-driven. 370+ models. Flat-plan break-even check. Pure arithmetic in your browser.

Open AI Cost Calculator →

Frequently Asked Questions

How accurate is the monthly cost estimate?

Accurate within ~10% if your input numbers (requests per month, average input tokens, average output tokens) reflect reality. Inputs are usually the source of error, not the formula.

Why does the estimate change so much when I tweak max output?

Output tokens are typically priced 3-5× higher than input tokens, so a change in average output length moves the monthly bill several times more than the same change in input length. That's why output length is the first lever to pull when costs run high.
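A quick way to see the effect is to add the same number of tokens to each side of a request and compare. The rates below are hypothetical ($1 input and $4 output per million tokens, a 4× ratio) and the workload is the Indie SaaS shape of 800 input and 300 output tokens.

```python
def per_request_cost(input_tokens, output_tokens, input_rate, output_rate):
    """Cost of one request in USD; rates are USD per 1,000,000 tokens."""
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000


# Hypothetical rates with a 4x output/input price ratio.
IN_RATE, OUT_RATE = 1.0, 4.0

base = per_request_cost(800, 300, IN_RATE, OUT_RATE)
more_output = per_request_cost(800, 400, IN_RATE, OUT_RATE)  # +100 output tokens
more_input = per_request_cost(900, 300, IN_RATE, OUT_RATE)   # +100 input tokens

print(f"+100 output tokens: {more_output / base - 1:+.1%}")
print(f"+100 input tokens:  {more_input / base - 1:+.1%}")
```

Under these assumptions, 100 extra output tokens raise the per-request cost by 20%, while 100 extra input tokens raise it by 5%.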

What if my workload varies day-to-day?

Multiply peak-day requests by 30 for an upper bound, or a typical-day average by 30 for a midpoint. The calculator is quick enough to run both.
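A minimal sketch of the two figures, assuming you have a week of daily request counts to hand; the counts below are invented for illustration and the 30-day month matches the rule of thumb above.

```python
def monthly_requests(daily_requests, days=30):
    """Scale a daily request count to a month of the given length."""
    return daily_requests * days


# Hypothetical daily request counts for one week, including one spike.
daily = [1_200, 1_500, 1_100, 4_000, 1_300, 900, 500]

typical_day = sum(daily) / len(daily)
upper_bound = monthly_requests(max(daily))  # peak day x 30
midpoint = monthly_requests(typical_day)    # average day x 30

print(f"upper bound: {upper_bound:,.0f} req/mo")
print(f"midpoint:    {midpoint:,.0f} req/mo")
```

Here the single spike pushes the upper bound to 120,000 requests a month against a midpoint of 45,000, which is the kind of gap worth knowing about before committing to a budget.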

Can I model multiple endpoints (e.g., chatbot + summarizer + classifier)?

Run the calculator once per endpoint with that endpoint's workload, copy each result as Markdown, and sum the totals. A Pro tier (saved named workloads + monthly forecasting) is on the roadmap.
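The manual sum can also be sketched in a few lines. The three endpoint workloads and the rates ($1 input and $4 output per million tokens) are hypothetical, and the sketch assumes every endpoint uses the same model; with different models you would pass different rates per endpoint.

```python
def monthly_cost(requests, input_tokens, output_tokens, input_rate, output_rate):
    """Monthly spend in USD; rates are USD per 1,000,000 tokens."""
    per_request = (input_tokens * input_rate
                   + output_tokens * output_rate) / 1_000_000
    return per_request * requests


# Hypothetical workloads: (requests per month, input tokens, output tokens).
ENDPOINTS = {
    "chatbot": (50_000, 800, 300),
    "summarizer": (10_000, 5_000, 800),
    "classifier": (100_000, 300, 10),
}
IN_RATE, OUT_RATE = 1.0, 4.0  # hypothetical rates, same model for all endpoints

costs = {name: monthly_cost(*workload, IN_RATE, OUT_RATE)
         for name, workload in ENDPOINTS.items()}
total = sum(costs.values())

for name, cost in costs.items():
    print(f"{name}: ${cost:,.2f}/mo")
print(f"total: ${total:,.2f}/mo")
```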

Does it include images, audio, or other modalities?

Not in v1. Text input/output tokens only. Multimodal pricing varies wildly by provider; model those separately.

What if I use a model not in the catalog?

The OpenRouter tab fetches the entire OpenRouter catalog (370+ models) with live pricing when the page loads. If your model is on OpenRouter, it's in the picker. Models from non-OpenRouter providers are added manually; request one via GitHub if a major one is missing.