Cost Guide

Cheapest AI API

The cheapest API depends on token mix, output length, caching, retries, latency target, and quality floor. Estimate the total cost of your own workload with a calculator instead of choosing on headline price alone.
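As a rough illustration of why headline price can mislead, here is a minimal per-request cost sketch. The `Pricing` class, the `cost_per_request` function, and every price below are hypothetical placeholders, not real vendor rates. The retry model is deliberately simple: a `retry_rate` of 0.1 means 10% of requests are sent one extra time.

```python
from dataclasses import dataclass


@dataclass
class Pricing:
    # Hypothetical prices in USD per million tokens (not real vendor rates).
    input_per_m: float
    output_per_m: float
    cached_input_per_m: float


def cost_per_request(p, input_tokens, output_tokens,
                     cache_hit_rate=0.0, retry_rate=0.0):
    """Estimate the USD cost of one request, including caching and retries."""
    cached = input_tokens * cache_hit_rate   # tokens billed at the cached rate
    fresh = input_tokens - cached            # tokens billed at the full input rate
    one_attempt = (
        fresh * p.input_per_m
        + cached * p.cached_input_per_m
        + output_tokens * p.output_per_m
    ) / 1_000_000
    # Simple retry model: a fraction of requests is paid for twice.
    return one_attempt * (1 + retry_rate)


# Two made-up models: A has cheaper input, B has cheaper output.
model_a = Pricing(input_per_m=0.5, output_per_m=4.0, cached_input_per_m=0.05)
model_b = Pricing(input_per_m=1.0, output_per_m=2.0, cached_input_per_m=0.10)

# Input-heavy workload (long prompt, short answer).
input_heavy = (10_000, 200)
# Output-heavy workload (short prompt, long answer).
output_heavy = (500, 3_000)

for name, model in (("A", model_a), ("B", model_b)):
    print(name,
          cost_per_request(model, *input_heavy),
          cost_per_request(model, *output_heavy))
```

With these made-up numbers, model A is cheaper for the input-heavy workload and model B is cheaper for the output-heavy one, so the ranking depends on the token mix, not on either price in isolation.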

For chat

Use fast, low-cost models for support, summarization, and routing, and keep a stronger model as a fallback for escalations.
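One way to picture the cheap-model-plus-fallback pattern is the sketch below. It assumes each model is a plain callable that returns an answer together with a confidence score between 0 and 1. That interface, the `answer_with_fallback` function, the 0.7 threshold, and both stub models are assumptions made for illustration. Real APIs generally do not return a confidence value directly, so in practice the escalation signal would have to come from somewhere else, such as a classifier or an explicit user request.

```python
from typing import Callable, Tuple

# Hypothetical interface: a model takes a prompt and returns
# (answer, confidence), with confidence between 0 and 1.
Model = Callable[[str], Tuple[str, float]]


def answer_with_fallback(prompt: str, cheap: Model, strong: Model,
                         min_confidence: float = 0.7) -> Tuple[str, str]:
    """Try the cheap model first; escalate when its confidence is too low."""
    answer, confidence = cheap(prompt)
    if confidence >= min_confidence:
        return answer, "cheap"
    # Escalation path: pay for the stronger model only when needed.
    answer, _ = strong(prompt)
    return answer, "strong"


# Stub models standing in for real API calls.
def cheap_stub(prompt: str) -> Tuple[str, float]:
    # Pretend the cheap model is unsure about refund disputes.
    confidence = 0.3 if "refund" in prompt.lower() else 0.9
    return f"cheap answer to: {prompt}", confidence


def strong_stub(prompt: str) -> Tuple[str, float]:
    return f"strong answer to: {prompt}", 0.95


print(answer_with_fallback("What are your opening hours?", cheap_stub, strong_stub))
print(answer_with_fallback("I want a refund for a double charge", cheap_stub, strong_stub))
```

The stronger model is billed only on the escalation path, so the blended cost depends on how often the cheap model falls below the threshold.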

For coding

Cheap models save money on simple edits, but higher-quality models can reduce retries and human correction time.
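The trade-off can be made concrete with a small expected-cost sketch. It assumes each attempt succeeds independently with a fixed probability, so the expected number of attempts is 1 divided by the success rate, and that every failed attempt costs a fixed amount of human review time. The function name `cost_per_accepted_edit` and all figures are hypothetical.

```python
def cost_per_accepted_edit(api_cost: float, success_rate: float,
                           review_cost_per_failure: float) -> float:
    """Expected USD cost to get one accepted edit.

    Assumes independent attempts with a fixed success rate, so the
    expected number of attempts is 1 / success_rate.
    """
    if not 0 < success_rate <= 1:
        raise ValueError("success_rate must be in (0, 1]")
    expected_attempts = 1 / success_rate
    expected_failures = expected_attempts - 1
    return (api_cost * expected_attempts
            + review_cost_per_failure * expected_failures)


# Made-up figures: the cheap model costs less per call but fails more often.
cheap = dict(api_cost=0.002, success_rate=0.5)
strong = dict(api_cost=0.03, success_rate=0.8)

# Case 1: failures are caught automatically (for example by tests),
# so a failed attempt costs no human time.
print(cost_per_accepted_edit(**cheap, review_cost_per_failure=0.0),
      cost_per_accepted_edit(**strong, review_cost_per_failure=0.0))

# Case 2: each failure costs about 1 USD of developer time to spot and redo.
print(cost_per_accepted_edit(**cheap, review_cost_per_failure=1.0),
      cost_per_accepted_edit(**strong, review_cost_per_failure=1.0))
```

Under these assumptions the cheap model wins when failures are free to detect, and the stronger model wins once each failure costs human time, because the review cost quickly outweighs the API price difference.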

For RAG

Cost is usually dominated by retrieval volume, input context size, and embedding refreshes, not by chat completion price alone.
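A monthly breakdown along these lines might look like the sketch below. The `rag_monthly_cost` function, its parameters, and every price and volume are assumptions chosen for illustration. For simplicity it counts only retrieved chunks as input tokens and ignores the user question and system prompt.

```python
def rag_monthly_cost(queries, chunks_per_query, tokens_per_chunk,
                     output_tokens, refreshed_tokens, *,
                     input_per_m, output_per_m, embed_per_m,
                     retrieval_per_1k_queries):
    """Split an estimated monthly RAG bill (USD) into its components."""
    # Only retrieved chunks are counted as input context here.
    context_tokens = queries * chunks_per_query * tokens_per_chunk
    return {
        "retrieval": queries / 1_000 * retrieval_per_1k_queries,
        "input_context": context_tokens / 1_000_000 * input_per_m,
        "embedding_refresh": refreshed_tokens / 1_000_000 * embed_per_m,
        "completion_output": queries * output_tokens / 1_000_000 * output_per_m,
    }


# Made-up workload and prices (USD per million tokens unless noted).
breakdown = rag_monthly_cost(
    queries=100_000,
    chunks_per_query=8,
    tokens_per_chunk=500,
    output_tokens=150,
    refreshed_tokens=200_000_000,   # tokens re-embedded this month
    input_per_m=1.0,
    output_per_m=4.0,
    embed_per_m=0.1,
    retrieval_per_1k_queries=0.5,   # vector search cost per 1,000 queries
)

total = sum(breakdown.values())
for part, cost in breakdown.items():
    print(f"{part:18s} {cost:8.2f}  {cost / total:6.1%}")
```

With these made-up inputs, retrieved context accounts for roughly 400 USD of a total near 530 USD, while generated output is closer to 60 USD, so trimming the number or size of retrieved chunks would save more than switching to a model with cheaper output tokens.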