Request
Response
"include_routing": true to see which model was chosen and why.
The four variants
| Model ID | Optimizes for | Best when… |
|---|---|---|
auto | Quality + speed balance | You want the best model without thinking about it |
auto-fast | Lowest latency | Real-time apps, chatbots, low-latency pipelines |
auto-cheap | Lowest cost | High-volume jobs, batch processing, cost-sensitive apps |
auto-quality | Highest quality | Critical decisions, best possible output |
How task detection works
NinjaChat analyzes your last 3 user messages to detect the task type:| Task type | Detected keywords | auto routes to |
|---|---|---|
code | function, debug, implement, algorithm, TypeScript, SQL… | claude-sonnet-4.6 |
math | equation, solve, calculate, integral, probability… | o3-mini |
creative | write, story, poem, imagine, fiction, lyrics… | gemini-3.1-pro |
analysis | analyze, compare, evaluate, research, summarize… | gpt-5 |
quick | Short prompts under 80 chars, “what is”, “define”… | gemini-3-flash |
general | Everything else | gpt-5 |
Full routing table
- auto (balanced)
- auto-fast
- auto-cheap
- auto-quality
| Task | Model |
|---|---|
| code | claude-sonnet-4.6 |
| math | o3-mini |
| creative | gemini-3.1-pro |
| analysis | gpt-5 |
| quick | gemini-3-flash |
| general | gpt-5 |
Billing
Auto variants are billed at the resolved model’s rate. Ifauto routes to o3-mini, you pay 0.015. The routing field always shows the cost-incurring model.
Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
include_routing | boolean | false | Include routing object in response. |
budget_cents | number | — | Override with a cost ceiling. See Budget Routing. |