Request

curl -X POST https://ninjachat.ai/api/v1/chat \
  -H "Authorization: Bearer nj_sk_YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "auto",
    "messages": [{"role": "user", "content": "Solve: x² + 5x + 6 = 0"}],
    "include_routing": true
  }'

Response

{
  "model": "o3-mini",
  "choices": [{
    "message": {
      "role": "assistant",
      "content": "Factoring: x² + 5x + 6 = (x + 2)(x + 3) = 0\nSolutions: x = -2 and x = -3"
    },
    "finish_reason": "stop"
  }],
  "routing": {
    "requested": "auto",
    "resolved": "o3-mini",
    "task_type": "math",
    "reasoning": "Detected task type: math"
  },
  "cost": {"this_request": "$0.006"}
}

Add "include_routing": true to see which model was chosen and why.

The four variants

| Model ID | Optimizes for | Best when… |
| --- | --- | --- |
| `auto` | Quality + speed balance | You want the best model without thinking about it |
| `auto-fast` | Lowest latency | Real-time apps, chatbots, low-latency pipelines |
| `auto-cheap` | Lowest cost | High-volume jobs, batch processing, cost-sensitive apps |
| `auto-quality` | Highest quality | Critical decisions, best possible output |

How task detection works

NinjaChat analyzes your last 3 user messages to detect the task type:

| Task type | Detected keywords | `auto` routes to |
| --- | --- | --- |
| `code` | function, debug, implement, algorithm, TypeScript, SQL… | `claude-sonnet-4.6` |
| `math` | equation, solve, calculate, integral, probability… | `o3-mini` |
| `creative` | write, story, poem, imagine, fiction, lyrics… | `gemini-3.1-pro` |
| `analysis` | analyze, compare, evaluate, research, summarize… | `gpt-5` |
| `quick` | Short prompts under 80 characters, “what is”, “define”… | `gemini-3-flash` |
| `general` | Everything else | `gpt-5` |
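
One way to picture the detection step is as a keyword match over the last three user messages. The sketch below is illustrative only and is not NinjaChat's implementation: the keyword sets contain just the samples listed in the table above, and the check order (keywords first, then the quick rule), the case-insensitive whole-word matching, and the names `detect_task_type`, `KEYWORDS`, and `ROUTES` are all assumptions.

```python
import re

# Sample keywords copied from the table above; the real lists are longer ("…").
KEYWORDS = {
    "code": {"function", "debug", "implement", "algorithm", "typescript", "sql"},
    "math": {"equation", "solve", "calculate", "integral", "probability"},
    "creative": {"write", "story", "poem", "imagine", "fiction", "lyrics"},
    "analysis": {"analyze", "compare", "evaluate", "research", "summarize"},
}
QUICK_PREFIXES = ("what is", "define")

# Task type -> model that `auto` resolves to, per the table above.
ROUTES = {
    "code": "claude-sonnet-4.6",
    "math": "o3-mini",
    "creative": "gemini-3.1-pro",
    "analysis": "gpt-5",
    "quick": "gemini-3-flash",
    "general": "gpt-5",
}


def detect_task_type(messages):
    """Guess a task type from the last three user messages (assumed logic)."""
    user_texts = [m["content"] for m in messages if m["role"] == "user"][-3:]
    if not user_texts:
        return "general"

    # Lowercase and split into ASCII words so "Solve:" matches "solve".
    words = set(re.findall(r"[a-z]+", " ".join(user_texts).lower()))
    for task, keywords in KEYWORDS.items():  # assumed priority: dict order
        if words & keywords:
            return task

    # Assumed fallback: the quick rule applies only when no keyword matched.
    latest = user_texts[-1].strip().lower()
    if len(latest) < 80 or latest.startswith(QUICK_PREFIXES):
        return "quick"
    return "general"


task = detect_task_type([{"role": "user", "content": "Solve: x² + 5x + 6 = 0"}])
print(task, "->", ROUTES[task])
```

Because only the sample keywords are included, prompts that depend on unlisted keywords will classify differently here than they do in the live router. Set "include_routing": true to see the actual decision.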

Full routing table

| Task | Model |
| --- | --- |
| `code` | `claude-sonnet-4.6` |
| `math` | `o3-mini` |
| `creative` | `gemini-3.1-pro` |
| `analysis` | `gpt-5` |
| `quick` | `gemini-3-flash` |
| `general` | `gpt-5` |

Billing

Auto variants are billed at the resolved model’s rate. If `auto` routes to `o3-mini`, you pay $0.006. If it routes to `claude-sonnet-4.6`, you pay $0.015. The `routing` field always shows the cost-incurring model.
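
Since the billed rate depends on the resolved model, it can help to tally spend per model from the responses you receive. The sketch below is a minimal example under two assumptions: that `cost.this_request` is always a dollar-prefixed decimal string like the "$0.006" in the example response, and that each request was sent with "include_routing": true so the `routing` object is present. The helper names `request_cost` and `spend_by_model` are hypothetical.

```python
from collections import defaultdict
from decimal import Decimal


def request_cost(response):
    """Parse cost.this_request (assumed format: "$0.006") into a Decimal."""
    return Decimal(response["cost"]["this_request"].lstrip("$"))


def spend_by_model(responses):
    """Sum request costs, grouped by the model each request resolved to."""
    totals = defaultdict(Decimal)  # Decimal() starts each total at 0
    for response in responses:
        totals[response["routing"]["resolved"]] += request_cost(response)
    return dict(totals)


# Trimmed-down responses using the prices quoted on this page.
responses = [
    {"routing": {"resolved": "o3-mini"}, "cost": {"this_request": "$0.006"}},
    {"routing": {"resolved": "claude-sonnet-4.6"}, "cost": {"this_request": "$0.015"}},
    {"routing": {"resolved": "o3-mini"}, "cost": {"this_request": "$0.006"}},
]
print(spend_by_model(responses))
```

`Decimal` is used instead of `float` so that small per-request amounts add up without rounding drift.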

Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| `include_routing` | boolean | `false` | Include `routing` object in response. |
| `budget_cents` | number | | Override with a cost ceiling. See Budget Routing. |
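
A small helper can keep these parameters and the four variant IDs in one place. The sketch below only builds the JSON body; the function name `build_chat_payload` is hypothetical, the `budget_cents` value shown is a placeholder, and the exact semantics of the ceiling are described under Budget Routing rather than assumed here.

```python
# The four auto variants listed on this page.
AUTO_VARIANTS = {"auto", "auto-fast", "auto-cheap", "auto-quality"}


def build_chat_payload(prompt, variant="auto", include_routing=False, budget_cents=None):
    """Build a request body for POST /api/v1/chat using an auto variant."""
    if variant not in AUTO_VARIANTS:
        raise ValueError(f"unknown auto variant: {variant!r}")

    payload = {
        "model": variant,
        "messages": [{"role": "user", "content": prompt}],
    }
    # Optional parameters are added only when set, so server defaults apply otherwise.
    if include_routing:
        payload["include_routing"] = True
    if budget_cents is not None:
        payload["budget_cents"] = budget_cents
    return payload


# Placeholder values for illustration only.
payload = build_chat_payload(
    "Summarize this changelog", variant="auto-cheap", include_routing=True, budget_cents=2
)
print(payload)
```

The returned dictionary can be passed as the `json=` argument of `requests.post`, as in the code example that follows.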

Code examples

import os

import requests

# Read the API key from the environment rather than hard-coding it.
r = requests.post(
    "https://ninjachat.ai/api/v1/chat",
    headers={"Authorization": f"Bearer {os.environ['NINJACHAT_API_KEY']}"},
    json={
        "model": "auto",
        "messages": [{"role": "user", "content": "Write a merge sort in Python"}],
        "include_routing": True,
    },
    timeout=30,  # avoid hanging indefinitely on a stalled connection
)
r.raise_for_status()  # surface HTTP errors before parsing the body
data = r.json()
print(data["choices"][0]["message"]["content"])
print("Routed to:", data["routing"]["resolved"])   # claude-sonnet-4.6
print("Task type:", data["routing"]["task_type"])  # code

Manual model selection

If you prefer explicit control over which model runs, here’s a quick reference:
models = {
    "classify": "deepseek-v3",       # $0.003 — simple tasks
    "chat":     "gpt-5",             # $0.006 — general use
    "code":     "claude-sonnet-4.6",  # $0.015 — when quality matters
    "fast":     "gemini-2.5-flash",   # $0.003 — lowest latency
}

See Models for the full list with pricing and recommendations.