API Reference

Base URL: https://api.tokaroo.com

Quickstart

Tokaroo is an OpenAI-compatible API gateway that routes every request to the cheapest model that fits your requirements - automatically.

Install the SDK:

```bash
npm install tokaroo
```

Make your first request:

```typescript
import { Tokaroo } from "tokaroo";

const client = new Tokaroo({ apiKey: "tok_..." });

const res = await client.chat.completions.create({
  model: "auto",   // Tokaroo picks the route
  messages: [{ role: "user", content: "Hello!" }],
});

console.log(res.choices[0].message.content);
```

Or swap one line in your existing OpenAI code:

```typescript
const openai = new OpenAI({
  baseURL: "https://api.tokaroo.com/v1",
  apiKey:  process.env.TOKAROO_KEY,
});
// Everything else stays the same
```

Chat completions

POST /v1/chat/completions

OpenAI-compatible. Supports streaming, tools, and vision.

```typescript
// Non-streaming
const res = await client.chat.completions.create({
  model: "auto",           // or any specific model
  messages: [{ role: "user", content: "Summarize this text: ..." }],
  max_tokens: 500,
  temperature: 0.7,
});

// Streaming
for await (const chunk of client.chat.completions.stream({
  model: "auto",
  messages: [{ role: "user", content: "Tell me a story" }],
})) {
  process.stdout.write(chunk.choices[0]?.delta?.content ?? "");
}
```
model values

| Value | Behaviour | Pricing profile |
| --- | --- | --- |
| `"auto"` | Tokaroo owns all decisions - cheapest capable model, full caching | value-first |
| `"fast"` | Speed priority - low-latency tier | speed-first |
| `"max"` | Maximum capability - frontier models only, no shortcuts | premium capability |
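For illustration, a caller might map task types onto these tiers before sending a request. `pickTier` and the task labels below are hypothetical, not part of the Tokaroo SDK:

```typescript
// Hypothetical helper: map a task type onto a Tokaroo `model` value.
type Tier = "auto" | "fast" | "max";

function pickTier(task: "autocomplete" | "chat" | "deep-analysis"): Tier {
  switch (task) {
    case "autocomplete":  return "fast"; // latency matters most
    case "deep-analysis": return "max";  // capability matters most
    default:              return "auto"; // let the router minimize cost
  }
}
```

The returned value is passed straight through as `model` in `chat.completions.create`.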
Response headers

| Header | Value |
| --- | --- |
| `x-tokaroo-cache` | `hit` or `miss` |
| `x-tokaroo-cost` | USD charged for this request |
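When calling the API with raw `fetch`, these headers can be read off the response object. `tokarooMeta` is a hypothetical convenience helper, not part of the SDK; `Headers` is the standard Fetch API class (built into Node 18+ and browsers):

```typescript
// Hypothetical helper: pull Tokaroo metadata out of a fetch Response's headers.
function tokarooMeta(headers: Headers): { cache: string | null; costUsd: number | null } {
  const cost = headers.get("x-tokaroo-cost");
  return {
    cache: headers.get("x-tokaroo-cache"),        // "hit" | "miss" | null
    costUsd: cost !== null ? Number(cost) : null, // parsed USD amount
  };
}
```

Usage: `const res = await fetch(url, opts); const { cache, costUsd } = tokarooMeta(res.headers);`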

Embeddings

POST /v1/embeddings

```typescript
const res = await client.embeddings.create({
  input: "The quick brown fox",
  // model: "text-embedding-3-small"  // optional
});

const vector = res.data[0].embedding; // number[]
```

Routes to OpenAI text-embedding-3-small (1536 dims) by default, falling back to Google text-embedding-004.
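A common next step with embeddings is comparing two vectors by cosine similarity. This plain TypeScript helper works on any pair of equal-length vectors, such as two `embedding` arrays returned above:

```typescript
// Cosine similarity between two embedding vectors: dot(a, b) / (|a| * |b|).
// Returns 1 for identical directions, 0 for orthogonal vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```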

Models

GET /v1/models

Returns all available models with reference retail pricing.

```typescript
const { data } = await client._fetch("GET", "/v1/models");
// [{ id: "gpt-4o", owned_by: "openai", pricing: { input_per_1m: 2.5, output_per_1m: 10 } }, ...]
```

The model pool is updated weekly by Tokaroo's market research job.
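Since `/v1/models` returns retail pricing per million tokens, you can estimate what a given request would cost on a specific model. `estimateCostUsd` and the `ModelInfo` interface are illustrative, derived from the response shape shown above:

```typescript
// Shape of one entry from GET /v1/models, as shown in the example response.
interface ModelInfo {
  id: string;
  owned_by: string;
  pricing: { input_per_1m: number; output_per_1m: number };
}

// Hypothetical helper: estimate USD cost of a request at one model's retail rates.
function estimateCostUsd(m: ModelInfo, inputTokens: number, outputTokens: number): number {
  return (inputTokens / 1_000_000) * m.pricing.input_per_1m
       + (outputTokens / 1_000_000) * m.pricing.output_per_1m;
}
```

Note this is the reference retail price only; what Tokaroo actually charges is tier-dependent (see Balance & billing).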

Balance & billing

GET /v1/balance

```typescript
const { balance_usd } = await client.balance.get();
```

Add credits

Redirect the user to a Stripe Checkout session:

```typescript
const res = await fetch("https://api.tokaroo.com/v1/billing/checkout", {
  method: "POST",
  headers: { Authorization: `Bearer ${key}`, "Content-Type": "application/json" },
  body: JSON.stringify({ amount_usd: 10 }),
});
const { url } = await res.json();
window.location.href = url; // Stripe-hosted checkout
```

Billing model

You pre-load credits and pay per request. No subscriptions, no seat fees. Tokaroo prices requests dynamically based on:

- the selected tier,
- the market baseline for that request,
- and how efficiently the router executed it.

The exact pricing function is internal.

Usage history

GET /v1/usage

```typescript
const { data } = await client.usage.list({ limit: 50 });
// [{
//   id, tier,              // "instant" | "standard" | "complex"
//   input_tokens,
//   output_tokens,
//   charged_usd,           // what you paid
//   latency_ms,
//   created_at
// }]
```

Tokaroo never exposes which model or provider handled your request - that's the black box. What you see is what matters: tokens consumed, cost, and latency.
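Those usage records are easy to aggregate client-side. A sketch that totals spend per tier; the `UsageRecord` interface mirrors the fields shown above, while `spendByTier` is a hypothetical helper:

```typescript
// Shape of one record from GET /v1/usage, as documented above.
interface UsageRecord {
  id: string;
  tier: "instant" | "standard" | "complex";
  input_tokens: number;
  output_tokens: number;
  charged_usd: number;
  latency_ms: number;
  created_at: string;
}

// Hypothetical helper: sum charged_usd per tier across a page of records.
function spendByTier(records: UsageRecord[]): Record<string, number> {
  const totals: Record<string, number> = {};
  for (const r of records) {
    totals[r.tier] = (totals[r.tier] ?? 0) + r.charged_usd;
  }
  return totals;
}
```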

API keys

POST /v1/keys

```typescript
const { key } = await client.keys.create("production");
// key is shown only once - save it
```

GET /v1/keys - list keys (no secrets returned)
DELETE /v1/keys/:id - revoke a key
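Since the secret is shown only once, avoid writing it to logs verbatim. A small hypothetical masking helper (the `tok_` prefix shape is assumed from the examples above):

```typescript
// Hypothetical helper: mask an API key for safe logging.
// Keeps the prefix and last 4 characters, hides the rest.
function maskKey(key: string): string {
  if (key.length <= 8) return "****";
  return `${key.slice(0, 8)}...${key.slice(-4)}`;
}
```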

Local models (Ollama / vLLM)

Connect your own inference server:

```bash
# .env
LOCAL_AI_BASE_URL=http://localhost:11434/v1   # Ollama
LOCAL_AI_API_KEY=local                        # any string
LOCAL_DEFAULT_MODEL=llama3.2
```

Tokaroo treats local models as near-zero marginal cost and routes to them first when model: "auto".

Supported local runtimes:

- Ollama (/v1 OpenAI-compatible endpoint)
- vLLM (--served-model-name flag)
- LocalAI
- Any OpenAI-compatible server

OpenClaw integration

[OpenClaw](https://openclaw.ai) is the fastest-growing open-source AI agent framework (200k+ GitHub stars). Add Tokaroo as a provider in under a minute.

1. Add to your openclaw.json config

```jsonc
// ~/.openclaw/openclaw.json
{
  models: {
    mode: "merge",
    providers: {
      tokaroo: {
        baseUrl: "https://api.tokaroo.com/v1",
        apiKey: "${TOKAROO_API_KEY}",
        api: "openai-completions",
        models: [
          { id: "auto",               name: "Tokaroo Auto (best price)", contextWindow: 200000, maxTokens: 8192 },
          { id: "gpt-4.1",            name: "GPT-4.1",            cost: { input: 2.00, output: 8.00 } },
          { id: "claude-sonnet-4-6",  name: "Claude Sonnet 4.6",  cost: { input: 3.00, output: 15.00 } },
          { id: "gemini-2.5-flash",   name: "Gemini 2.5 Flash",   cost: { input: 0.30, output: 2.50 } },
        ]
      }
    }
  },
  agents: {
    defaults: { model: "tokaroo/auto" }
  }
}
```

2. Set your key

```bash
export TOKAROO_API_KEY=tok_...
```

Every OpenClaw LLM call now routes through Tokaroo - cheapest capable model per request, semantic cache, automatic fallback, and cost analytics.

With local models (Ollama)
```jsonc
{
  models: {
    mode: "merge",
    providers: {
      tokaroo: {
        baseUrl: "https://api.tokaroo.com/v1",
        apiKey: "${TOKAROO_API_KEY}",
        api: "openai-completions",
        models: [{ id: "auto", name: "Tokaroo Auto" }]
      },
      local: {
        baseUrl: "http://localhost:11434/v1",
        apiKey: "local",
        api: "openai-completions",
        models: [{ id: "llama3.2", name: "Llama 3.2 (local)" }]
      }
    }
  },
  agents: { defaults: { model: "tokaroo/auto" } }
}
```