Docs

Base URL: https://api.tokaroo.com
Tokaroo provides model routing plus Knowledge Base, Sources, Mission Harness, Action Guardrails, telemetry, and billing primitives for AI apps and agents.

Quickstart

Tokaroo is an AI routing gateway that automatically reduces cost by up to 95% per request. Connect in under 60 seconds via the SDK or a small configuration change to your existing OpenAI client code.

Tokaroo is a hosted cloud service. One key, every model, full optimization engine. Bring Your Own Keys (BYOK) is optional.

Tokaroo applies exact-match and semantic caching internally to reduce repeat cost and latency. Shared cache is currently global across Tokaroo. Enterprise isolation controls are planned.

Install the SDK

npm install tokaroo

Make your first request

curl https://api.tokaroo.com/v1/chat/completions \
  -H "Authorization: Bearer tok_..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "auto",
    "messages": [{ "role": "user", "content": "Hello!" }]
  }'

Already using OpenAI? Swap the client configuration.

import OpenAI from "openai";

// Before: requests go directly to OpenAI
// const openai = new OpenAI({ apiKey: process.env.OPENAI_KEY });

// After: point baseURL at Tokaroo and use your Tokaroo key
const openai = new OpenAI({
  baseURL: "https://api.tokaroo.com/v1",
  apiKey:  process.env.TOKAROO_KEY,
});
// Everything else stays the same

Pricing and usage-based billing

Tokaroo is usage-based. Users add credits, send requests through Tokaroo, and spend is deducted as work is performed. There are no seats and no subscription is required for API usage.

| Area | What is metered | Where it appears |
| --- | --- | --- |
| Routing | Tokens, requests, media, embeddings, latency, cache/fallback behaviour | Usage dashboard and stats APIs |
| Knowledge Base | Context packs, retrieval, memory writes, events, entities, links, feedback | Knowledge Base dashboard and memory APIs |
| Sources | Ingestion, document chunking, retrieval, and citations | Sources dashboard and retrieval APIs |
| Docs Studio | Generated documents, reports, specs, guides, project chat, artifacts | Docs Studio dashboard |
| Mission Harness | Missions, steps, artifacts, traces, outcomes, action audits | Admin harness and mission APIs |
| Action Guardrails | Policy checks, approvals, decisions, and audit events | Action APIs and admin views |

Customers see a simple view of balance, usage, and savings. Admins see provider cost, customer charge, margin, internal usage, shadow-test spend, and reconciliation details.

Modes remain the simplest pricing control: fast optimizes for speed and cost, auto optimizes for value and can escalate when needed, and max uses the highest-capability route.

See the public pricing page for the user-facing explanation.

Knowledge Base

Tokaroo Knowledge Base is the context layer for agents and AI apps. It combines durable memory, source-grounded retrieval, entities, graph links, events, and feedback so every request can use the right context without dumping raw history into prompts.

The product name is Knowledge Base, and the core engine inside it is Memory Vault. Sources are the files, URLs, docs, specs, repos, emails, and records that become indexed documents and citations.

| Layer | What it does | Best for |
| --- | --- | --- |
| Memory Vault | Stores facts, preferences, decisions, lessons, entities, links, and events | Long-lived agents and workspace memory |
| Sources | Converts files, URLs, docs, specs, and records into indexed documents and chunks | Grounded answers with citations |
| Context packs | Ranks memory, sources, links, and recent events into a budgeted context payload | Agent replies, actions, and mission steps |
| Feedback | Records what helped, failed, or should be remembered or forgotten | Improving future context selection |

Knowledge Base runs on top of Tokaroo routing, so replies still go through auto, fast, or max. Context changes what the model sees; routing still decides how the model call executes.

Storage is scoped to your Tokaroo account and optional workspace. Apps such as Gwen can write memories, events, sources, mission metadata, and feedback back into Tokaroo so future work starts with better context.

Sources and documents

Sources are where knowledge came from. Documents are Tokaroo's indexed representation of a source. Chunks are the searchable pieces, and citations point back to the source/document/chunk that helped an answer.

| Term | Meaning | Examples |
| --- | --- | --- |
| Source | Original thing Tokaroo can use as context | PDF, URL, Google Doc, email thread, CRM record, repo, transcript |
| Document | Stored and indexed source representation | Security policy, pricing guide, product spec |
| Chunk | Retrieval unit embedded and ranked by relevance | A section, paragraph, or page excerpt |
| Citation | Reference to what context was used | Document title, source URL, chunk score |

Use Sources when you need grounded answers from written material. Use Memory Vault when the agent should remember durable facts, preferences, lessons, and relationships.

Docs Studio

Docs Studio is the document and artifact workbench. It can support Tokaroo's own technical docs, but it is also useful for generated work product: reports, specs, guides, proposals, plans, and machine-readable artifacts.

Docs Studio can import sources, build pages/artifacts, and chat against the generated corpus. Agents such as Gwen can use it as a document creation surface while Sources and Knowledge Base provide the underlying retrieval and memory.

auto · fast · max

Every request runs in one of three modes. Set it per request or configure a default in your dashboard.

| Mode | Behaviour | Best for |
| --- | --- | --- |
| "auto" | Best value: Tokaroo picks the right model per request | Most workloads |
| "fast" | Speed-optimized: lowest latency tier | Real-time features, interactive agents |
| "max" | Maximum capability: most powerful models, no shortcuts | Complex reasoning, hardest tasks |

These are Tokaroo's routing modes. The underlying provider and model are never exposed — that's the black box.

Tokaroo runs on Tokaroo — our internal systems use the same routing engine for their own AI calls. The code path you use is the code path we trust.

Bring Your Own Keys (BYOK)

BYOK is an advanced, optional feature. Tokaroo works normally without it.

Today BYOK is mainly useful for two things:

  • Google + Groq free-tier arbitrage — Tokaroo can use your connected keys for eligible BYOK routing, which means you pay Tokaroo only the optimization layer on that traffic instead of full managed rates.
  • OpenAI + Anthropic baseline personalization — these connections are used to keep savings comparisons honest for max-mode requests and enterprise benchmarking.

BYOK is not required for core Tokaroo usage, and most users never need to touch it.

Via the dashboard: go to tokaroo.com/keys → Provider connections → add your key.

Via the API: provider-key management requires a logged-in dashboard session.

curl -X POST https://api.tokaroo.com/v1/provider-keys \
  -H "Authorization: Bearer <dashboard-session-token>" \
  -H "Content-Type: application/json" \
  -d '{
    "provider": "anthropic",
    "label":    "my-anthropic-key",
    "secret":   "sk-ant-...",
    "workspace_id": "acme-support"
  }'

Secrets are stored encrypted and never returned by any API response.

OpenClaw integration

Recommended: install the Tokaroo provider plugin first. Keep the manual OpenAI-compatible config path as fallback if you prefer to wire it yourself.

1. Install the plugin

openclaw plugins install clawhub:tokaroo-openclaw-provider

2. Set your key

export TOKAROO_API_KEY=tok_...

3. Restart OpenClaw

openclaw restart

The plugin path is the native OpenClaw install flow. If you want the raw provider block instead, use the manual fallback below.

Manual fallback: add to ~/.openclaw/openclaw.json

{
  models: {
    mode: "merge",
    providers: {
      tokaroo: {
        baseUrl: "https://api.tokaroo.com/v1",
        apiKey:  "${TOKAROO_API_KEY}",
        api:     "openai-completions",
        models: [
          { id: "auto", name: "Tokaroo Auto — best value, automatic routing" },
          { id: "fast", name: "Tokaroo Fast — speed-optimized, low latency"  },
          { id: "max",  name: "Tokaroo Max  — highest capability"             },
        ]
      }
    }
  },
  agents: {
    defaults: { model: "tokaroo/auto" }
  }
}

Manual fallback: set your key

export TOKAROO_API_KEY=tok_...

Manual fallback: restart OpenClaw

openclaw restart

That's it. See the full OpenClaw guide for local model hybrid setups.

NemoClaw integration

Add Tokaroo as a provider in your NemoClaw config. Works the same way as any OpenAI-compatible provider.

1. Add to your nemo_config.json

{
  models: {
    mode: "merge",
    providers: {
      tokaroo: {
        baseUrl: "https://api.tokaroo.com/v1",
        apiKey:  "${TOKAROO_API_KEY}",
        api:     "openai-completions",
        models: [
          { id: "auto", name: "Tokaroo Auto — best value, automatic routing" },
          { id: "fast", name: "Tokaroo Fast — speed-optimized, low latency"  },
          { id: "max",  name: "Tokaroo Max  — highest capability"             },
        ]
      }
    }
  },
  agents: {
    defaults: { model: "tokaroo/auto" }
  }
}

2. Set your key

export TOKAROO_API_KEY=tok_...

See the full NemoClaw guide for enterprise and on-prem setups.

Local endpoints

Connect a self-hosted OpenAI-compatible endpoint such as Ollama or vLLM. Tokaroo can prioritize your endpoint and automatically fall back to Tokaroo-optimized managed routing when needed. Requests successfully served by your self-hosted endpoint are charged at 5% of baseline.

For hosted Tokaroo, the endpoint must be reachable from the Tokaroo server. A laptop-local http://localhost:11434 only works if you expose it through a reachable tunnel, VPN, or deployed host.
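
Because hosted Tokaroo must reach the endpoint from its own servers, a client-side check can catch loopback URLs before you try to register them. The sketch below is illustrative only: `isLoopbackUrl` is a hypothetical helper, not part of any Tokaroo SDK, and it only catches obvious loopback hosts, not private-network addresses.

```typescript
// Hypothetical helper: returns true when the URL points at the local machine.
// It does not detect private-network ranges such as 10.x.x.x or 192.168.x.x.
function isLoopbackUrl(rawUrl: string): boolean {
  // The URL parser lowercases hostnames and keeps brackets around IPv6 hosts.
  const host = new URL(rawUrl).hostname;
  return (
    host === "localhost" ||
    host === "[::1]" ||
    /^127(?:\.\d{1,3}){3}$/.test(host)
  );
}
```

A caller might warn the user to set up a tunnel, VPN, or deployed host whenever this returns true, rather than sending a registration that can only end in an unreachable status.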

Tokaroo can also use compatible self-hosted endpoint capabilities where available and fall back to Tokaroo-managed execution automatically. The exact routing and optimization behavior remains part of Tokaroo's managed layer.

Local-endpoint management is a dashboard/session-authenticated surface, not a tok_... runtime API-key surface.

Register an endpoint

curl -X POST https://api.tokaroo.com/v1/local-endpoints \
  -H "Authorization: Bearer <dashboard-session-token>" \
  -H "Content-Type: application/json" \
  -d '{
    "url":           "https://ollama.example.internal:11434",
    "label":         "home-ollama",
    "default_model": "llama3.2",
    "headers":       { "Authorization": "Bearer internal-token" },
    "workspace_id":  "acme-support"
  }'

Tokaroo performs a health check and model discovery on registration. If the endpoint is unreachable at registration time it is saved with status unreachable and requests fall back to managed until it responds.

Manage endpoints in the dashboard under Local endpoints, or via GET / PATCH / DELETE /v1/local-endpoints, POST /v1/local-endpoints/:id/recheck, and GET /v1/local-endpoints/:id/models. Listing accepts an optional workspace_id filter.

Supported runtimes: Ollama, vLLM, LocalAI, any OpenAI-compatible server.

API Reference

Authentication

Tokaroo uses two credential types.

Runtime API key

Authorization: Bearer tok_...

Dashboard session

Authorization: Bearer <dashboard-session-token>

| Item | Detail |
| --- | --- |
| tok_... API keys | Use for runtime requests like chat, images, audio, video, embeddings, balance, model listing, and usage reads within that key's scope. |
| Dashboard session | Required for billing management, API-key management, workspace management, provider-key management, and local-endpoint management. |
| Reporting scope | Session auth can inspect account-wide usage or filter by api_key_id / workspace_id. API keys are implicitly scoped to their own key and workspace. |
| Warning | Never share either credential type. Treat them like passwords. |
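
The split between the two credential types can be encoded in a small lookup so a client never sends the wrong token. This is a rough sketch under stated assumptions: `credentialFor` and the prefix list are hypothetical, derived only from the management areas named above, and they ignore method-specific rules (for example, creating or updating URL profiles also requires a session).

```typescript
type Credential = "api_key" | "dashboard_session";

// Route prefixes for the management surfaces listed as dashboard-session only.
const SESSION_PREFIXES: string[] = [
  "/v1/billing",
  "/v1/keys",
  "/v1/workspaces",
  "/v1/provider-keys",
  "/v1/local-endpoints",
];

// Hypothetical helper: picks the credential type for a request path
// (path only, without a query string).
function credentialFor(path: string): Credential {
  const needsSession = SESSION_PREFIXES.some(
    (prefix) => path === prefix || path.startsWith(prefix + "/"),
  );
  return needsSession ? "dashboard_session" : "api_key";
}
```

Note that `/v1/balance` resolves to the runtime API key here, which matches the balance example later in this reference.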

Chat completions

POST /v1/chat/completions

OpenAI-compatible. Supports streaming, tool use, and vision.

Request

interface ChatRequest {
  model:        "auto" | "fast" | "max";
  messages:     { role: "system" | "user" | "assistant"; content: string }[];
  max_tokens?:  number;
  temperature?: number;   // 0–2, default 1
  stream?:      boolean;  // default false
  tools?:       Tool[];
}

Response

interface ChatResponse {
  id:      string;
  object:  "chat.completion";
  created: number;
  model:   "auto" | "fast" | "max";  // echoes the requested routing mode
  choices: {
    index:         number;
    message:       { role: "assistant"; content: string };
    finish_reason: "stop" | "length" | "tool_calls";
  }[];
  usage: {
    prompt_tokens:     number;
    completion_tokens: number;
    total_tokens:      number;
  };
}

Response headers

| Header | Value |
| --- | --- |
| x-tokaroo-cache | hit or miss |
| x-tokaroo-cost | USD charged for this request |
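
These two headers can be read from any fetch response. The sketch below is a minimal illustration, assuming the cost header carries a plain decimal string; `readTokarooMeta` and its return shape are hypothetical names, not part of the Tokaroo SDK.

```typescript
// Minimal structural type so the helper accepts fetch's Headers object
// or any test double that exposes get().
interface HeaderReader {
  get(name: string): string | null;
}

interface RequestMeta {
  cacheHit: boolean;
  costUsd: number | null; // null when the header is missing or not numeric
}

// Assumption: x-tokaroo-cost is a plain decimal string such as "0.0012".
function readTokarooMeta(headers: HeaderReader): RequestMeta {
  const cache = headers.get("x-tokaroo-cache");
  const rawCost = headers.get("x-tokaroo-cost");
  const cost =
    rawCost === null || rawCost.trim() === "" ? NaN : Number(rawCost);
  return {
    cacheHit: cache === "hit",
    costUsd: Number.isFinite(cost) ? cost : null,
  };
}
```

With the standard fetch API, usage would look like `readTokarooMeta(response.headers)` after a call to /v1/chat/completions.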

Examples

curl https://api.tokaroo.com/v1/chat/completions \
  -H "Authorization: Bearer tok_..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "auto",
    "messages": [{ "role": "user", "content": "Hello!" }]
  }'

For streaming, set stream: true — the response uses server-sent events (SSE) in the OpenAI format.
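
As a rough illustration of consuming that stream, the sketch below pulls text deltas out of a buffer of SSE lines. It assumes the OpenAI streaming conventions (one `data: <json>` line per event, `choices[0].delta.content` for text, and a final `data: [DONE]`); `extractDeltas` is a hypothetical helper name.

```typescript
// Extracts text deltas from a buffer of complete OpenAI-style SSE lines.
// The caller is responsible for buffering partial lines between network reads.
function extractDeltas(sseText: string): string[] {
  const deltas: string[] = [];
  for (const line of sseText.split("\n")) {
    const trimmed = line.trim();
    if (!trimmed.startsWith("data:")) continue; // skip blank lines and comments
    const payload = trimmed.slice("data:".length).trim();
    if (payload === "[DONE]") break; // end-of-stream sentinel
    const chunk = JSON.parse(payload) as {
      choices?: { delta?: { content?: string } }[];
    };
    const text = chunk.choices?.[0]?.delta?.content;
    if (typeof text === "string") deltas.push(text);
  }
  return deltas;
}
```

Joining the returned array reconstructs the assistant message as it arrives. Chunks without a `content` delta (for example role-only or tool-call chunks) are skipped by this sketch.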

Try it

Interactive playground

Requests go directly from your browser to api.tokaroo.com. Your key is only used in your browser. It is sent only to api.tokaroo.com to make the API call, never to our dashboard servers.

Image generation

POST /v1/images/generations

Generate images with AI. OpenAI-compatible — same request format as DALL-E.

Request

interface ImageRequest {
  prompt:           string;
  model?:           "auto" | "fast" | "max";
  n?:               number;   // number of images, default 1
  size?:            string;   // "1024x1024" | "1024x1792" | "1792x1024"
  quality?:         string;   // "standard" | "hd"
  response_format?: string;   // "url" | "b64_json"
  style?:           string;   // "vivid" | "natural"
}

Response

{
  created: number;
  data: {
    url?:            string;
    b64_json?:       string;
    revised_prompt?: string;
  }[];
}

Examples

curl https://api.tokaroo.com/v1/images/generations \
  -H "Authorization: Bearer tok_..." \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "A sunset over mountains, oil painting style",
    "model": "auto",
    "size": "1024x1024",
    "quality": "hd"
  }'

Text-to-speech

POST /v1/audio/speech

Convert text to spoken audio. Returns raw audio binary — not JSON. Set the response_format to control the audio codec.

Request

interface SpeechRequest {
  input:            string;   // text to speak
  model?:           "auto" | "fast" | "max";
  voice?:           string;   // "alloy" | "echo" | "fable" | "onyx" | "nova" | "shimmer"
  response_format?: string;   // "mp3" | "opus" | "aac" | "flac" | "wav" | "pcm"
  speed?:           number;   // 0.25–4.0, default 1
}

Response

Raw audio bytes with Content-Type header (e.g. audio/mpeg). Save directly to a file or stream to the client.

Examples

curl https://api.tokaroo.com/v1/audio/speech \
  -H "Authorization: Bearer tok_..." \
  -H "Content-Type: application/json" \
  -d '{
    "input": "Hello, welcome to Tokaroo!",
    "model": "auto",
    "voice": "nova"
  }' \
  --output speech.mp3

Speech-to-text

POST /v1/audio/transcriptions

Transcribe audio to text. Accepts multipart/form-data with an audio file (max 25 MB).

Request

// multipart/form-data fields:
file:             File;     // audio file (mp3, wav, m4a, webm, etc.)
model?:           string;   // "auto" | "fast" | "max"
language?:        string;   // ISO 639-1 code, e.g. "en"
response_format?: string;   // "json" | "text" | "srt" | "vtt"
temperature?:     number;   // 0–1

Response

{
  text: string;
}

Examples

curl https://api.tokaroo.com/v1/audio/transcriptions \
  -H "Authorization: Bearer tok_..." \
  -F file=@recording.mp3 \
  -F model=auto

Video generation

POST /v1/videos/generations

Generate videos from text prompts. Video generation is asynchronous: the POST returns a job ID, and you then poll with GET until the job completes or fails.

Submit request

interface VideoRequest {
  prompt:        string;
  model?:        "auto" | "fast" | "max";
  duration?:     number;   // seconds of video, default 5
  size?:         string;
  aspect_ratio?: string;
  fps?:          number;
}

Response (both POST and GET)

{
  id:      string;
  status:  "processing" | "completed" | "failed";
  url?:    string;   // present when status is "completed"
}

Poll for status

GET /v1/videos/generations/:id

Returns the same response shape. Poll until status is "completed" or "failed".
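
The polling step can be sketched as a small loop. This is an illustrative sketch, not an official client: `waitForVideo` is a hypothetical helper, the job fetcher is injected so the loop can be exercised without network access, and the interval and attempt cap are assumed values rather than documented limits.

```typescript
interface VideoJob {
  id: string;
  status: "processing" | "completed" | "failed";
  url?: string; // present when status is "completed"
}

// getJob is injected; in an app it would wrap GET /v1/videos/generations/:id.
async function waitForVideo(
  id: string,
  getJob: (id: string) => Promise<VideoJob>,
  intervalMs = 5000, // assumed polling interval
  maxAttempts = 60, // assumed cap so the loop cannot run forever
): Promise<VideoJob> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const job = await getJob(id);
    if (job.status !== "processing") return job; // completed or failed
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error(`Video job ${id} still processing after ${maxAttempts} polls`);
}
```

The caller should still check whether the returned status is "completed" or "failed" before reading `url`.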

Examples

curl https://api.tokaroo.com/v1/videos/generations \
  -H "Authorization: Bearer tok_..." \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "A timelapse of a flower blooming",
    "model": "auto",
    "duration": 5
  }'

Embeddings

POST /v1/embeddings

Request

interface EmbeddingsRequest {
  input:  string | string[];
  model?: string;   // optional — Tokaroo picks best available
}

Response

Standard OpenAI embeddings shape. Default model: text-embedding-3-small (1536 dims). Tokaroo may use compatible self-hosted endpoint capabilities when available and otherwise continues through Tokaroo-managed execution automatically.

Examples

curl https://api.tokaroo.com/v1/embeddings \
  -H "Authorization: Bearer tok_..." \
  -H "Content-Type: application/json" \
  -d '{
    "input": "The quick brown fox"
  }'

URL intelligence

Tokaroo URL intelligence is a deterministic inspection pipeline for redirects, DNS, TLS, RDAP age, parked-domain detection, and brand impersonation. Use it before browsing links, onboarding customer domains, or letting an agent visit external URLs.

POST /v1/url/analyse

Authenticated. Analyses one URL and returns signals, score, decision, and derived observations.

// Request
{
  url: string;
  profile?: string;
  weights?: Record<string, number>;
}

// Response
{
  id: string;
  url: string;
  normalized_url: string;
  domain: string;
  final_url: string | null;
  score: number;
  decision: "allow" | "review" | "warn" | "block";
  signals: { id: string; weight: number; value: string | number | boolean; detail?: string }[];
  observations: {
    final_status_code: number | null;
    redirect_hops: number;
    tls_valid: boolean | null;
    mx_records: number | null;
    ns_records: number | null;
    domain_age_days: number | null;
    expires_in_days: number | null;
    parked: boolean;
    brand_match: object | null;
  };
  cache: { hit: boolean; status: "miss" | "signal_cache"; age_ms: number | null };
}

curl https://api.tokaroo.com/v1/url/analyse \
  -H "Authorization: Bearer tok_..." \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://paypa1-login.example/security-check",
    "profile": "affiliate-review"
  }'

POST /v1/url/batch

Authenticated. Analyses up to 100 URLs in one request, persists a batch record, and can optionally deliver a completion webhook.
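
When a list is longer than the 100-URL limit, it has to be split across several batch requests. The helper below is a minimal sketch of that split; `chunkUrls` and `MAX_URLS_PER_BATCH` are hypothetical names introduced for illustration.

```typescript
const MAX_URLS_PER_BATCH = 100; // limit stated for POST /v1/url/batch

// Splits a URL list into request-sized groups, preserving order.
function chunkUrls(urls: string[], size = MAX_URLS_PER_BATCH): string[][] {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError("size must be a positive integer");
  }
  const batches: string[][] = [];
  for (let i = 0; i < urls.length; i += size) {
    batches.push(urls.slice(i, i + size));
  }
  return batches;
}
```

Each inner array can then be sent as the `urls` field of its own batch request.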

// Request
{
  urls: string[];
  label?: string;
  profile?: string;
  weights?: Record<string, number>;
  webhook_url?: string;
  webhook_secret?: string;
}

// Response
{
  id: string;
  object: "url.batch";
  status: string;
  total_urls: number;
  completed_urls: number;
  failed_urls: number;
  summary: {
    average_score: number;
    cache_hit_rate: number;
    decisions: { allow: number; review: number; warn: number; block: number };
  };
  errors: { input_index: number; url: string; error: string }[];
  results: UrlAnalysisResult[];
  webhook: {
    url: string | null;
    delivered_at: string | null;
    last_error: string | null;
  };
}

curl https://api.tokaroo.com/v1/url/batch \
  -H "Authorization: Bearer tok_..." \
  -H "Content-Type: application/json" \
  -d '{
    "label": "affiliate-import-2026-04-13",
    "urls": [
      "https://example.com",
      "https://paypa1-login.example/security-check"
    ],
    "webhook_url": "https://your-app.example/webhooks/tokaroo-url"
  }'

GET /v1/url/batches • GET /v1/url/batches/:id

Authenticated. List recent batches or fetch one persisted batch by id. Useful for polling, dashboards, and webhook retry diagnostics.

GET /v1/url/history • GET /v1/url/stats

Authenticated. Read recent URL checks and aggregate stats including decision breakdown, top signals, top domains, and cache hit rate.

GET /v1/url/profiles • POST /v1/url/profiles • PATCH /v1/url/profiles/:id

Session auth required for create and update. Profiles let you tune signal weights for use cases like affiliate review, inbound lead screening, or agent browsing.
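
One way to use the `decision` field from an analysis result is to gate an agent's navigation on it. The mapping below is a hypothetical policy for illustration, not Tokaroo's recommendation: only "allow" proceeds automatically, and `actionForDecision` and `AgentAction` are assumed names.

```typescript
type UrlDecision = "allow" | "review" | "warn" | "block";

// Hypothetical outcomes for an agent that wants to open a link.
type AgentAction = "visit" | "ask_human" | "refuse";

// Example policy: "review" and "warn" are escalated to a person,
// "block" is refused outright. Adjust to your own risk tolerance.
function actionForDecision(decision: UrlDecision): AgentAction {
  switch (decision) {
    case "allow":
      return "visit";
    case "review":
    case "warn":
      return "ask_human";
    case "block":
      return "refuse";
  }
}
```

Stricter deployments might treat "warn" as a refusal, and a custom profile can shift which URLs land in each decision in the first place.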

Models

Tokaroo currently tracks 209 models from Anthropic, Google, Groq, and OpenAI. The live table below comes from the public pricing catalog and shows the provider/model universe Tokaroo tracks for routing, comparisons, and research.

Runtime model IDs

GET /v1/models

Authenticated. Returns the Tokaroo model IDs clients can request directly. Tokaroo also accepts many native provider model names as compatibility aliases, but the managed product surface remains these three tiers.

{
  object: "list";
  data: [
    { id: "auto", object: "model", owned_by: "tokaroo", created: 0 },
    { id: "fast", object: "model", owned_by: "tokaroo", created: 0 },
    { id: "max",  object: "model", owned_by: "tokaroo", created: 0 }
  ];
}

curl https://api.tokaroo.com/v1/models \
  -H "Authorization: Bearer tok_..."

Reference pricing catalog

GET /v1/pricing

Public. Returns the upstream provider/model catalog Tokaroo tracks for pricing, capabilities, and context windows. This is Tokaroo's routing universe and reference data, not your actual billed price.

{
  object: "list";
  data: {
    id: string;
    name: string;
    provider: string;
    input_per_1m: number;
    output_per_1m: number;
    capabilities: string[];
    context_window: number;
  }[];
  researched_at: string | null;
}

curl https://api.tokaroo.com/v1/pricing

Your actual billed price is dynamic per request. See x-tokaroo-cost, usage history, and balance deductions for what Tokaroo actually charged.

Balance & billing

GET /v1/balance

interface BalanceResponse {
  balance_usd: number;
  currency: "usd";
}

curl https://api.tokaroo.com/v1/balance \
  -H "Authorization: Bearer tok_..."

POST /v1/billing/checkout

Session auth required. Creates a Stripe Checkout session to pre-load credits. The first activation flow is explicit: accept the activation checkbox in Payments, save a card, enable auto pay for the chosen amount, then complete the payment.

// Request
{ amount_usd: number }   // service minimum applies

// Response
{ url: string; session_id: string }

curl https://api.tokaroo.com/v1/billing/checkout \
  -H "Authorization: Bearer <dashboard-session-token>" \
  -H "Content-Type: application/json" \
  -d '{ "amount_usd": 25 }'

POST /v1/billing/add

Session auth required. Charges the saved card directly and adds prepaid balance immediately. If this is the first activation payment and all activation requirements are satisfied, Tokaroo also returns the first API key in the same response.

// Request
{
  amount_usd?: number;
  activate_key_name?: string;
}

// Response
{
  ok: true;
  balance_usd: number;
  activated_key: {
    key: string;
    name: string;
    prefix: string;
  } | null;
}

GET /v1/billing/activation

Session auth required. Returns the current first-key activation checklist state.

{
  key_count: number;
  has_api_key: boolean;
  requires_first_key_activation: boolean;
  balance_usd: number;
  funded: boolean;
  has_saved_card: boolean;
  auto_pay_enabled: boolean;
  auto_pay_amount_usd: number | null;
  activation_terms_accepted: boolean;
  activation_terms_accepted_at: string | null;
  activation_terms_version: string;
  missing_requirements: ("payment" | "saved_card" | "auto_pay" | "terms_acceptance")[];
  ready_for_activation_payment: boolean;
  ready_for_first_key: boolean;
  payments_url: string;
  terms_url: string;
}

POST /v1/billing/activation/accept

Session auth required. Records acceptance of the activation terms for the current account.

// Request
{ accepted: true }

// Response
{
  ok: true;
  activation: ActivationStatus;
}

GET / PUT /v1/billing/autopay

Session auth required. Reads or updates auto pay settings. First-key activation requires auto pay to be enabled. After your first key is active, you can disable auto pay and keep using remaining balance until it is depleted.

// GET response
{
  enabled: boolean;
  amount_usd: number;
}

// PUT request
{
  enabled?: boolean;
  amount_usd?: number;
}

GET / DELETE /v1/billing/card

Session auth required. Reads or removes the saved card used for auto pay and direct top-ups.

// GET response
{
  card: {
    brand: string;
    last4: string;
    exp_month: number;
    exp_year: number;
  } | null;
}

// DELETE response
{ ok: true }

Tokaroo uses a prepaid balance. Managed requests deduct from that balance based on the work actually executed. No subscription. No seat fee. Service minimums may apply at checkout.

First API key activation requires four things: a successful payment, a saved card, auto pay enabled, and accepted terms. Tokaroo records those requirements through the dashboard billing flow and blocks first-key creation until they are all complete. The managed runtime surface remains auto, fast, and max; the larger pricing catalog is Tokaroo's routing universe, not the primary managed selection surface.
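
A dashboard or onboarding script can turn the `missing_requirements` array from GET /v1/billing/activation into a checklist for the user. The sketch below assumes only the four requirement values listed in that response; `nextSteps` and the wording of each step are hypothetical.

```typescript
type Requirement = "payment" | "saved_card" | "auto_pay" | "terms_acceptance";

// Human-readable text for each outstanding requirement (wording is illustrative).
const NEXT_STEP: Record<Requirement, string> = {
  payment: "Complete a successful payment",
  saved_card: "Save a card",
  auto_pay: "Enable auto pay",
  terms_acceptance: "Accept the activation terms",
};

// Maps the server-reported missing requirements to checklist text,
// keeping the order the server returned.
function nextSteps(missing: Requirement[]): string[] {
  return missing.map((requirement) => NEXT_STEP[requirement]);
}
```

An empty result means nothing is outstanding on the client side; the server's `ready_for_first_key` flag remains the authority on whether key creation will succeed.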

BYOK is optional and advanced. When Tokaroo uses your own provider connection, the underlying provider may bill you directly while Tokaroo charges only the optimization layer.

API keys

API-key management requires a dashboard session, not a tok_... runtime API key. Creating the first key requires the full activation checklist: successful payment, saved card, auto pay enabled, and accepted terms.

POST /v1/keys

// Request
{
  name?: string;
  metadata?: Record<string, unknown>;
  workspace_id?: string;
  workspace_name?: string;
}

// Response
{
  id: string;
  key: string;   // "tok_..." — shown once only, save immediately
  name: string;
  prefix: string;
  metadata: Record<string, unknown>;
  workspace_id: string | null;
  workspace_name: string | null;
}

curl -X POST https://api.tokaroo.com/v1/keys \
  -H "Authorization: Bearer <dashboard-session-token>" \
  -H "Content-Type: application/json" \
  -d '{ "name": "production", "workspace_id": "acme-support" }'

GET /v1/keys

{
  object: "list";
  data: {
    id: string;
    name: string;
    metadata: Record<string, unknown>;
    workspace_id: string | null;
    workspace_name: string | null;
    keyPrefix: string;
    lastUsedAt: string | null;
    createdAt: string;
  }[];
}
// Secrets are never returned

curl https://api.tokaroo.com/v1/keys \
  -H "Authorization: Bearer <dashboard-session-token>"

DELETE /v1/keys/:id

Returns JSON confirmation. The key is immediately revoked.

{ deleted: true; id: string }

# Replace <key-id> with the id returned by GET /v1/keys
curl -X DELETE "https://api.tokaroo.com/v1/keys/<key-id>" \
  -H "Authorization: Bearer <dashboard-session-token>"

Workspaces

Workspaces let one Tokaroo account contain named sub-areas such as customers, teams, or environments. API keys, provider keys, local endpoints, usage, and stats can all be bound to a workspace. These routes require a dashboard session.

POST /v1/workspaces

// Request
{
  workspace_id: string;
  name: string;
  metadata?: Record<string, unknown>;
  monthly_budget_usd?: number | null;
}

// Response
{
  id: string;
  workspace_id: string;
  name: string;
  status: string;
  metadata: Record<string, unknown>;
  monthly_budget_usd: number | null;
}

curl -X POST https://api.tokaroo.com/v1/workspaces \
  -H "Authorization: Bearer <dashboard-session-token>" \
  -H "Content-Type: application/json" \
  -d '{
    "workspace_id": "acme-support",
    "name": "Acme Support",
    "monthly_budget_usd": 250
  }'

GET /v1/workspaces

{
  object: "list";
  data: {
    id: string;
    workspace_id: string;
    name: string;
    status: string;
    metadata: Record<string, unknown>;
    monthly_budget_usd: number | null;
    created_at: string;
    updated_at: string;
  }[];
}

PATCH /v1/workspaces/:workspace_id

// Request
{
  name?: string;
  status?: string;
  metadata?: Record<string, unknown>;
  monthly_budget_usd?: number | null;
}

// Response
{
  id: string;
  workspace_id: string;
  name: string;
  status: string;
  metadata: Record<string, unknown>;
  monthly_budget_usd: number | null;
  updated_at: string;
}

Provider keys

Provider keys are optional and advanced. They are primarily used for BYOK routing, free-tier arbitrage, and max-mode baseline personalization. These routes require a dashboard session.

POST /v1/provider-keys

// Request
{
  provider: "openai" | "anthropic" | "google" | "groq";
  label?: string;
  secret: string;   // stored encrypted, never returned
  workspace_id?: string;
}

// Response
{
  created: true;
  provider: string;
  label: string;
  workspace_id: string | null;
}

curl -X POST https://api.tokaroo.com/v1/provider-keys \
  -H "Authorization: Bearer <dashboard-session-token>" \
  -H "Content-Type: application/json" \
  -d '{
    "provider": "anthropic",
    "label":    "my-anthropic-key",
    "secret":   "sk-ant-...",
    "workspace_id": "acme-support"
  }'

GET /v1/provider-keys

Optional query param: workspace_id.

{
  object: "list";
  data: {
    id: string;
    provider: string;
    label: string;
    workspace_id: string | null;
    workspace_name: string | null;
    status: string;
    last_success_at: string | null;
    last_failure_at: string | null;
    last_failure_code: string | null;
    estimated_reset_at: string | null;
  }[];
}
DELETE /v1/provider-keys/:id

Returns JSON confirmation.

{ deleted: true; id: string }

Manage provider keys in the dashboard. Secrets are stored encrypted and never returned by any API response.

Usage history

GET /v1/usage

Query params: limit (default 100, max 1000), before (cursor — created_at ISO string), api_key_id, workspace_id, request_type.

interface UsageRow {
  id: string;
  api_key_id: string | null;
  workspace_id: string | null;
  workspace_name: string | null;
  metadata: Record<string, unknown>;
  tier: "auto" | "fast" | "max";
  model: "auto" | "fast" | "max";
  provider: string;
  routing_method: string;
  execution_source: string;
  billing_mode: string;
  request_type: string;
  input_tokens: number;
  output_tokens: number;
  actual_cost_usd: number;
  charged_usd: number;
  baseline_usd: number;
  latency_ms: number | null;
  created_at: string;
}

// Response: { data: UsageRow[], has_more: boolean }

curl "https://api.tokaroo.com/v1/usage?limit=20" \
  -H "Authorization: Bearer tok_..."

Tokaroo never exposes which model or provider handled your request. What you see is what matters: tokens, cost, latency, and tier.
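
To read more rows than one page holds, a client can follow the `before` cursor until `has_more` is false. The sketch below is illustrative and rests on two assumptions not confirmed here: that rows are returned newest first (so the last row's `created_at` is the next cursor), and that `baseline_usd - charged_usd` is a reasonable per-row savings figure. `fetchAllUsage`, `totalSavingsUsd`, and the page cap are hypothetical.

```typescript
// Subset of UsageRow used in this sketch.
interface UsageRowLite {
  id: string;
  charged_usd: number;
  baseline_usd: number;
  created_at: string;
}

interface UsagePage {
  data: UsageRowLite[];
  has_more: boolean;
}

// fetchPage is injected; in an app it would call
// GET /v1/usage?limit=...&before=<cursor> and parse the JSON body.
async function fetchAllUsage(
  fetchPage: (before?: string) => Promise<UsagePage>,
  maxPages = 50, // assumed safety cap, not a documented limit
): Promise<UsageRowLite[]> {
  const rows: UsageRowLite[] = [];
  let before: string | undefined;
  for (let page = 0; page < maxPages; page++) {
    const { data, has_more } = await fetchPage(before);
    rows.push(...data);
    if (!has_more || data.length === 0) break;
    // Assumption: newest-first ordering, so the last row is the oldest seen.
    before = data[data.length - 1].created_at;
  }
  return rows;
}

// Assumed savings measure: baseline minus what was charged, summed in USD.
function totalSavingsUsd(rows: UsageRowLite[]): number {
  return rows.reduce((sum, r) => sum + (r.baseline_usd - r.charged_usd), 0);
}
```

If two rows share the same `created_at`, a cursor based on that timestamp could skip or repeat a row; deduplicating by `id` is a cheap safeguard.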

Grouped breakdown

GET /v1/usage/breakdown

Query params: group_by (api_key or workspace), days, api_key_id, workspace_id, request_type.

{
  object: "list";
  group_by: "api_key" | "workspace";
  window_days: number;
  data: Record<string, unknown>[];
}

curl "https://api.tokaroo.com/v1/usage/breakdown?group_by=workspace&days=30" \
  -H "Authorization: Bearer <dashboard-session-token>"

Stats

GET /v1/stats

Summary analytics for the whole account or a filtered workspace/key scope. Query params: days, daily_days, api_key_id, workspace_id, request_type.

{
  window_days: number;
  daily_window_days: number;
  api_key_id: string | null;
  workspace_id: string | null;
  requests: number;
  total_tokens: number;
  avg_latency_ms: number;
  charged_usd: number;
  actual_cost_usd: number;
  tier_breakdown: Record<"auto" | "fast" | "max", number>;
  request_type_breakdown: Record<string, number>;
  daily: {
    date: string;
    requests: number;
    tokens: number;
    charged_usd: number;
    actual_cost_usd: number;
  }[];
}

curl "https://api.tokaroo.com/v1/stats?days=30&workspace_id=acme-support" \
  -H "Authorization: Bearer <dashboard-session-token>"

Memory Vault API

Memory Vault is the structured memory engine inside Knowledge Base. Use it for durable notes, entities, graph links, event history, batch writes, and context packs.

| Route | Purpose |
| --- | --- |
| GET /v1/memory/notes | List durable memory notes. Filter by workspace, status, kind, and limit. |
| POST /v1/memory/notes | Create or upsert a memory note with title, content, kind, tags, importance, confidence, metadata, and optional external_source/external_id. |
| PATCH /v1/memory/notes/:id | Edit note content, status, tags, metadata, importance, confidence, or external identity. |
| POST /v1/memory/notes/:id/approve | Approve a pending/inferred memory. |
| DELETE /v1/memory/notes/:id | Forget a memory without destroying audit history. |
| GET/POST/PATCH /v1/memory/entities | Read and write graph entities such as users, workspaces, projects, tools, companies, and missions. |
| GET/POST/DELETE /v1/memory/links | Read and write relationships between notes and entities. |
| GET/POST /v1/memory/events | Record summarized history such as onboarding turns, mission lifecycle events, tool outcomes, and memory-control actions. |
| POST /v1/memory/batch | Write notes, entities, links, events, and feedback together with stable external IDs. |
| POST /v1/memory/context-pack | Return budgeted, ranked context with memories, entities, links, events, sources, citations, and telemetry. |
| POST /v1/memory/context | Compatibility endpoint for simpler context retrieval. |

// POST /v1/memory/context-pack
{
  query: string;
  workspace_id?: string;
  max_tokens?: number;
  include_sources?: boolean;
  metadata?: {
    mission_id?: string;
    gwen_task_id?: string;
    tokaroo_mission_uuid?: string;
  };
}

Context-pack telemetry records which memories, sources, and links were included. Feedback can later mark those items helpful or unhelpful so ranking improves over time.
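To make the idea of "budgeted, ranked context" concrete, here is a conceptual sketch of greedy selection under a token budget. It is not Tokaroo's actual ranking algorithm, which the source does not describe. The Candidate shape, its tokens field, and packContext are all hypothetical names used for illustration.

```typescript
// Hypothetical candidate item: a memory or source with a relevance score
// and an estimated token cost.
interface Candidate {
  id: string;
  score: number;
  tokens: number;
}

// Greedy packing: take the highest-scoring items first and skip any item
// that would push the running total past maxTokens.
function packContext(candidates: Candidate[], maxTokens: number): Candidate[] {
  const ranked = [...candidates].sort((a, b) => b.score - a.score);
  const picked: Candidate[] = [];
  let used = 0;

  for (const candidate of ranked) {
    if (used + candidate.tokens > maxTokens) continue; // does not fit
    picked.push(candidate);
    used += candidate.tokens;
  }
  return picked;
}
```

The max_tokens field in the request body plays the role of maxTokens here: a smaller budget returns fewer, higher-ranked items.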

Action guardrails

Action guardrails let agents check risky tool calls before execution. Apps can model actions by app, mission, task, actor, tool, operation, access level, risk level, and requested spend.

| Route | Purpose |
| --- | --- |
| GET /v1/actions/policies | List allow, approval, or deny policies. |
| POST /v1/actions/policies | Create or upsert a policy for a tool/action/risk/spend scope. |
| POST /v1/actions/check | Ask Tokaroo whether an action is allowed, denied, or requires approval before a tool runs. |
| GET /v1/actions/approvals | List pending/decided approvals for a workspace or mission. |
| POST /v1/actions/approvals/:id/decide | Approve or reject a pending action approval. |
| GET /v1/actions/audit | Read action audit events. |
| POST /v1/actions/audit | Record executed, failed, blocked, approved, or rejected action events. |

// POST /v1/actions/check
{
  app: "gwendolyn";
  mission_id?: string; // parent mission/external goal id
  task_id?: string;    // child work-unit id inside the mission
  actor_ref?: string;
  tool_ref: string;
  action_type: "read" | "write" | "spend" | "communication" | "code" | string;
  operation?: string;
  access_level?: "read_only" | "write" | "destructive" | string;
  risk_level?: "low" | "medium" | "high" | "critical";
  requested_amount_usd?: number;
  external_source?: string;
  external_id?: string;
  metadata?: Record<string, unknown>;
}
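An agent would typically gate tool execution on the result of the check. The sketch below shows one way to do that. The response shape of POST /v1/actions/check is not documented here, so the decision values "allow", "approval", and "deny" are an assumption borrowed from the policy types listed above, and GuardrailDecision, GateResult, and runIfAllowed are hypothetical names.

```typescript
// Assumed decision values; the real response field names may differ.
type GuardrailDecision = "allow" | "approval" | "deny";

type GateResult<T> =
  | { status: "executed"; value: T }
  | { status: "pending_approval" }
  | { status: "blocked" };

// Run the tool only when the guardrail decision is "allow".
// The tool callback is never invoked for "approval" or "deny".
function runIfAllowed<T>(decision: GuardrailDecision, tool: () => T): GateResult<T> {
  switch (decision) {
    case "allow":
      return { status: "executed", value: tool() };
    case "approval":
      return { status: "pending_approval" };
    case "deny":
      return { status: "blocked" };
  }
}
```

After the gate returns, the outcome can be recorded through POST /v1/actions/audit as an executed, blocked, or approval-pending event.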

Meta-harness

Tokaroo records the harness around the model: retrieved documents, retrieved memories, context assembly, model call usage, feedback, and reflections. This is the learning layer that lets agents improve without fine-tuning a model first.

A mission is the parent goal. A task is a child work unit inside that mission. Use mission_id for the stable external mission id, use task_id for a Gwen task or similar unit, and use the returned Tokaroo mission UUID only in the /v1/harness/missions/:id URL path.

| Route | Purpose |
| --- | --- |
| GET /v1/harness/versions | List active/draft harness configs for context, retrieval, and reflection behavior. |
| POST /v1/harness/versions | Create a candidate harness config for later evals and rollout. |
| GET /v1/harness/missions | List agent missions mirrored into Tokaroo with status, progress, app, workspace, and outcome fields. |
| POST /v1/harness/missions | Create or upsert a mission using stable mission_id or external_source/external_id. |
| GET /v1/harness/missions/:id | Read mission detail including steps, artifacts, approvals, action audit, usage, and traces. |
| PATCH /v1/harness/missions/:id | Update mission progress, status, outcome, success, risk, or metadata. |
| POST /v1/harness/missions/:id/steps | Create or upsert a mission ledger step. |
| PATCH /v1/harness/missions/:id/steps/:step_id | Update a ledger step. |
| POST /v1/harness/missions/:id/artifacts | Attach produced outputs such as emails, files, docs, URLs, receipts, plans, reports, or code summaries. |
| GET /v1/harness/traces | Read the observation stream: inputs, outputs, retrieved context, costs, outcomes. |
| POST /v1/harness/traces | Record an external observation from an agent/app such as Gwen. |
| POST /v1/harness/feedback | Attach correction, rating, remember, or forget signals to a trace. |
| GET /v1/harness/reflections | Read higher-level insights synthesized from traces. |
| POST /v1/harness/reflections/generate | Generate a reflection from selected traces and optionally store it as memory. |

// Mission/task identity model
{
  // POST /v1/harness/missions
  mission_id: "mission_123"; // parent goal
  external_source: "gwendolyn";
  external_id: "mission:mission_123";
}

{
  // POST /v1/harness/missions/:tokaroo_mission_uuid/steps
  step_ref: "task_456"; // or "task_456:step_1"
  task_id: "task_456";
  metadata: {
    mission_id: "mission_123",
    task_id: "task_456"
  }
}
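The identity model above can be captured in two small payload builders. This is a sketch that mirrors the example values: the "mission:" prefix on external_id and the "task:step_N" form of step_ref are copied from the examples and may be conventions rather than requirements. The names MissionUpsert, StepUpsert, missionUpsert, and stepUpsert are hypothetical.

```typescript
interface MissionUpsert {
  mission_id: string;
  external_source: string;
  external_id: string;
}

interface StepUpsert {
  step_ref: string;
  task_id: string;
  metadata: { mission_id: string; task_id: string };
}

// Body for POST /v1/harness/missions.
function missionUpsert(missionId: string, source: string): MissionUpsert {
  return {
    mission_id: missionId,
    external_source: source,
    external_id: `mission:${missionId}`, // prefix mirrors the example above
  };
}

// Body for POST /v1/harness/missions/:id/steps.
// The Tokaroo mission UUID belongs in the URL path only, never in this body.
function stepUpsert(missionId: string, taskId: string, stepNumber?: number): StepUpsert {
  const stepRef = stepNumber === undefined ? taskId : `${taskId}:step_${stepNumber}`;
  return {
    step_ref: stepRef,
    task_id: taskId,
    metadata: { mission_id: missionId, task_id: taskId },
  };
}
```

Keeping both ids in metadata, as the example does, lets a step be traced back to its parent mission without knowing the Tokaroo UUID.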
interface HarnessTrace {
  id: string;
  trace_type: "knowledge_reply" | "docs_query" | "external_observation" | string;
  harness_version_id: string | null;
  assistant_id: string | null;
  thread_id: string | null;
  input_summary: string | null;
  output_summary: string | null;
  context_strategy: string;
  context: {
    documents?: Array<{ id: string; score: number }>;
    memories?: Array<{
      id: string;
      score: number;
      relevance_score: number;
      recency_score: number;
      importance_score: number;
      confidence_score: number;
    }>;
  };
  charged_usd: number;
  outcome: "unknown" | "success" | "failure" | string;
  feedback_score: number | null;
}

curl https://api.tokaroo.com/v1/harness/feedback \
  -H "Authorization: Bearer tok_..." \
  -H "Content-Type: application/json" \
  -d '{
    "trace_id": "trace_uuid",
    "rating": 1,
    "outcome": "success",
    "remember_text": "User prefers concise answers with concrete next steps."
  }'

Knowledge replies and Docs project chat automatically create harness traces. Gwen and other agents can also write traces directly, then send feedback so Tokaroo can improve retrieval, context assembly, routing, and future memory selection.
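Before sending feedback, it can help to see which memories carried the most weight in a trace. The sketch below reads the context.memories array from the HarnessTrace shape above and returns the highest-scoring ids. The names TraceMemory, TraceContext, and topMemoryIds are hypothetical, and the sketch assumes that a higher score means a stronger match.

```typescript
// Subset of HarnessTrace.context; only the fields this helper reads.
interface TraceMemory {
  id: string;
  score: number;
}

interface TraceContext {
  memories?: TraceMemory[];
}

// Return up to `limit` memory ids, highest score first.
// A trace without memories yields an empty list.
function topMemoryIds(context: TraceContext, limit: number): string[] {
  return [...(context.memories ?? [])]
    .sort((a, b) => b.score - a.score)
    .slice(0, limit)
    .map((memory) => memory.id);
}
```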

Error codes

Errors are JSON objects with an error field. Some routes also include a human-readable message and route-specific metadata.

interface ErrorResponse {
  error: string;
  message?: string;
  payment_url?: string;
  balance_usd?: number;
  required_usd?: number;
}

| HTTP | Code | Meaning | Fix |
| --- | --- | --- | --- |
| 400 | varies | Malformed request body, unsupported values, or scope mismatch | Check required fields, filters, and request types. |
| 401 | Missing Authorization header / Invalid or expired credential | Missing or invalid credential | Check the Authorization header and credential type. |
| 402 | billing_required / payment_failed / no_card | No active balance or a payment step failed | Complete payment setup in the dashboard. |
| 403 | Session authentication required ... | Management route requires dashboard session or the requested scope is forbidden | Use a dashboard session or narrow the request scope. |
| 404 | Workspace not found / API key not found | Requested resource does not exist in this account | Check the resource id or workspace_id. |
| 429 | Rate limit exceeded | Too many requests on this API key | Back off and retry with exponential backoff. |
| 503 | Provider ... not configured / No ... provider configured | The required provider path was unavailable | Retry later or use a different route/mode. |
| 500 | varies | Something went wrong on our end | Retry with exponential backoff. |

The exact error string is not fully normalized across every route yet, so match on HTTP status first and then inspect the error payload where needed.
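A status-first handler might look like the following sketch. It branches on the HTTP status from the table above and only then reads the payload, here to pick up payment_url on a 402. The names ErrorAction, ErrorPlan, classifyStatus, and planForError are hypothetical, and the mapping is one reasonable reading of the table rather than an official client.

```typescript
interface ErrorResponse {
  error: string;
  message?: string;
  payment_url?: string;
  balance_usd?: number;
  required_usd?: number;
}

type ErrorAction =
  | "fix_request"
  | "fix_credentials"
  | "complete_payment"
  | "check_scope"
  | "check_resource"
  | "retry_with_backoff"
  | "inspect_payload";

interface ErrorPlan {
  action: ErrorAction;
  retryable: boolean;
  paymentUrl?: string;
}

// Map the HTTP status to a next step, following the error table.
function classifyStatus(status: number): ErrorAction {
  switch (status) {
    case 400: return "fix_request";
    case 401: return "fix_credentials";
    case 402: return "complete_payment";
    case 403: return "check_scope";
    case 404: return "check_resource";
    case 429:
    case 500:
    case 503: return "retry_with_backoff";
    default: return "inspect_payload"; // unknown status: fall back to the body
  }
}

// Status first, payload second: the error string is not normalized,
// so it is only consulted for route-specific extras such as payment_url.
function planForError(status: number, body: ErrorResponse): ErrorPlan {
  const action = classifyStatus(status);
  const plan: ErrorPlan = { action, retryable: action === "retry_with_backoff" };
  if (status === 402 && body.payment_url !== undefined) {
    plan.paymentUrl = body.payment_url;
  }
  return plan;
}
```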

Rate limits

| Limit | Value |
| --- | --- |
| Requests per minute (per API key) | Default 60, configurable per key |
| Dashboard/session traffic | Higher internal allowance; intended for management, not runtime inference |

Rate limit headers are returned on every response:

| Header | Description |
| --- | --- |
| x-ratelimit-limit | Maximum requests per minute |
| x-ratelimit-remaining | Requests remaining in the current window |

When you hit the limit you receive a 429 response with error: "Rate limit exceeded". Back off and retry later.
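One way to back off is capped exponential delay, combined with reading x-ratelimit-remaining to slow down before the limit is hit. The sketch below is illustrative only: the base delay of 500 ms and the 30 second cap are arbitrary assumptions, not values recommended by Tokaroo, and backoffDelayMs and remainingRequests are hypothetical helpers.

```typescript
// Delay before retry number `attempt` (0-based): base * 2^attempt, capped.
// The default base and cap are illustrative assumptions.
function backoffDelayMs(attempt: number, baseMs = 500, capMs = 30000): number {
  return Math.min(capMs, baseMs * 2 ** attempt);
}

// Parse x-ratelimit-remaining from a plain header map with lower-case keys.
// Returns null when the header is missing or not a number.
function remainingRequests(headers: Record<string, string>): number | null {
  const raw = headers["x-ratelimit-remaining"];
  if (raw === undefined) return null;
  const parsed = Number.parseInt(raw, 10);
  return Number.isNaN(parsed) ? null : parsed;
}
```

A caller could pause proactively when remainingRequests returns 0, and use backoffDelayMs(attempt) to space out retries after a 429. Adding random jitter to the delay is a common refinement when many clients share a key.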