Docs
Base URL: https://api.tokaroo.com
Tokaroo provides model routing plus Knowledge Base, Sources, Mission Harness, Action Guardrails, telemetry, and billing primitives for AI apps and agents.
Quickstart
Tokaroo is an AI routing gateway that can automatically cut the cost of a request by up to 95%. Connect in under 60 seconds via the SDK or a one-line change to your existing OpenAI code.
Tokaroo is a hosted cloud service. One key, every model, full optimization engine. Bring Your Own Keys (BYOK) is optional.
Tokaroo applies exact-match and semantic caching internally to reduce repeat cost and latency. Shared cache is currently global across Tokaroo. Enterprise isolation controls are planned.
Install the SDK
npm install tokaroo
Make your first request
Already using OpenAI? One line to swap.
// Before
const openai = new OpenAI({ apiKey: process.env.OPENAI_KEY });
// After — one line change
const openai = new OpenAI({
baseURL: "https://api.tokaroo.com/v1",
apiKey: process.env.TOKAROO_KEY,
});
// Everything else stays the same
Pricing and usage-based billing
Tokaroo is usage-based. Users add credits, send requests through Tokaroo, and spend is deducted as work is performed. There are no seats and no subscription is required for API usage.
| Area | What is metered | Where it appears |
|---|---|---|
| Routing | Tokens, requests, media, embeddings, latency, cache/fallback behaviour | Usage dashboard and stats APIs |
| Knowledge Base | Context packs, retrieval, memory writes, events, entities, links, feedback | Knowledge Base dashboard and memory APIs |
| Sources | Ingestion, document chunking, retrieval, and citations | Sources dashboard and retrieval APIs |
| Docs Studio | Generated documents, reports, specs, guides, project chat, artifacts | Docs Studio dashboard |
| Mission Harness | Missions, steps, artifacts, traces, outcomes, action audits | Admin harness and mission APIs |
| Action Guardrails | Policy checks, approvals, decisions, and audit events | Action APIs and admin views |
Customers should see simple balance, usage, and savings. Admins should see provider cost, customer charge, margin, internal usage, shadow-test spend, and reconciliation details.
Modes are still the simplest pricing control: fast is speed/cost optimized, auto optimizes for value and can escalate when needed, and max uses the highest-capability route.
See the public pricing page for the user-facing explanation.
Knowledge Base
Tokaroo Knowledge Base is the context layer for agents and AI apps. It combines durable memory, source-grounded retrieval, entities, graph links, events, and feedback so every request can use the right context without dumping raw history into prompts.
The product word is Knowledge Base. The core engine inside it is Memory Vault. Sources are the files, URLs, docs, specs, repos, emails, and records that become indexed documents and citations.
| Layer | What it does | Best for |
|---|---|---|
| Memory Vault | Stores facts, preferences, decisions, lessons, entities, links, and events | Long-lived agents and workspace memory |
| Sources | Converts files, URLs, docs, specs, and records into indexed documents and chunks | Grounded answers with citations |
| Context packs | Ranks memory, sources, links, and recent events into a budgeted context payload | Agent replies, actions, and mission steps |
| Feedback | Records what helped, failed, or should be remembered or forgotten | Improving future context selection |
Knowledge Base runs on top of Tokaroo routing, so replies still go through auto, fast, or max. Context changes what the model sees; routing still decides how the model call executes.
Storage is scoped to your Tokaroo account and optional workspace. Apps such as Gwen can write memories, events, sources, mission metadata, and feedback back into Tokaroo so future work starts with better context.
Sources and documents
Sources are where knowledge came from. Documents are Tokaroo's indexed representation of a source. Chunks are the searchable pieces, and citations point back to the source/document/chunk that helped an answer.
| Term | Meaning | Examples |
|---|---|---|
| Source | Original thing Tokaroo can use as context | PDF, URL, Google Doc, email thread, CRM record, repo, transcript |
| Document | Stored and indexed source representation | Security policy, pricing guide, product spec |
| Chunk | Retrieval unit embedded and ranked by relevance | A section, paragraph, or page excerpt |
| Citation | Reference to what context was used | Document title, source URL, chunk score |
Use Sources when you need grounded answers from written material. Use Memory Vault when the agent should remember durable facts, preferences, lessons, and relationships.
Docs Studio
Docs Studio is the document and artifact workbench. It can support Tokaroo's own technical docs, but it is also useful for generated work product: reports, specs, guides, proposals, plans, and machine-readable artifacts.
Docs Studio can import sources, build pages/artifacts, and chat against the generated corpus. Agents such as Gwen can use it as a document creation surface while Sources and Knowledge Base provide the underlying retrieval and memory.
auto · fast · max
Every request runs in one of three modes. Set it per request or configure a default in your dashboard.
| Mode | Behaviour | Best for |
|---|---|---|
| "auto" | Best value: Tokaroo picks the right model per request | Most workloads |
| "fast" | Speed-optimized: lowest latency tier | Real-time features, interactive agents |
| "max" | Maximum capability: most powerful models, no shortcuts | Complex reasoning, hardest tasks |
These are Tokaroo's routing modes. The underlying provider and model are never exposed — that's the black box.
Tokaroo runs on Tokaroo — our internal systems use the same routing engine for their own AI calls. The code path you use is the code path we trust.
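As a rough sketch of setting the mode per request, the helper below builds a chat request body whose model field carries the routing mode. The name buildChatRequest and the fallback to "auto" when no mode is passed are assumptions made for illustration; they are not part of the Tokaroo SDK.

```typescript
// The three routing modes described above.
type Mode = "auto" | "fast" | "max";

interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface ChatRequestBody {
  model: Mode;
  messages: ChatMessage[];
}

const MODES: Mode[] = ["auto", "fast", "max"];

// Hypothetical helper: validates the mode and falls back to "auto"
// when none is given (the fallback is an assumption of this sketch).
function buildChatRequest(prompt: string, mode?: string): ChatRequestBody {
  const chosen = mode === undefined ? "auto" : mode;
  if (MODES.indexOf(chosen as Mode) === -1) {
    throw new Error("Unknown routing mode: " + chosen);
  }
  return {
    model: chosen as Mode,
    messages: [{ role: "user", content: prompt }],
  };
}
```

Validating the mode on the client keeps a typo such as "turbo" from reaching the API as an unsupported value.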
Bring Your Own Keys (BYOK)
BYOK is an advanced, optional feature. Tokaroo works normally without it.
Today BYOK is mainly useful for two things:
- Google + Groq free-tier arbitrage — Tokaroo can use your connected keys for eligible BYOK routing, which means you pay Tokaroo only the optimization layer on that traffic instead of full managed rates.
- OpenAI + Anthropic baseline personalization — these connections are used to keep savings comparisons honest for max-mode requests and enterprise benchmarking.
Keep BYOK in the background. It is not the core Tokaroo story and most users never need to touch it.
Via the dashboard: go to tokaroo.com/keys → Provider connections → add your key.
Via the API: provider-key management requires a logged-in dashboard session.
Secrets are stored encrypted and never returned by any API response.
OpenClaw integration
Recommended: install the Tokaroo provider plugin first. Keep the manual OpenAI-compatible config path as fallback if you prefer to wire it yourself.
1. Install the plugin
openclaw plugins install clawhub:tokaroo-openclaw-provider
2. Set your key
export TOKAROO_API_KEY=tok_...
3. Restart OpenClaw
openclaw restart
The plugin path is the native OpenClaw install flow. If you want the raw provider block instead, use the manual fallback below.
Manual fallback: add to ~/.openclaw/openclaw.json
{
models: {
mode: "merge",
providers: {
tokaroo: {
baseUrl: "https://api.tokaroo.com/v1",
apiKey: "${TOKAROO_API_KEY}",
api: "openai-completions",
models: [
{ id: "auto", name: "Tokaroo Auto — best value, automatic routing" },
{ id: "fast", name: "Tokaroo Fast — speed-optimized, low latency" },
{ id: "max", name: "Tokaroo Max — highest capability" },
]
}
}
},
agents: {
defaults: { model: "tokaroo/auto" }
}
}
Manual fallback: set your key
export TOKAROO_API_KEY=tok_...
Manual fallback: restart OpenClaw
openclaw restart
That's it. See the full OpenClaw guide for local model hybrid setups.
NemoClaw integration
Add Tokaroo as a provider in your NemoClaw config. Works the same way as any OpenAI-compatible provider.
1. Add to your nemo_config.json
{
models: {
mode: "merge",
providers: {
tokaroo: {
baseUrl: "https://api.tokaroo.com/v1",
apiKey: "${TOKAROO_API_KEY}",
api: "openai-completions",
models: [
{ id: "auto", name: "Tokaroo Auto — best value, automatic routing" },
{ id: "fast", name: "Tokaroo Fast — speed-optimized, low latency" },
{ id: "max", name: "Tokaroo Max — highest capability" },
]
}
}
},
agents: {
defaults: { model: "tokaroo/auto" }
}
}
2. Set your key
export TOKAROO_API_KEY=tok_...
See the full NemoClaw guide for enterprise and on-prem setups.
Local endpoints
Connect a self-hosted OpenAI-compatible endpoint such as Ollama or vLLM. Tokaroo can prioritize your endpoint and automatically fall back to Tokaroo-optimized managed routing when needed. Requests successfully served by your self-hosted endpoint are charged at 5% of baseline.
For hosted Tokaroo, the endpoint must be reachable from the Tokaroo server. A laptop-local http://localhost:11434 only works if you expose it through a reachable tunnel, VPN, or deployed host.
Tokaroo can also use compatible self-hosted endpoint capabilities where available and fall back to Tokaroo-managed execution automatically. The exact routing and optimization behavior remains part of Tokaroo's managed layer.
Local-endpoint management is a dashboard/session-authenticated surface, not a tok_... runtime API-key surface.
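Because a hosted Tokaroo server cannot reach a loopback address, it can help to check an endpoint URL before registering it. The function below is a heuristic sketch: the name isLoopbackEndpoint is hypothetical, and it only flags loopback hosts. It does not prove that any other host is reachable from Tokaroo.

```typescript
// Returns true when the URL points at a loopback host, which a hosted
// service cannot reach without a tunnel, VPN, or deployed host.
function isLoopbackEndpoint(endpointUrl: string): boolean {
  const host = new URL(endpointUrl).hostname.toLowerCase();
  return (
    host === "localhost" ||
    host === "0.0.0.0" ||
    host === "::1" ||
    host === "[::1]" || // WHATWG URL keeps the brackets on IPv6 hostnames
    /^127\.\d{1,3}\.\d{1,3}\.\d{1,3}$/.test(host)
  );
}
```

For example, isLoopbackEndpoint("http://localhost:11434") is true, so that Ollama address would need a tunnel or a deployed host first.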
Register an endpoint
Tokaroo performs a health check and model discovery on registration. If the endpoint is unreachable at registration time it is saved with status unreachable and requests fall back to managed until it responds.
Manage endpoints in the dashboard under Local endpoints, or via GET / PATCH / DELETE /v1/local-endpoints, POST /v1/local-endpoints/:id/recheck, and GET /v1/local-endpoints/:id/models. Listing accepts an optional workspace_id filter.
Supported runtimes: Ollama, vLLM, LocalAI, any OpenAI-compatible server.
API Reference
Authentication
Tokaroo uses two credential types.
Runtime API key
Authorization: Bearer tok_...
Dashboard session
Authorization: Bearer <dashboard-session-token>
| Topic | Details |
|---|---|
| tok_... API keys | Use for runtime requests like chat, images, audio, video, embeddings, balance, model listing, and usage reads within that key's scope. |
| Dashboard session | Required for billing management, API-key management, workspace management, provider-key management, and local-endpoint management. |
| Reporting scope | Session auth can inspect account-wide usage or filter by api_key_id / workspace_id. API keys are implicitly scoped to their own key and workspace. |
| Warning | Never share either credential type — treat them like passwords. |
Chat completions
OpenAI-compatible. Supports streaming, tool use, and vision.
Request
interface ChatRequest {
model: "auto" | "fast" | "max";
messages: { role: "system" | "user" | "assistant"; content: string }[];
max_tokens?: number;
temperature?: number; // 0–2, default 1
stream?: boolean; // default false
tools?: Tool[];
}
Response
interface ChatResponse {
id: string;
object: "chat.completion";
created: number;
model: "auto" | "fast" | "max"; // echoes the requested routing mode
choices: {
index: number;
message: { role: "assistant"; content: string };
finish_reason: "stop" | "length" | "tool_calls";
}[];
usage: {
prompt_tokens: number;
completion_tokens: number;
total_tokens: number;
};
}
Response headers
| Header | Value |
|---|---|
| x-tokaroo-cache | hit or miss |
| x-tokaroo-cost | USD charged for this request |
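A minimal sketch of reading these two headers from a plain header map follows. The helper name readTokarooHeaders is hypothetical, and the sketch assumes x-tokaroo-cost arrives as a plain decimal string, which the table above does not spell out.

```typescript
interface TokarooResponseMeta {
  cacheHit: boolean;
  costUsd: number | null; // null when the header is missing or not numeric
}

// Reads the two documented headers. Header names are matched
// case-insensitively, since HTTP header names are case-insensitive.
function readTokarooHeaders(headers: { [key: string]: string }): TokarooResponseMeta {
  const lower: { [key: string]: string } = {};
  for (const key of Object.keys(headers)) {
    lower[key.toLowerCase()] = headers[key];
  }
  const rawCost = lower["x-tokaroo-cost"];
  const cost = rawCost === undefined ? NaN : parseFloat(rawCost);
  return {
    cacheHit: lower["x-tokaroo-cache"] === "hit",
    costUsd: isNaN(cost) ? null : cost,
  };
}
```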
Examples
For streaming, set stream: true — the response uses server-sent events (SSE) in the OpenAI format.
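The sketch below shows one way to pull text out of an OpenAI-format SSE stream, where each event is a `data: <json>` line and the stream ends with `data: [DONE]`. The function name extractDeltas is hypothetical, and the sketch assumes the text it receives contains only complete lines; a real client would buffer partial lines between network reads.

```typescript
// Collects the content deltas from a buffered piece of an SSE stream.
function extractDeltas(sseText: string): { text: string; done: boolean } {
  let text = "";
  let done = false;
  for (const line of sseText.split("\n")) {
    const trimmed = line.trim();
    if (trimmed.indexOf("data:") !== 0) continue; // skip blank lines and comments
    const payload = trimmed.slice("data:".length).trim();
    if (payload === "[DONE]") {
      done = true;
      break;
    }
    const chunk = JSON.parse(payload);
    const delta = chunk.choices && chunk.choices[0] && chunk.choices[0].delta;
    if (delta && typeof delta.content === "string") {
      text += delta.content;
    }
  }
  return { text: text, done: done };
}
```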
Image generation
Generate images with AI. OpenAI-compatible — same request format as DALL-E.
Request
interface ImageRequest {
prompt: string;
model?: "auto" | "fast" | "max";
n?: number; // number of images, default 1
size?: string; // "1024x1024" | "1024x1792" | "1792x1024"
quality?: string; // "standard" | "hd"
response_format?: string; // "url" | "b64_json"
style?: string; // "vivid" | "natural"
}
Response
{
created: number;
data: {
url?: string;
b64_json?: string;
revised_prompt?: string;
}[];
}
Examples
Text-to-speech
Convert text to spoken audio. Returns raw audio binary — not JSON. Set the response_format to control the audio codec.
Request
interface SpeechRequest {
input: string; // text to speak
model?: "auto" | "fast" | "max";
voice?: string; // "alloy" | "echo" | "fable" | "onyx" | "nova" | "shimmer"
response_format?: string; // "mp3" | "opus" | "aac" | "flac" | "wav" | "pcm"
speed?: number; // 0.25–4.0, default 1
}
Response
Raw audio bytes with Content-Type header (e.g. audio/mpeg). Save directly to a file or stream to the client.
Examples
Speech-to-text
Transcribe audio to text. Accepts multipart/form-data with an audio file (max 25 MB).
Request
// multipart/form-data fields:
file: File; // audio file (mp3, wav, m4a, webm, etc.)
model?: string; // "auto" | "fast" | "max"
language?: string; // ISO 639-1 code, e.g. "en"
response_format?: string; // "json" | "text" | "srt" | "vtt"
temperature?: number; // 0–1
Response
{
text: string;
}
Examples
Video generation
Generate videos from text prompts. Video generation is asynchronous: the POST returns a job ID, which you then poll with GET until the job completes or fails.
Submit request
interface VideoRequest {
prompt: string;
model?: "auto" | "fast" | "max";
duration?: number; // seconds of video, default 5
size?: string;
aspect_ratio?: string;
fps?: number;
}
Response (both POST and GET)
{
id: string;
status: "processing" | "completed" | "failed";
url?: string; // present when status is "completed"
}
Poll for status
Returns the same response shape. Poll until status is "completed" or "failed".
Examples
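One way to structure the polling loop is to keep the decision logic in a pure function and leave the HTTP calls to the caller. In this sketch the name decidePoll, the delay schedule (2 seconds, doubling up to 30 seconds), and the attempt cap are illustrative choices, not values documented by Tokaroo.

```typescript
type VideoStatus = "processing" | "completed" | "failed";

interface VideoJob {
  id: string;
  status: VideoStatus;
  url?: string; // present when status is "completed"
}

type PollDecision =
  | { action: "wait"; delayMs: number }
  | { action: "done"; url: string }
  | { action: "error"; reason: string };

// Decides what a polling loop should do with the latest job snapshot.
function decidePoll(job: VideoJob, attempt: number, maxAttempts = 60): PollDecision {
  if (job.status === "completed") {
    return job.url
      ? { action: "done", url: job.url }
      : { action: "error", reason: "completed without a url" };
  }
  if (job.status === "failed") {
    return { action: "error", reason: "job failed" };
  }
  if (attempt >= maxAttempts) {
    return { action: "error", reason: "gave up waiting" };
  }
  return { action: "wait", delayMs: Math.min(2000 * Math.pow(2, attempt), 30000) };
}
```

The caller fetches the job, passes it to decidePoll with the current attempt number, and either sleeps for delayMs before the next GET or stops.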
Embeddings
Request
interface EmbeddingsRequest {
input: string | string[];
model?: string; // optional — Tokaroo picks best available
}
Response
Standard OpenAI embeddings shape. Default model: text-embedding-3-small (1536 dims). Tokaroo may use compatible self-hosted endpoint capabilities when available and otherwise continues through Tokaroo-managed execution automatically.
Examples
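Since the response follows the standard OpenAI embeddings shape, each item in data carries an index and an embedding array. The sketch below reorders the vectors to match the input array and compares two of them with cosine similarity. Both function names are hypothetical, and the interface lists only the fields the sketch uses.

```typescript
interface EmbeddingsResponse {
  data: { index: number; embedding: number[] }[];
}

// Returns vectors ordered by their index field so they line up with the input array.
function vectorsInInputOrder(res: EmbeddingsResponse): number[][] {
  return res.data
    .slice()
    .sort((a, b) => a.index - b.index)
    .map((item) => item.embedding);
}

// Cosine similarity between two vectors of equal length.
// Returns NaN if either vector is all zeros.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("vector lengths differ");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```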
URL intelligence
Tokaroo URL intelligence is a deterministic inspection pipeline for redirects, DNS, TLS, RDAP age, parked-domain detection, and brand impersonation. Use it before browsing links, onboarding customer domains, or letting an agent visit external URLs.
Authenticated. Analyses one URL and returns signals, score, decision, and derived observations.
// Request
{
url: string;
profile?: string;
weights?: Record<string, number>;
}
// Response
{
id: string;
url: string;
normalized_url: string;
domain: string;
final_url: string | null;
score: number;
decision: "allow" | "review" | "warn" | "block";
signals: { id: string; weight: number; value: string | number | boolean; detail?: string }[];
observations: {
final_status_code: number | null;
redirect_hops: number;
tls_valid: boolean | null;
mx_records: number | null;
ns_records: number | null;
domain_age_days: number | null;
expires_in_days: number | null;
parked: boolean;
brand_match: object | null;
};
cache: { hit: boolean; status: "miss" | "signal_cache"; age_ms: number | null };
}
Authenticated. Analyses up to 100 URLs in one request, persists a batch record, and can optionally deliver a completion webhook.
// Request
{
urls: string[];
label?: string;
profile?: string;
weights?: Record<string, number>;
webhook_url?: string;
webhook_secret?: string;
}
// Response
{
id: string;
object: "url.batch";
status: string;
total_urls: number;
completed_urls: number;
failed_urls: number;
summary: {
average_score: number;
cache_hit_rate: number;
decisions: { allow: number; review: number; warn: number; block: number };
};
errors: { input_index: number; url: string; error: string }[];
results: UrlAnalysisResult[];
webhook: {
url: string | null;
delivered_at: string | null;
last_error: string | null;
};
}
Authenticated. List recent batches or fetch one persisted batch by id. Useful for polling, dashboards, and webhook retry diagnostics.
Authenticated. Read recent URL checks and aggregate stats including decision breakdown, top signals, top domains, and cache hit rate.
Session auth required for create and update. Profiles let you tune signal weights for use cases like affiliate review, inbound lead screening, or agent browsing.
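Two small client-side pieces tend to come up with URL intelligence: splitting a long list into batches that respect the 100-URL limit, and gating an agent on the returned decision. The sketch below shows both. The function names are hypothetical, and the policy in agentMayVisit (only "allow" passes without a human) is an example choice rather than a Tokaroo rule.

```typescript
const MAX_URLS_PER_BATCH = 100; // documented batch limit

// Splits a list of URLs into batches no larger than the limit.
function chunkUrls(urls: string[], size: number = MAX_URLS_PER_BATCH): string[][] {
  if (size < 1) throw new Error("size must be at least 1");
  const batches: string[][] = [];
  for (let i = 0; i < urls.length; i += size) {
    batches.push(urls.slice(i, i + size));
  }
  return batches;
}

type UrlDecision = "allow" | "review" | "warn" | "block";

// Example policy (an assumption of this sketch): an agent may visit a URL
// unattended only when the decision is "allow".
function agentMayVisit(decision: UrlDecision): boolean {
  return decision === "allow";
}
```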
Models
Tokaroo currently tracks 209 models from Anthropic, Google, Groq, and OpenAI. The live table below comes from the public pricing catalog and shows the provider/model universe Tokaroo tracks for routing, comparisons, and research.
Runtime model IDs
Authenticated. Returns the Tokaroo model IDs clients can request directly. Tokaroo also accepts many native provider model names as compatibility aliases, but the managed product surface remains these three tiers.
{
object: "list";
data: [
{ id: "auto", object: "model", owned_by: "tokaroo", created: 0 },
{ id: "fast", object: "model", owned_by: "tokaroo", created: 0 },
{ id: "max", object: "model", owned_by: "tokaroo", created: 0 }
];
}
Reference pricing catalog
Public. Returns the upstream provider/model catalog Tokaroo tracks for pricing, capabilities, and context windows. This is Tokaroo's routing universe and reference data, not your actual billed price.
{
object: "list";
data: {
id: string;
name: string;
provider: string;
input_per_1m: number;
output_per_1m: number;
capabilities: string[];
context_window: number;
}[];
researched_at: string | null;
}
Your actual billed price is dynamic per request. See x-tokaroo-cost, usage history, and balance deductions for what Tokaroo actually charged.
Balance & billing
interface BalanceResponse {
balance_usd: number;
currency: "usd";
}
Session auth required. Creates a Stripe Checkout session to pre-load credits. The first activation flow is explicit: accept the activation checkbox in Payments, save a card, enable auto pay for the chosen amount, then complete the payment.
// Request
{ amount_usd: number } // service minimum applies
// Response
{ url: string; session_id: string }
Session auth required. Charges the saved card directly and adds prepaid balance immediately. If this is the first activation payment and all activation requirements are satisfied, Tokaroo also returns the first API key in the same response.
// Request
{
amount_usd?: number;
activate_key_name?: string;
}
// Response
{
ok: true;
balance_usd: number;
activated_key: {
key: string;
name: string;
prefix: string;
} | null;
}
Session auth required. Returns the current first-key activation checklist state.
{
key_count: number;
has_api_key: boolean;
requires_first_key_activation: boolean;
balance_usd: number;
funded: boolean;
has_saved_card: boolean;
auto_pay_enabled: boolean;
auto_pay_amount_usd: number | null;
activation_terms_accepted: boolean;
activation_terms_accepted_at: string | null;
activation_terms_version: string;
missing_requirements: ("payment" | "saved_card" | "auto_pay" | "terms_acceptance")[];
ready_for_activation_payment: boolean;
ready_for_first_key: boolean;
payments_url: string;
terms_url: string;
}
Session auth required. Records acceptance of the activation terms for the current account.
// Request
{ accepted: true }
// Response
{
ok: true;
activation: ActivationStatus;
}
Session auth required. Reads or updates auto pay settings. First-key activation requires auto pay to be enabled. After your first key is active, you can disable auto pay and keep using remaining balance until it is depleted.
// GET response
{
enabled: boolean;
amount_usd: number;
}
// PUT request
{
enabled?: boolean;
amount_usd?: number;
}
Session auth required. Reads or removes the saved card used for auto pay and direct top-ups.
// GET response
{
card: {
brand: string;
last4: string;
exp_month: number;
exp_year: number;
} | null;
}
// DELETE response
{ ok: true }
Tokaroo uses a prepaid balance. Managed requests deduct from that balance based on the work actually executed. No subscription. No seat fee. Service minimums may apply at checkout.
First API key activation requires four things: a successful payment, a saved card, auto pay enabled, and accepted terms. Tokaroo records those requirements through the dashboard billing flow and blocks first-key creation until they are all complete. The managed runtime surface remains auto, fast, and max; the larger pricing catalog is Tokaroo's routing universe, not the primary managed selection surface.
BYOK is optional and advanced. When Tokaroo uses your own provider connection, the underlying provider may bill you directly while Tokaroo charges only the optimization layer.
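The four-part activation checklist can be mirrored on the client to show a user what is still missing. The sketch below derives the missing items from the boolean flags in the activation status response. The function name is hypothetical, and mapping the funded flag to the "payment" requirement is an assumption; the server's missing_requirements field remains the authoritative answer.

```typescript
type ActivationRequirement = "payment" | "saved_card" | "auto_pay" | "terms_acceptance";

// Subset of the activation status fields used by this sketch.
interface ActivationFlags {
  funded: boolean;
  has_saved_card: boolean;
  auto_pay_enabled: boolean;
  activation_terms_accepted: boolean;
}

// Lists the checklist items that are not yet satisfied.
function missingActivationSteps(flags: ActivationFlags): ActivationRequirement[] {
  const missing: ActivationRequirement[] = [];
  if (!flags.funded) missing.push("payment"); // assumed mapping: funded means payment succeeded
  if (!flags.has_saved_card) missing.push("saved_card");
  if (!flags.auto_pay_enabled) missing.push("auto_pay");
  if (!flags.activation_terms_accepted) missing.push("terms_acceptance");
  return missing;
}
```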
API keys
API-key management requires a dashboard session, not a tok_... runtime API key. Creating the first key requires the full activation checklist: successful payment, saved card, auto pay enabled, and accepted terms.
// Request
{
name?: string;
metadata?: Record<string, unknown>;
workspace_id?: string;
workspace_name?: string;
}
// Response
{
id: string;
key: string; // "tok_..." — shown once only, save immediately
name: string;
prefix: string;
metadata: Record<string, unknown>;
workspace_id: string | null;
workspace_name: string | null;
}
{
object: "list";
data: {
id: string;
name: string;
metadata: Record<string, unknown>;
workspace_id: string | null;
workspace_name: string | null;
keyPrefix: string;
lastUsedAt: string | null;
createdAt: string;
}[];
}
// Secrets are never returned
Returns JSON confirmation. The key is immediately revoked.
{ deleted: true; id: string }
Workspaces
Workspaces let one Tokaroo account contain named sub-areas such as customers, teams, or environments. API keys, provider keys, local endpoints, usage, and stats can all be bound to a workspace. These routes require a dashboard session.
// Request
{
workspace_id: string;
name: string;
metadata?: Record<string, unknown>;
monthly_budget_usd?: number | null;
}
// Response
{
id: string;
workspace_id: string;
name: string;
status: string;
metadata: Record<string, unknown>;
monthly_budget_usd: number | null;
}
{
object: "list";
data: {
id: string;
workspace_id: string;
name: string;
status: string;
metadata: Record<string, unknown>;
monthly_budget_usd: number | null;
created_at: string;
updated_at: string;
}[];
}
// Request
{
name?: string;
status?: string;
metadata?: Record<string, unknown>;
monthly_budget_usd?: number | null;
}
// Response
{
id: string;
workspace_id: string;
name: string;
status: string;
metadata: Record<string, unknown>;
monthly_budget_usd: number | null;
updated_at: string;
}
Provider keys
Provider keys are optional and advanced. They are primarily used for BYOK routing, free-tier arbitrage, and max-mode baseline personalization. These routes require a dashboard session.
// Request
{
provider: "openai" | "anthropic" | "google" | "groq";
label?: string;
secret: string; // stored encrypted, never returned
workspace_id?: string;
}
// Response
{
created: true;
provider: string;
label: string;
workspace_id: string | null;
}
Optional query param: workspace_id.
{
object: "list";
data: {
id: string;
provider: string;
label: string;
workspace_id: string | null;
workspace_name: string | null;
status: string;
last_success_at: string | null;
last_failure_at: string | null;
last_failure_code: string | null;
estimated_reset_at: string | null;
}[];
}
Returns JSON confirmation.
{ deleted: true; id: string }
Manage provider keys in the dashboard. Secrets are stored encrypted and never returned by any API response.
Usage history
Query params: limit (default 100, max 1000), before (cursor — created_at ISO string), api_key_id, workspace_id, request_type.
interface UsageRow {
id: string;
api_key_id: string | null;
workspace_id: string | null;
workspace_name: string | null;
metadata: Record<string, unknown>;
tier: "auto" | "fast" | "max";
model: "auto" | "fast" | "max";
provider: string;
routing_method: string;
execution_source: string;
billing_mode: string;
request_type: string;
input_tokens: number;
output_tokens: number;
actual_cost_usd: number;
charged_usd: number;
baseline_usd: number;
latency_ms: number | null;
created_at: string;
}
// Response: { data: UsageRow[], has_more: boolean }
Tokaroo never exposes which model or provider handled your request. What you see is what matters: tokens, cost, latency, and tier.
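Paging through usage history combines the before cursor with the has_more flag. The sketch below computes the next cursor and builds the query string. Both function names are hypothetical, and the sketch assumes rows come back newest first, so the last row's created_at is the cursor for the next page; the docs above do not state the sort order.

```typescript
interface UsagePage {
  data: { id: string; created_at: string }[];
  has_more: boolean;
}

// Returns the cursor for the next page, or null when there is nothing more to fetch.
function nextBeforeCursor(page: UsagePage): string | null {
  if (!page.has_more || page.data.length === 0) return null;
  return page.data[page.data.length - 1].created_at;
}

// Builds the query string from the documented params, clamping limit to 1..1000.
function usageQuery(params: { limit?: number; before?: string | null; workspace_id?: string }): string {
  const parts: string[] = [];
  if (params.limit !== undefined) {
    parts.push("limit=" + Math.max(1, Math.min(params.limit, 1000)));
  }
  if (params.before) parts.push("before=" + encodeURIComponent(params.before));
  if (params.workspace_id) parts.push("workspace_id=" + encodeURIComponent(params.workspace_id));
  return parts.length > 0 ? "?" + parts.join("&") : "";
}
```

A loop would request a page, pass it to nextBeforeCursor, and stop once that returns null.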
Grouped breakdown
Query params: group_by (api_key or workspace), days, api_key_id, workspace_id, request_type.
{
object: "list";
group_by: "api_key" | "workspace";
window_days: number;
data: Record<string, unknown>[];
}
Stats
Summary analytics for the whole account or a filtered workspace/key scope. Query params: days, daily_days, api_key_id, workspace_id, request_type.
{
window_days: number;
daily_window_days: number;
api_key_id: string | null;
workspace_id: string | null;
requests: number;
total_tokens: number;
avg_latency_ms: number;
charged_usd: number;
actual_cost_usd: number;
tier_breakdown: Record<"auto" | "fast" | "max", number>;
request_type_breakdown: Record<string, number>;
daily: {
date: string;
requests: number;
tokens: number;
charged_usd: number;
actual_cost_usd: number;
}[];
}
Memory Vault API
Memory Vault is the structured memory engine inside Knowledge Base. Use it for durable notes, entities, graph links, event history, batch writes, and context packs.
| Route | Purpose |
|---|---|
| GET /v1/memory/notes | List durable memory notes. Filter by workspace, status, kind, and limit. |
| POST /v1/memory/notes | Create or upsert a memory note with title, content, kind, tags, importance, confidence, metadata, and optional external_source/external_id. |
| PATCH /v1/memory/notes/:id | Edit note content, status, tags, metadata, importance, confidence, or external identity. |
| POST /v1/memory/notes/:id/approve | Approve a pending/inferred memory. |
| DELETE /v1/memory/notes/:id | Forget a memory without destroying audit history. |
| GET/POST/PATCH /v1/memory/entities | Read and write graph entities such as users, workspaces, projects, tools, companies, and missions. |
| GET/POST/DELETE /v1/memory/links | Read and write relationships between notes and entities. |
| GET/POST /v1/memory/events | Record summarized history such as onboarding turns, mission lifecycle events, tool outcomes, and memory-control actions. |
| POST /v1/memory/batch | Write notes, entities, links, events, and feedback together with stable external IDs. |
| POST /v1/memory/context-pack | Return budgeted, ranked context with memories, entities, links, events, sources, citations, and telemetry. |
| POST /v1/memory/context | Compatibility endpoint for simpler context retrieval. |
// POST /v1/memory/context-pack
{
query: string;
workspace_id?: string;
max_tokens?: number;
include_sources?: boolean;
metadata?: {
mission_id?: string;
gwen_task_id?: string;
tokaroo_mission_uuid?: string;
};
}
Context-pack telemetry records which memories, sources, and links were included. Feedback can later mark those items helpful or unhelpful so ranking improves over time.
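A small builder can keep context-pack requests well formed before they are sent to POST /v1/memory/context-pack. In this sketch the function name, the camelCase option names, and the validation rules (non-empty query, positive integer max_tokens) are assumptions for illustration; the API may enforce different constraints.

```typescript
interface ContextPackRequest {
  query: string;
  workspace_id?: string;
  max_tokens?: number;
  include_sources?: boolean;
}

// Builds a request body, leaving out any option that was not provided.
function buildContextPackRequest(
  query: string,
  opts: { workspaceId?: string; maxTokens?: number; includeSources?: boolean } = {}
): ContextPackRequest {
  if (query.trim().length === 0) throw new Error("query must not be empty");
  const body: ContextPackRequest = { query: query };
  if (opts.workspaceId !== undefined) body.workspace_id = opts.workspaceId;
  if (opts.maxTokens !== undefined) {
    if (opts.maxTokens < 1 || Math.floor(opts.maxTokens) !== opts.maxTokens) {
      throw new Error("max_tokens must be a positive integer");
    }
    body.max_tokens = opts.maxTokens;
  }
  if (opts.includeSources !== undefined) body.include_sources = opts.includeSources;
  return body;
}
```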
Action guardrails
Action guardrails let agents check risky tool calls before execution. Apps can model actions by app, mission, task, actor, tool, operation, access level, risk level, and requested spend.
| Route | Purpose |
|---|---|
| GET /v1/actions/policies | List allow, approval, or deny policies. |
| POST /v1/actions/policies | Create or upsert a policy for a tool/action/risk/spend scope. |
| POST /v1/actions/check | Ask Tokaroo whether an action is allowed, denied, or requires approval before a tool runs. |
| GET /v1/actions/approvals | List pending/decided approvals for a workspace or mission. |
| POST /v1/actions/approvals/:id/decide | Approve or reject a pending action approval. |
| GET /v1/actions/audit | Read action audit events. |
| POST /v1/actions/audit | Record executed, failed, blocked, approved, or rejected action events. |
// POST /v1/actions/check
{
app: "gwendolyn";
mission_id?: string; // parent mission/external goal id
task_id?: string; // child work-unit id inside the mission
actor_ref?: string;
tool_ref: string;
action_type: "read" | "write" | "spend" | "communication" | "code" | string;
operation?: string;
access_level?: "read_only" | "write" | "destructive" | string;
risk_level?: "low" | "medium" | "high" | "critical";
requested_amount_usd?: number;
external_source?: string;
external_id?: string;
metadata?: Record<string, unknown>;
}
Meta-harness
Tokaroo records the harness around the model: retrieved documents, retrieved memories, context assembly, model call usage, feedback, and reflections. This is the learning layer that lets agents improve without fine-tuning a model first.
A mission is the parent goal. A task is a child work unit inside that mission. Use mission_id for the stable external mission id, use task_id for a Gwen task or similar unit, and use the returned Tokaroo mission UUID only in the /v1/harness/missions/:id URL path.
| Route | Purpose |
|---|---|
| GET /v1/harness/versions | List active/draft harness configs for context, retrieval, and reflection behavior. |
| POST /v1/harness/versions | Create a candidate harness config for later evals and rollout. |
| GET /v1/harness/missions | List agent missions mirrored into Tokaroo with status, progress, app, workspace, and outcome fields. |
| POST /v1/harness/missions | Create or upsert a mission using stable mission_id or external_source/external_id. |
| GET /v1/harness/missions/:id | Read mission detail including steps, artifacts, approvals, action audit, usage, and traces. |
| PATCH /v1/harness/missions/:id | Update mission progress, status, outcome, success, risk, or metadata. |
| POST /v1/harness/missions/:id/steps | Create or upsert a mission ledger step. |
| PATCH /v1/harness/missions/:id/steps/:step_id | Update a ledger step. |
| POST /v1/harness/missions/:id/artifacts | Attach produced outputs such as emails, files, docs, URLs, receipts, plans, reports, or code summaries. |
| GET /v1/harness/traces | Read the observation stream: inputs, outputs, retrieved context, costs, outcomes. |
| POST /v1/harness/traces | Record an external observation from an agent/app such as Gwen. |
| POST /v1/harness/feedback | Attach correction, rating, remember, or forget signals to a trace. |
| GET /v1/harness/reflections | Read higher-level insights synthesized from traces. |
| POST /v1/harness/reflections/generate | Generate a reflection from selected traces and optionally store it as memory. |
// Mission/task identity model
{
// POST /v1/harness/missions
mission_id: "mission_123"; // parent goal
external_source: "gwendolyn";
external_id: "mission:mission_123";
}
{
// POST /v1/harness/missions/:tokaroo_mission_uuid/steps
step_ref: "task_456"; // or "task_456:step_1"
task_id: "task_456";
metadata: {
mission_id: "mission_123",
task_id: "task_456"
}
}
interface HarnessTrace {
id: string;
trace_type: "knowledge_reply" | "docs_query" | "external_observation" | string;
harness_version_id: string | null;
assistant_id: string | null;
thread_id: string | null;
input_summary: string | null;
output_summary: string | null;
context_strategy: string;
context: {
documents?: Array<{ id: string; score: number }>;
memories?: Array<{
id: string;
score: number;
relevance_score: number;
recency_score: number;
importance_score: number;
confidence_score: number;
}>;
};
charged_usd: number;
outcome: "unknown" | "success" | "failure" | string;
feedback_score: number | null;
}
curl https://api.tokaroo.com/v1/harness/feedback \
-H "Authorization: Bearer tok_..." \
-H "Content-Type: application/json" \
-d '{
"trace_id": "trace_uuid",
"rating": 1,
"outcome": "success",
"remember_text": "User prefers concise answers with concrete next steps."
}'
Knowledge replies and Docs project chat automatically create harness traces. Gwen and other agents can also write traces directly, then send feedback so Tokaroo can improve retrieval, context assembly, routing, and future memory selection.
Error codes
Errors are JSON objects with an error field. Some routes also include a human-readable message and route-specific metadata.
interface ErrorResponse {
error: string;
message?: string;
payment_url?: string;
balance_usd?: number;
required_usd?: number;
}
| HTTP | Code | Meaning | Fix |
|---|---|---|---|
| 400 | varies | Malformed request body, unsupported values, or scope mismatch | Check required fields, filters, and request types. |
| 401 | Missing Authorization header / Invalid or expired credential | Missing or invalid credential | Check the Authorization header and credential type. |
| 402 | billing_required / payment_failed / no_card | No active balance or a payment step failed | Complete payment setup in the dashboard. |
| 403 | Session authentication required ... | Management route requires dashboard session or the requested scope is forbidden | Use a dashboard session or narrow the request scope. |
| 404 | Workspace not found / API key not found | Requested resource does not exist in this account | Check the resource id or workspace_id. |
| 429 | Rate limit exceeded | Too many requests on this API key | Back off and retry with exponential backoff. |
| 503 | Provider ... not configured / No ... provider configured | The required provider path was unavailable | Retry later or use a different route/mode. |
| 500 | varies | Something went wrong on our end | Retry with exponential backoff. |
The exact error string is not fully normalized across every route yet, so match on HTTP status first and then inspect the error payload where needed.
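Following the advice to match on HTTP status first, a client can map each status in the table to a coarse next step and only then look at the payload. The sketch below is one such mapping. The action labels and function names are invented for this example, and the interface repeats only the error fields the sketch reads.

```typescript
type ErrorAction =
  | "fix_request"
  | "check_credentials"
  | "complete_payment"
  | "use_dashboard_session"
  | "check_resource_id"
  | "retry_with_backoff"
  | "unknown";

// Maps an HTTP status from the table above to a coarse next step.
function actionForStatus(status: number): ErrorAction {
  switch (status) {
    case 400: return "fix_request";
    case 401: return "check_credentials";
    case 402: return "complete_payment";
    case 403: return "use_dashboard_session";
    case 404: return "check_resource_id";
    case 429:
    case 500:
    case 503: return "retry_with_backoff";
    default: return "unknown";
  }
}

// Combines the status-based action with whatever detail the payload offers.
function describeError(status: number, body: { error: string; message?: string }): string {
  const detail = body.message || body.error;
  return status + " " + actionForStatus(status) + ": " + detail;
}
```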
Rate limits
| Limit | Value |
|---|---|
| Requests per minute (per API key) | Default 60, configurable per key |
| Dashboard/session traffic | Higher internal allowance; intended for management, not runtime inference |
Rate limit headers are returned on every response:
| Header | Description |
|---|---|
| x-ratelimit-limit | Maximum requests per minute |
| x-ratelimit-remaining | Requests remaining in the current window |
When you hit the limit you receive a 429 response with error: "Rate limit exceeded". Back off and retry later.
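A common way to back off after a 429 is exponential backoff with jitter. The sketch below computes the delay for a given retry attempt and reads the remaining-requests header. The function names, the 500 ms base, and the 30 second cap are illustrative defaults, not values prescribed by Tokaroo.

```typescript
// "Full jitter" backoff: a random delay between 0 and min(cap, base * 2^attempt).
// The random source is injectable so the function can be tested deterministically.
function backoffDelayMs(
  attempt: number,
  baseMs: number = 500,
  capMs: number = 30000,
  random: () => number = Math.random
): number {
  const ceiling = Math.min(capMs, baseMs * Math.pow(2, attempt));
  return Math.floor(random() * ceiling);
}

// Reads x-ratelimit-remaining from a plain header map; null when absent or not numeric.
function remainingRequests(headers: { [key: string]: string }): number | null {
  for (const key of Object.keys(headers)) {
    if (key.toLowerCase() === "x-ratelimit-remaining") {
      const value = parseInt(headers[key], 10);
      return isNaN(value) ? null : value;
    }
  }
  return null;
}
```

Jitter spreads retries from many clients over time, which avoids synchronized bursts when a window resets.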