Start free.
Upgrade when it matters.

The open-source layer gets you running in minutes. The Cloud layer makes it smart.

Open source
Free
Self-hosted - BYO provider keys
  • Drop-in OpenAI SDK replacement
  • Waterfall routing - free models first
  • Ollama and vLLM (local models) support
  • 10+ provider adapters built in
  • Weekly model pool auto-discovery
  • Self-hosted, BYO API keys
  • MIT licensed - use it anywhere
View on GitHub
Cloud
Usage-based
Dynamic per-request pricing - no seat fees
  • Everything in OSS, hosted for you
  • Intelligent scoring engine (the black box)
  • Semantic cache - never pay twice for the same answer
  • Prompt-level cost analytics
  • Per-request usage analytics
  • Rate limiting plus key management
  • Model pool always up to date
  • No servers to manage
Get your API key

Compare

Feature | OSS | Cloud
--- | --- | ---
OpenAI-compatible API | yes | yes
Groq, Gemini, Ollama support | yes | yes
Waterfall / fallback routing | yes | yes
BYO provider API keys | yes | - 
Smart scoring engine | - | yes
Semantic response cache | - | yes
Console + cost analytics | - | yes
Weekly model auto-discovery | yes | yes
Hosted - no infra to run | - | yes

Questions

What is the difference between OSS and Cloud?
The OSS version is a self-hosted proxy you run yourself using your own provider API keys. It handles routing, fallback, and provider abstraction. The Cloud version adds our optimization engine - the part that makes real cost decisions, caches intelligently, and learns over time. That part stays closed.
What does 'BYO keys' mean?
Bring Your Own Keys. With the OSS version you add your own API keys for OpenAI, Anthropic, Groq, Gemini, and other providers to your environment, and Tokaroo routes between them. You pay the providers directly - we do not sit in the middle of the money.
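A minimal sketch of what BYO keys means in practice, assuming keys live in standard environment variables. The variable names and provider list here are illustrative, not Tokaroo's actual configuration schema:

```python
import os

# Illustrative mapping only - not Tokaroo's real config format.
PROVIDER_ENV_KEYS = {
    "openai": "OPENAI_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
    "groq": "GROQ_API_KEY",
    "gemini": "GEMINI_API_KEY",
}

def available_providers() -> list:
    """Providers the proxy can route to: those whose key is present."""
    return [name for name, var in PROVIDER_ENV_KEYS.items() if os.environ.get(var)]
```

The proxy only ever routes to providers you have configured; everything else is skipped.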
How does Cloud billing work?
Cloud prices each request dynamically based on the tier you choose, the market baseline for that request, and how efficiently Tokaroo routed it. You pre-load credits, and each request deducts from your balance. No monthly seat fees - you only pay for what you use.
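As an illustration only - the formula and parameter names below are hypothetical, not Tokaroo's actual pricing math - the prepaid-credit model described above might be sketched as:

```python
class CreditBalance:
    """Toy prepaid-credit ledger. The pricing inputs (tier, market
    baseline, routing discount) mirror the FAQ; the arithmetic is invented."""

    def __init__(self, credits: float):
        self.credits = credits

    def charge(self, baseline: float, tier_multiplier: float, routing_discount: float) -> float:
        """Deduct a dynamically priced request and return its price."""
        price = baseline * tier_multiplier * (1.0 - routing_discount)
        if price > self.credits:
            raise RuntimeError("insufficient credits")
        self.credits -= price
        return price
```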
Can I start on OSS and switch to Cloud?
Yes. The API is identical - switching is a one-line change: point baseURL at our servers instead of yours. Nothing else in your code changes.
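To see the one-line switch concretely: with any OpenAI-compatible client, only the base URL differs between self-hosted OSS and Cloud. A stdlib sketch that builds (but does not send) the request - the URLs and model name are placeholders, not real endpoints:

```python
import json
import urllib.request

def chat_completion_request(base_url: str, api_key: str, model: str, prompt: str) -> urllib.request.Request:
    """Build an OpenAI-compatible chat completion request.
    Swapping base_url is the only change between OSS and Cloud."""
    payload = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode(),
        headers={"Authorization": f"Bearer {api_key}", "Content-Type": "application/json"},
        method="POST",
    )

# Same call, different base_url - e.g. "http://localhost:8080" for
# self-hosted vs. a hosted Cloud endpoint.
```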
What's the 'optimization engine'?
It is the part we do not open-source: the scoring model that decides which provider to use per request based on task type, cost, latency, and recent performance - and it gets smarter over time. It also includes the semantic cache, which checks whether a semantically similar question has already been answered and returns the cached result instead of paying for a new generation.
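A semantic cache of the kind described can be sketched in a few lines: look up answers by embedding similarity rather than exact string match. This toy version takes the embedding function as a parameter and is not Tokaroo's actual implementation:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

class SemanticCache:
    """Toy semantic cache. Real systems use learned embeddings and an
    approximate-nearest-neighbor index; this is a linear scan."""

    def __init__(self, embed, threshold: float = 0.9):
        self.embed = embed
        self.threshold = threshold
        self.entries = []  # list of (embedding, cached answer)

    def get(self, prompt: str):
        """Return a cached answer if a similar-enough prompt was seen."""
        v = self.embed(prompt)
        best = max(self.entries, key=lambda e: cosine(v, e[0]), default=None)
        if best and cosine(v, best[0]) >= self.threshold:
            return best[1]
        return None

    def put(self, prompt: str, answer: str):
        self.entries.append((self.embed(prompt), answer))
```

A cache hit returns the stored answer with no provider call, which is where the "never pay twice" saving comes from.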
Is the OSS version useful on its own?
Absolutely. Groq, Gemini Flash, and local models are free - and Tokaroo OSS will use them first. For many workloads this alone cuts AI costs dramatically. Cloud adds intelligence on top of that foundation.
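The free-models-first behavior is classic waterfall routing: try the cheapest providers in priority order and fall through on failure. A minimal sketch, assuming each provider is just a callable that may raise - the provider names are illustrative:

```python
class ProviderError(Exception):
    """Raised when a provider call fails (rate limit, outage, etc.)."""

def waterfall(providers, prompt: str):
    """Try providers in priority order (free/local first).
    Returns (provider_name, response) from the first success."""
    last_err = None
    for name, call in providers:
        try:
            return name, call(prompt)
        except ProviderError as err:
            last_err = err  # fall through to the next provider
    raise last_err or ProviderError("no providers configured")
```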