AI Gateway - Memory and model routing for any AI agent

Mastra's AI gateway handles memory, model selection, and failover automatically. Access 300+ models from every major provider with transparent pricing and no token markup.

Give any agent human-like memory

Compress chat history into compact memories that help agents remember what matters. Lower token usage and latency without losing important context.

Route all traffic through one gateway

Use one endpoint for every model. Skip provider SDKs — one API key covers everything. Mastra's open-source AI gateway gives you a single integration point that routes to all major providers.

Start building with a free account, or explore our pricing for more

Full Pricing

Starter

Free for everyone

$0/ month
  • 100K observability events+ $10/100K
  • 24 CPU hours+ $0.35/hr
  • 15 days of data retention
  • Unlimited users, deployments, and projects

Teams

For growing teams

$250/ month
  • 1M observability events+ $8/100K
  • 250 CPU hours+ $0.25/hr
  • 6 months of data retention
  • Multiple teams, SSO, and SOC 2 docs

Enterprise

For teams at scale

Custom pricing

Custom volume and retention, with RBAC, audit logs, support and uptime SLAs, and a dedicated support engineer.

Mastra is powering the best AI teams

Case Studies

Frequently asked questions

What is AI Gateway?

Mastra's AI gateway adds observational memory and unified model routing to any agent. Use one endpoint for every model and one API key for every provider. Observational memory forms dense observations in the background without interrupting the conversation.

How does AI Gateway prevent agents from losing context?

AI Gateway prevents context loss by running two background agents that maintain a dense observation log without blocking the conversation or throwing anything away. An Observer watches conversations and compresses message history into concise notes about what happened. A Reflector condenses observations when they grow too long. Compression is typically 5-40x.

How does prompt caching work in AI Gateway?

AI Gateway appends observations over time rather than rebuilding the prompt each turn. Keeping the prompt prefix stable and cacheable means the longer the conversation runs, the more you save.

How does AI Gateway route traffic across model providers?

AI Gateway is the single integration point for every model provider your agents need to reach. One API key covers everything across more than 300 models including OpenAI, Anthropic, Google, Meta and Mistral, with no provider SDKs required. Change your base URL and start routing through the gateway using Python, TypeScript, any framework, any client. Plug in your own provider API keys for direct routing to any provider and get the same memory features either way. Run on distributed infrastructure with automatic failover across providers.

Does AI Gateway work with agents not built on Mastra?

AI Gateway works with any agent stack. Change your base URL and start routing through Mastra's open-source AI gateway using Python, TypeScript, any framework and any client. Observational memory and model routing apply to any agent regardless of how it was built.

Start building today

Route all your LLM traffic through one gateway built for production agents