AgentGateway Playbook

What's served, how to call it, and how routes map to backends

The model list is fetched live with each user's own API key, so it reflects what is available to you specifically.

Models served

Sign in with an account holding the inference-consumer role and provision a key to see the live model list. Until then, the route inventory below shows what the gateway can route to.

Routes

Method | Path | Backend | Notes
POST | /v1/chat/completions | qwen3-35b | Primary chat model. Model override: Qwen3.6-35B-A3B.
POST | /v1/gemma/chat/completions | gemma-4-26b | Gemma chat. Model override: gemma-4-26B-A4B-it.
POST | /v1/completions | qwen3-35b | Legacy text-completions endpoint.
POST | /v1/embeddings | qwen3-embedding-4b (50%) + f2llm-v2-4b (50%) | Embeddings traffic is split 50/50 across the two embedding models.
GET | /v1/models | models-proxy | Lists every model the gateway can route to. Bypasses LLM validation.
POST | /v1/cloud/openai/* | openai | Direct passthrough to OpenAI (external).
POST | /v1/cloud/claude/* | anthropic | Direct passthrough to Anthropic Claude (external).
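
The table above can be read as a lookup from method and path to backend. The shell sketch below mirrors that lookup for illustration only: `backend_for` is a hypothetical helper, not part of AgentGateway, and treating the trailing `*` as a path-prefix match is an assumption rather than something the table states.

```shell
#!/bin/sh
# Hypothetical helper: mirrors the route table above as a lookup.
# This is not the gateway's actual routing logic.
backend_for() {
  # $1 = HTTP method, $2 = request path
  case "$1 $2" in
    "POST /v1/chat/completions")       echo "qwen3-35b" ;;
    "POST /v1/gemma/chat/completions") echo "gemma-4-26b" ;;
    "POST /v1/completions")            echo "qwen3-35b" ;;
    # Traffic is split 50/50, so either backend may serve a given request.
    "POST /v1/embeddings")             echo "qwen3-embedding-4b f2llm-v2-4b" ;;
    "GET /v1/models")                  echo "models-proxy" ;;
    # Assumption: the trailing * in the table is a path-prefix match.
    "POST /v1/cloud/openai/"*)         echo "openai" ;;
    "POST /v1/cloud/claude/"*)         echo "anthropic" ;;
    *) echo "no route" >&2; return 1 ;;
  esac
}

backend_for POST /v1/gemma/chat/completions   # prints gemma-4-26b
backend_for POST /v1/cloud/claude/anything    # prints anthropic (made-up suffix)
```

A method and path pair that is not in the table, such as GET on /v1/chat/completions, falls through to the last branch and returns a non-zero status.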

Quick start (cURL)

List models

curl https://agentgateway.dev.drai.auckland.ac.nz/v1/models \
  -H "Authorization: Bearer YOUR_API_KEY"

Chat completion (default: Qwen3-35B)

curl https://agentgateway.dev.drai.auckland.ac.nz/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "default",
    "messages": [{"role": "user", "content": "Hello"}]
  }'

Chat completion via Gemma

curl https://agentgateway.dev.drai.auckland.ac.nz/v1/gemma/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "default",
    "messages": [{"role": "user", "content": "Hello"}]
  }'
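
The two chat examples send the same body and differ only in the path, so in practice the path is what picks the chat backend. A minimal sketch of a wrapper built on that observation follows. The function names `chat_path` and `agw_chat`, the nicknames `qwen` and `gemma`, and the environment variables `AGW_BASE_URL` and `AGW_API_KEY` are all hypothetical and chosen for illustration.

```shell
#!/bin/sh
# Assumed environment variable name; adjust to your own setup.
AGW_BASE_URL="${AGW_BASE_URL:-https://agentgateway.dev.drai.auckland.ac.nz}"

# Hypothetical helper: map a short backend nickname to its chat route.
# With no argument it falls back to the primary chat route.
chat_path() {
  case "${1:-qwen}" in
    qwen)  echo "/v1/chat/completions" ;;
    gemma) echo "/v1/gemma/chat/completions" ;;
    *) echo "unknown chat backend: $1" >&2; return 1 ;;
  esac
}

# Send one user message to the chosen backend.
# The prompt is inserted verbatim, so it must not contain double quotes
# or backslashes (no JSON escaping is done in this sketch).
agw_chat() {
  path=$(chat_path "$1") || return 1
  curl -sS "${AGW_BASE_URL}${path}" \
    -H "Authorization: Bearer ${AGW_API_KEY:?set AGW_API_KEY first}" \
    -H "Content-Type: application/json" \
    -d "{\"model\": \"default\", \"messages\": [{\"role\": \"user\", \"content\": \"$2\"}]}"
}

# Usage (needs a valid key and network access):
#   AGW_API_KEY=YOUR_API_KEY agw_chat gemma "Hello"
```

Keeping the path lookup separate from the network call means the mapping can be checked without a key or a connection.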

Embeddings (split 50/50 across two embedding models)

curl https://agentgateway.dev.drai.auckland.ac.nz/v1/embeddings \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "default", "input": "Embed this text."}'
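
Because embeddings traffic is split across two models, consecutive calls may be answered by different models, and vectors produced by different embedding models are generally not comparable with one another. One defensive option is to record which model produced each vector. The sketch below assumes the response follows the OpenAI-style embeddings shape with a top-level "model" field, and that the gateway reports the backend that actually served the request there; neither is confirmed by this page. `embedding_model` is a hypothetical helper and the sample response is made up.

```shell
#!/bin/sh
# Hypothetical helper: print the value of the "model" field from a JSON
# response read on stdin. Assumes the field appears once and that its
# key and value sit on the same line. A real JSON parser such as jq is
# more robust; sed is used here only to avoid extra dependencies.
embedding_model() {
  sed -n 's/.*"model"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

# Made-up sample response in the assumed OpenAI-style shape.
sample='{"object": "list", "data": [{"index": 0, "embedding": [0.1, 0.2]}], "model": "qwen3-embedding-4b"}'

printf '%s\n' "$sample" | embedding_model   # prints qwen3-embedding-4b
```

If the field is present and accurate, storing it next to each vector makes it possible to compare only vectors that came from the same model.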

MCP servers & agents

AgentGateway does not yet expose MCP servers or agents on this cluster. When those backends are added (see agentgateway/backends.yaml), they will appear here automatically.