AgentGateway Playbook — DRAI Portal

The model list is fetched live with each user's own API key. Sign in with Keycloak to see which models are available to you.

Models served

To see the live model list, sign in with an account that holds the inference-consumer role and provision a key. Until then, the route inventory below shows what the gateway exposes.

Routes

Method	Path	Backend	Notes
POST	/v1/chat/completions	qwen3-35b	Primary chat model. Model override: Qwen3.6-35B-A3B.
POST	/v1/gemma/chat/completions	gemma-4-26b	Gemma chat. Model override: gemma-4-26B-A4B-it.
POST	/v1/completions	qwen3-35b	Legacy text-completions endpoint.
POST	/v1/embeddings	qwen3-embedding-4b (50%) + f2llm-v2-4b (50%)	Embeddings traffic is split 50/50 across the two embedding models.
GET	/v1/models	models-proxy	Lists every model the gateway can route to. Bypasses LLM validation.
POST	/v1/cloud/openai/*	openai	Direct passthrough to OpenAI (external).
POST	/v1/cloud/claude/*	anthropic	Direct passthrough to Anthropic Claude (external).
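
The route table can also be captured in a small shell helper, so that scripts refer to routes by name rather than by hard-coded path. This is only a sketch: route_for and gateway_post are hypothetical names, the GATEWAY_URL and API_KEY environment variables are assumptions, and the /v1/cloud/* passthrough routes are left out because their subpaths are not listed on this page.

```shell
# Base URL taken from the quick-start examples; override via the environment.
GATEWAY_URL="${GATEWAY_URL:-https://agentgateway.dev.drai.auckland.ac.nz}"

# Hypothetical helper: map a short route name to its path in the Routes table.
route_for() {
  case "$1" in
    chat)        echo "/v1/chat/completions" ;;
    gemma-chat)  echo "/v1/gemma/chat/completions" ;;
    completions) echo "/v1/completions" ;;
    embeddings)  echo "/v1/embeddings" ;;
    models)      echo "/v1/models" ;;
    *) echo "unknown route: $1" >&2; return 1 ;;
  esac
}

# Hypothetical helper: POST a JSON body ($2) to a named route ($1).
# Runs in a subshell so its variables do not leak into the caller.
gateway_post() (
  route="$(route_for "$1")" || exit 1
  curl -sS "${GATEWAY_URL}${route}" \
    -H "Authorization: Bearer ${API_KEY:?set API_KEY first}" \
    -H "Content-Type: application/json" \
    -d "$2"
)
```

For example, gateway_post completions '{"model": "default", "prompt": "Hello", "max_tokens": 16}' would call the legacy text-completions route, assuming it accepts an OpenAI-style body with a prompt field (this page does not confirm the request shape).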

Quick start (cURL)

List models

curl https://agentgateway.dev.drai.auckland.ac.nz/v1/models \
  -H "Authorization: Bearer YOUR_API_KEY"
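
Because the model list differs per key, a script may want to check that a model is available before calling it. The sketch below assumes the response follows the OpenAI-style shape {"data": [{"id": "..."}]}, which this page does not confirm. model_ids and has_model are hypothetical helper names. The grep/sed extraction avoids extra dependencies; a real JSON parser such as jq would be more robust.

```shell
# Read a /v1/models response on stdin and print one model id per line.
# Assumption: OpenAI-style body, i.e. {"data": [{"id": "..."}, ...]}.
model_ids() {
  grep -o '"id"[[:space:]]*:[[:space:]]*"[^"]*"' | sed 's/.*"\([^"]*\)"$/\1/'
}

# Succeed if the model id given as $1 appears in the response on stdin.
has_model() {
  model_ids | grep -qxF -- "$1"
}
```

Usage: pipe the output of the List models command above into has_model SOME_MODEL_ID and branch on its exit status.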

Chat completion (default: Qwen3-35B)

curl https://agentgateway.dev.drai.auckland.ac.nz/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "default",
    "messages": [{"role": "user", "content": "Hello"}]
  }'

Chat completion via Gemma

curl https://agentgateway.dev.drai.auckland.ac.nz/v1/gemma/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "default",
    "messages": [{"role": "user", "content": "Hello"}]
  }'

Embeddings (split 50/50 across two embedding models)

curl https://agentgateway.dev.drai.auckland.ac.nz/v1/embeddings \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "default", "input": "Embed this text."}'
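
The 50/50 split means consecutive requests may be served by different embedding models, and vectors from different models are generally not comparable with each other. The sketch below files each response under the model that produced it. It assumes the response carries an OpenAI-style top-level "model" field naming the backend that served the request, which this page does not confirm. served_model and store_by_model are hypothetical helper names.

```shell
# Print the first "model" value found in a JSON response read from stdin.
# Assumption: the response has an OpenAI-style top-level "model" field.
served_model() {
  grep -o '"model"[[:space:]]*:[[:space:]]*"[^"]*"' | head -n 1 | sed 's/.*"\([^"]*\)"$/\1/'
}

# Read one embeddings response on stdin and append it to
# <out_dir>/<model>.jsonl, so vectors from different models stay separate.
# Prints the model name; fails if the response has no "model" field.
store_by_model() (
  out_dir="$1"
  resp="$(cat)"
  # Replace "/" so the model name is safe to use as a file name.
  model="$(printf '%s' "$resp" | served_model | tr '/' '_')"
  [ -n "$model" ] || { echo "no model field in response" >&2; exit 1; }
  mkdir -p "$out_dir"
  printf '%s\n' "$resp" >> "${out_dir}/${model}.jsonl"
  printf '%s\n' "$model"
)
```

Usage: pipe the output of the embeddings command above into store_by_model ./vectors, then build each search index from a single .jsonl file only.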

MCP servers & agents

AgentGateway does not yet expose MCP servers or agents on this cluster. When those backends are added (see agentgateway/backends.yaml), they will appear here automatically.