Skip to main content

Overview

The API gateway handles the authentication, caching, routing of the incomming request. The gateway is an extended version of TensorZero with modifications to mange auth, routing, bud stack integration etc. TensorZero has integrated Redis for dynamic model configuration and API key-based authentication, enabling multi-tenant support and runtime configuration updates without service restarts. This integration allows the system to serve different models to different users based on their API keys. Gateway Architecture

Architecture Components

1. Redis Client

The Redis client manages real-time synchronization of model configurations and API keys:

Key Patterns

  • model_table:{model_id} - Stores model provider configurations
  • api_key:{api_key} - Stores API key to model mappings

Event Subscriptions

  • __keyevent@*__:set - Captures new/updated keys
  • __keyevent@*__:del - Captures deleted keys
  • __keyevent@*__:expired - Captures expired keys

Initialization Flow

  1. On startup, fetches all existing model_table:* and api_key:* keys
  2. Subscribes to keyspace notifications for real-time updates
  3. Maintains connection with automatic reconnection logic

2. Authentication Middleware

The authentication system validates API keys and maps user-facing model names to internal model/endpoint IDs:

Request Flow

  1. Extracts API key from Authorization header (supports “Bearer” prefix)
  2. Validates key exists in the auth state
  3. Retrieves model mapping for the API key
  4. Modifies request body: replaces model field with tensorzero::model_name::{model_id}
  5. Passes modified request to downstream handlers

Key Features

  • Thread-safe with RwLock for concurrent access
  • Dynamic updates without service restart
  • Graceful handling of missing keys

3. In-Memory Model Table

Manages model configurations and routing:

Model Configuration Structure

{
  "model-id": {
    "routing": ["provider1", "provider2"],
    "providers": {
      "provider1": {
        "type": "vllm",
        "model_name": "actual-model-name",
        "api_base": "http://endpoint",
        "api_key_location": "none"
      }
    }
  }
}

Validation

  • Prevents use of reserved prefixes (tensorzero::)
  • Validates model exists before returning configuration

4. OpenAI-Compatible Endpoint (tensorzero-internal/src/endpoints/openai_compatible.rs)

Handles the model resolution after authentication:

Model Name Parsing

  • Detects tensorzero::model_name::{model_id} pattern
  • Extracts model ID for table lookup
  • Falls back to function name resolution if not a model reference

Performance

Performance benchmark of LLaMa3.2 1B using Bud, VLLM and Aibrix. Gateway Performance

Data Flow

1. Configuration Update Flow

Redis SET → Keyspace Event → Redis Client → Update In-Memory State
  1. External system sets key in Redis (e.g., api_key:xyz or model_table:abc)
  2. Redis publishes keyspace notification
  3. Redis client receives event and fetches updated value
  4. Updates appropriate in-memory state (Auth or ModelTable)

2. Request Authentication Flow

Client Request → Auth Middleware → Model Resolution → Provider
  1. Client sends request with API key in Authorization header
  2. Auth middleware validates key and retrieves model mapping
  3. Modifies request to include internal model ID
  4. OpenAI endpoint resolves model ID to provider configuration
  5. Routes request to appropriate provider

Redis Data Structures

API Key Structure

{
  "api_key_here": {
    "user-facing-model-name": "internal-endpoint-id",
    "another-model": "another-endpoint-id"
  }
}

Model Table Structure

{
  "endpoint-id": {
    "routing": ["primary-provider", "fallback-provider"],
    "providers": {
      "primary-provider": {
        "type": "provider-type",
        "model_name": "actual-model-name",
        "api_base": "http://provider-endpoint",
        "api_key_location": "header|none"
      }
    }
  }
}

Configuration

Environment Variables

  • TENSORZERO_REDIS_URL - Redis connection string (e.g., redis://default:password@localhost:6379)

Benefits

  1. Dynamic Configuration: Add/remove models and API keys without restarting the service
  2. Multi-Tenancy: Different API keys can access different models
  3. Real-time Updates: Changes take effect immediately via Redis pub/sub
  4. Scalability: Multiple gateway instances can share the same Redis backend
  5. Isolation: Each tenant’s model access is isolated by their API key

Troubleshooting

Common Issues

  1. Redis Connection Failed
    • Check TENSORZERO_REDIS_URL is correctly formatted
    • Verify Redis is running and accessible
    • Check network connectivity
  2. API Key Not Working
    • Verify key exists in Redis with correct format
    • Check model ID mapping is valid
    • Ensure Redis events are enabled (notify-keyspace-events config)
  3. Model Not Found
    • Verify model table entry exists in Redis
    • Check model ID matches between API key mapping and model table
    • Validate JSON structure in Redis

Debug Commands

# Check Redis keys
redis-cli KEYS "api_key:*"
redis-cli KEYS "model_table:*"

# Get specific key value
redis-cli GET "api_key:your_key_here"

# Monitor Redis events
redis-cli PSUBSCRIBE "__keyevent@*__:*"