API Gateway

Category: system deployment · Status: published

Also known as: Gateway, LLM Gateway, AI Gateway

Definition

A server that acts as an intermediary between clients and backend services, handling cross-cutting concerns like authentication, rate limiting, logging, and routing. For LLM applications, API gateways can also provide model routing, fallback logic, cost tracking, and unified interfaces across multiple providers.
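
As a rough illustration of those cross-cutting concerns, the sketch below shows what a gateway might do on each request. It is a minimal, self-contained Python sketch, not any particular product; every name in it (handle_request, call_backend, the key and limit values) is hypothetical.

```python
import logging
import time

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("gateway")

# Hypothetical in-memory state; real gateways use shared stores (e.g. Redis).
API_KEYS = {"key-123": "team-a"}
RATE_LIMITS = {}  # team -> timestamps of recent requests
MAX_REQUESTS_PER_MINUTE = 60

def handle_request(api_key: str, model: str, payload: dict) -> dict:
    """Sketch of a gateway's cross-cutting concerns on one request."""
    # 1. Authentication
    team = API_KEYS.get(api_key)
    if team is None:
        return {"error": "unauthorized", "status": 401}

    # 2. Rate limiting (fixed one-minute window for simplicity)
    now = time.time()
    window = [t for t in RATE_LIMITS.get(team, []) if now - t < 60]
    if len(window) >= MAX_REQUESTS_PER_MINUTE:
        return {"error": "rate limit exceeded", "status": 429}
    RATE_LIMITS[team] = window + [now]

    # 3. Logging / cost-tracking hook (payload size as a crude proxy)
    log.info("team=%s model=%s payload_bytes=%d", team, model, len(str(payload)))

    # 4. Routing to a backend
    return call_backend(model, payload)

def call_backend(model: str, payload: dict) -> dict:
    # Placeholder: a real gateway would forward the HTTP request to the
    # serving system or provider that hosts `model`.
    return {"status": 200, "model": model, "echo": payload}
```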

What this is NOT

  • Not the model itself (gateway is infrastructure)
  • Not the serving system (a gateway sits in front of serving)
  • Not just a load balancer (a gateway does more than distribute traffic)

Alternative Interpretations

Different communities use this term differently:

llm-practitioners

A proxy layer that sits between your application and LLM providers, providing features like provider fallback, load balancing, caching, and unified observability. Examples: LiteLLM, Portkey, Helicone.

Sources: LiteLLM documentation, Portkey documentation, AI gateway products
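
As a concrete illustration of the unified-interface idea, the sketch below calls two providers through LiteLLM's OpenAI-style completion function. Treat it as a sketch: the model identifiers are illustrative, and LiteLLM reads provider credentials from environment variables (see its documentation for current model names).

```python
# pip install litellm; provider keys are read from environment variables
# such as OPENAI_API_KEY and ANTHROPIC_API_KEY.
from litellm import completion

messages = [{"role": "user", "content": "Summarize API gateways in one line."}]

# Same call shape across providers; only the model string changes.
gpt = completion(model="gpt-4o-mini", messages=messages)
claude = completion(model="claude-3-haiku-20240307", messages=messages)

print(gpt.choices[0].message.content)
print(claude.choices[0].message.content)
```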

software-engineering

An API management component that handles routing, composition, and protocol translation for backend services. Common in microservices architectures (Kong, AWS API Gateway).

Sources: API gateway pattern documentation, Kong, AWS API Gateway documentation
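
In this sense the gateway's core job is mapping public routes onto internal services. The following toy sketch shows prefix-based routing with light path translation; the route table and service URLs are invented for illustration.

```python
# Hypothetical route table: public path prefix -> internal service base URL.
ROUTES = {
    "/users": "http://users-svc.internal:8080",
    "/orders": "http://orders-svc.internal:8080",
    "/search": "http://search-svc.internal:8080",
}

def resolve(path: str) -> str | None:
    """Return the backend URL a request should be forwarded to."""
    for prefix, backend in ROUTES.items():
        if path.startswith(prefix):
            # Strip the public prefix before forwarding, a common form
            # of light path/protocol translation at the gateway.
            return backend + path[len(prefix):]
    return None  # no matching route -> the gateway returns 404

assert resolve("/orders/42") == "http://orders-svc.internal:8080/42"
```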

Examples

  • LiteLLM providing unified API across 100+ LLM providers
  • Portkey with automatic fallback from GPT-4 to Claude (the pattern is sketched after this list)
  • Helicone for logging and analytics
  • AWS API Gateway fronting a SageMaker endpoint
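
The fallback behavior in the Portkey example can be approximated with a try-in-order loop. The sketch below expresses the pattern using LiteLLM's completion function; it is not Portkey's actual API, and the model names in the chain are assumptions.

```python
from litellm import completion

# Try models in priority order; on failure, fall through to the next.
FALLBACK_CHAIN = ["gpt-4o", "claude-3-5-sonnet-20240620"]

def complete_with_fallback(messages: list[dict]):
    last_error = None
    for model in FALLBACK_CHAIN:
        try:
            return completion(model=model, messages=messages)
        except Exception as exc:  # provider outage, rate limit, etc.
            last_error = exc
    raise RuntimeError("all providers in the fallback chain failed") from last_error
```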

Counterexamples

Things that might seem like an API gateway but are not:

  • Direct API calls to OpenAI (no gateway)
  • The LLM model itself
  • A simple HTTP proxy without LLM-specific features

Relations

Implementations

Tools and frameworks that implement this concept:

  • LiteLLM
  • Portkey
  • Helicone
  • Kong
  • AWS API Gateway