API Gateway
The API Gateway is the single point of entry for all external traffic into the MATIH platform. Built on Kong 3.5.0, it handles request routing, JWT authentication, tenant context injection, rate limiting, and security filtering. Every request from a browser, CLI tool, or external API client passes through this gateway before reaching any backend service.
Gateway Architecture
+---------------------------+
| Internet |
+------------+--------------+
|
+------------v--------------+
| Load Balancer (L4) |
+------------+--------------+
|
+------------v--------------+
| Kong API Gateway |
| (Port 8080) |
| |
| +---------------------+ |
| | Custom Lua Plugins | |
| | - JWT Extraction | |
| | - Tenant Rate Limit | |
| | - Request Validation| |
| +---------------------+ |
| |
+---+----+----+----+----+---+
| | | | |
v v v v v
Backend Services (REST APIs)
Kong 3.5.0 Configuration
The gateway is deployed via Helm chart with the following configuration:
| Parameter | Value | Purpose |
|---|---|---|
| Database mode | DB-less (declarative) | Configuration via YAML, no external database dependency |
| Proxy port | 8080 | Main traffic entry point |
| Admin port | 8444 (internal only) | Management API (not exposed externally) |
| Status port | 8100 | Health check endpoint |
| Worker count | Auto (based on CPU cores) | NGINX worker processes |
| SSL termination | At load balancer or ingress | Gateway handles HTTP internally |
Declarative Configuration
Kong runs in DB-less mode, meaning all routing and plugin configuration is declared in a YAML file rather than stored in a database. This approach provides:
- Immutability: Configuration is version-controlled alongside application code
- Reproducibility: Every environment gets the exact same gateway configuration
- Speed: No database dependency means faster startup and failover
- GitOps compatibility: Configuration changes go through the same PR review process as code
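To make the declarative approach concrete, the sketch below builds a configuration document in the general shape Kong expects in DB-less mode (`_format_version`, `services`, nested `routes`). It is only an illustration: the `ROUTES` subset, the route names, and the helper `build_declarative_config` are hypothetical, and the output is printed as JSON, which YAML parsers also accept. The actual MATIH configuration file may be organized differently.

```python
import json

# Hypothetical subset of the routing table; hostnames follow the
# Kubernetes DNS pattern {service}.{namespace}.svc.cluster.local:{port}.
ROUTES = [
    ("iam-service", "matih-control-plane", 8081, ["/api/v1/auth", "/api/v1/users"]),
    ("ai-service", "matih-data-plane", 8000, ["/api/v1/ai"]),
]


def build_declarative_config(routes):
    """Return a dict shaped like a Kong DB-less declarative config."""
    services = []
    for name, namespace, port, paths in routes:
        services.append({
            "name": name,
            "url": f"http://{name}.{namespace}.svc.cluster.local:{port}",
            # Route names are illustrative, not taken from the real config.
            "routes": [{"name": f"{name}-routes", "paths": paths}],
        })
    return {"_format_version": "3.0", "services": services}


config = build_declarative_config(ROUTES)
print(json.dumps(config, indent=2))
```

Because the whole configuration is plain data, it can be diffed and reviewed in a pull request like any other file.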
Custom Lua Plugins
The gateway uses three custom Lua plugins that extend Kong's built-in functionality:
1. JWT Claims Extraction Plugin
This plugin extracts tenant_id, user_id, and roles from the validated JWT and injects them as HTTP headers for downstream services:
Behavior:
Incoming Request:
Authorization: Bearer eyJhbGci...
After Plugin Execution:
Authorization: Bearer eyJhbGci... (preserved)
X-Tenant-ID: acme-corp (extracted from JWT)
X-User-ID: user-123 (extracted from JWT)
X-User-Roles: data_analyst,viewer (extracted from JWT)
X-Request-ID: req-abc-456 (generated if missing)
X-Correlation-ID: cor-def-789 (generated if missing)
Why a custom plugin? Kong's built-in JWT plugin validates tokens but does not extract custom claims into headers. The MATIH platform needs tenant_id propagated as a header so that backend services can establish tenant context without re-parsing the JWT.
Processing Steps:
- Validate the JWT signature against the signing key from IAM service
- Check token expiration and issuer claims
- Extract tenant_id from the tenant_id claim
- Extract sub (user ID) from the standard subject claim
- Extract roles from the roles claim
- Set headers: X-Tenant-ID, X-User-ID, X-User-Roles
- Generate X-Request-ID and X-Correlation-ID if not present in the request
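The processing steps above can be sketched in a few lines. The real plugin is written in Lua and runs inside Kong; this Python version only illustrates the logic. It assumes an HS256 (shared-secret) signature, which the document does not specify, and the function names, the `req-`/`cor-` ID formats, and the demo key are hypothetical.

```python
import base64
import hashlib
import hmac
import json
import time
import uuid


def b64url_decode(segment: str) -> bytes:
    """Decode a base64url segment, restoring the padding that JWTs strip."""
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def b64url_encode(raw: bytes) -> str:
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()


def sign_hs256(claims: dict, key: bytes) -> str:
    """Demo helper: build an HS256-signed token (normally the IAM service's job)."""
    header = b64url_encode(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = b64url_encode(json.dumps(claims).encode())
    sig = hmac.new(key, f"{header}.{payload}".encode(), hashlib.sha256).digest()
    return f"{header}.{payload}.{b64url_encode(sig)}"


def extract_headers(token, key, issuer, incoming=None, now=None):
    """Validate the token, then map its claims to downstream headers."""
    incoming = dict(incoming or {})
    now = time.time() if now is None else now
    header_b64, payload_b64, sig_b64 = token.split(".")

    # Validate the signature (and pin the algorithm) before trusting any claim.
    if json.loads(b64url_decode(header_b64)).get("alg") != "HS256":
        raise ValueError("unexpected algorithm")
    expected = hmac.new(key, f"{header_b64}.{payload_b64}".encode(), hashlib.sha256).digest()
    if not hmac.compare_digest(expected, b64url_decode(sig_b64)):
        raise ValueError("invalid signature")

    # Check expiration and issuer.
    claims = json.loads(b64url_decode(payload_b64))
    if claims.get("exp", 0) <= now:
        raise ValueError("token expired")
    if claims.get("iss") != issuer:
        raise ValueError("unexpected issuer")

    # Extract claims and generate tracing IDs only when the client sent none.
    return {
        "X-Tenant-ID": claims["tenant_id"],
        "X-User-ID": claims["sub"],
        "X-User-Roles": ",".join(claims.get("roles", [])),
        "X-Request-ID": incoming.get("X-Request-ID") or f"req-{uuid.uuid4()}",
        "X-Correlation-ID": incoming.get("X-Correlation-ID") or f"cor-{uuid.uuid4()}",
    }


KEY = b"demo-signing-key"  # hypothetical; real keys come from the IAM service
token = sign_hs256(
    {"iss": "iam-service", "sub": "user-123", "tenant_id": "acme-corp",
     "roles": ["data_analyst", "viewer"], "exp": 2_000_000_000},
    KEY,
)
headers = extract_headers(token, KEY, issuer="iam-service", now=1_700_000_000)
print(headers["X-Tenant-ID"], headers["X-User-Roles"])
```

Validation happens before extraction so that no header is ever derived from an unverified claim.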
2. Tenant Rate Limiting Plugin
This plugin enforces per-tenant rate limits that are configurable through the config-service:
Rate Limit Tiers:
| Tenant Tier | Requests/Minute | Requests/Hour | Burst Limit |
|---|---|---|---|
| Free | 60 | 1,000 | 10 |
| Professional | 600 | 20,000 | 100 |
| Enterprise | 6,000 | 200,000 | 1,000 |
| Custom | Configurable | Configurable | Configurable |
Rate Limit Headers:
The plugin adds standard rate limit headers to every response:
X-RateLimit-Limit: 600 # Total allowed requests
X-RateLimit-Remaining: 542 # Remaining requests in window
X-RateLimit-Reset: 1707700800 # Window reset timestamp (epoch)
When a tenant exceeds their limit, the gateway returns:
HTTP/1.1 429 Too Many Requests
Content-Type: application/json
Retry-After: 45
{
"error": "Rate limit exceeded",
"message": "Tenant 'acme-corp' has exceeded 600 requests/minute limit",
"retryAfter": 45
}
Implementation Details:
- Rate counters are stored in Redis using a sliding window algorithm
- Counters are keyed by ratelimit:{tenant_id}:{window}
- The plugin determines the tenant tier by reading the X-Tenant-ID header and querying the config-service cache
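A minimal sketch of a sliding-window limiter keyed the same way is shown below. It uses the sliding-window-counter approximation (the current window's count plus a weighted share of the previous window's), which is one common reading of "sliding window" and may not match the plugin's exact algorithm. A Python dict stands in for Redis, only the per-minute limits are modeled (not the hourly or burst limits), and the class and method names are hypothetical.

```python
import math

# Requests-per-minute limits from the tier table above.
TIER_LIMITS = {"free": 60, "professional": 600, "enterprise": 6000}
WINDOW_SECONDS = 60


class SlidingWindowLimiter:
    """Sliding-window-counter limiter; a dict stands in for Redis."""

    def __init__(self):
        self.counters = {}

    @staticmethod
    def key(tenant_id, window):
        # Same key shape as described above: ratelimit:{tenant_id}:{window}
        return f"ratelimit:{tenant_id}:{window}"

    def check(self, tenant_id, tier, now):
        limit = TIER_LIMITS[tier]
        window = int(now // WINDOW_SECONDS)
        elapsed = now - window * WINDOW_SECONDS
        current = self.counters.get(self.key(tenant_id, window), 0)
        previous = self.counters.get(self.key(tenant_id, window - 1), 0)
        # Count the share of the previous window that the sliding window still covers.
        estimated = current + previous * (1 - elapsed / WINDOW_SECONDS)
        reset = (window + 1) * WINDOW_SECONDS
        headers = {"X-RateLimit-Limit": str(limit), "X-RateLimit-Reset": str(reset)}

        if estimated >= limit:
            # Simplification: tell the client to retry at the next window boundary.
            retry_after = max(1, math.ceil(reset - now))
            headers.update({"X-RateLimit-Remaining": "0", "Retry-After": str(retry_after)})
            body = {
                "error": "Rate limit exceeded",
                "message": f"Tenant '{tenant_id}' has exceeded {limit} requests/minute limit",
                "retryAfter": retry_after,
            }
            return {"status": 429, "headers": headers, "body": body}

        # Allowed: record the request and report what is left.
        self.counters[self.key(tenant_id, window)] = current + 1
        headers["X-RateLimit-Remaining"] = str(max(0, int(limit - estimated - 1)))
        return {"status": 200, "headers": headers}


limiter = SlidingWindowLimiter()
first = limiter.check("acme-corp", "professional", now=1_699_999_980)
print(first["status"], first["headers"]["X-RateLimit-Remaining"])
```

In a Redis-backed version the read and increment would need to be atomic (for example via a server-side script), which this in-memory sketch does not address.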
3. Request Validation Plugin
This plugin performs input validation at the gateway level before requests reach backend services:
Validations Performed:
| Check | Action on Failure |
|---|---|
| Content-Type header present on POST/PUT | 400 Bad Request |
| Content-Length within limits (10MB max) | 413 Payload Too Large |
| Path traversal patterns (../, %2e%2e) | 400 Bad Request |
| SQL injection patterns in query parameters | 400 Bad Request |
| Null bytes in URL or headers | 400 Bad Request |
| Required headers present (Authorization on protected routes) | 401 Unauthorized |
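The checks in the table can be expressed as a single function that returns the first failing status. This is a sketch under assumptions: the order of the checks, the function signature, and especially the SQL injection pattern (a deliberately tiny, hypothetical set) are illustrative and are not taken from the actual plugin.

```python
import re
from urllib.parse import unquote_plus

MAX_BODY_BYTES = 10 * 1024 * 1024  # 10MB limit from the table
PATH_TRAVERSAL = re.compile(r"\.\./|%2e%2e", re.IGNORECASE)
# Hypothetical, intentionally small pattern set; real rules would be broader.
SQL_INJECTION = re.compile(
    r"\bunion\s+select\b|\bor\s+1\s*=\s*1\b|;\s*drop\s+table\b", re.IGNORECASE
)


def validate_request(method, path, query, headers, protected=True):
    """Return (status, reason); status 200 means the request may proceed."""
    headers_lc = {k.lower(): v for k, v in headers.items()}

    # Null bytes in the URL or in any header value.
    if "\x00" in path or "%00" in path or any("\x00" in v for v in headers.values()):
        return 400, "null byte"
    if PATH_TRAVERSAL.search(path):
        return 400, "path traversal"
    # Decode the query string first so encoded payloads are also caught.
    if SQL_INJECTION.search(unquote_plus(query)):
        return 400, "sql injection pattern"
    if method in ("POST", "PUT") and "content-type" not in headers_lc:
        return 400, "missing Content-Type"
    # Malformed Content-Length values are ignored here for brevity.
    length = headers_lc.get("content-length", "0")
    if length.isdigit() and int(length) > MAX_BODY_BYTES:
        return 413, "payload too large"
    if protected and "authorization" not in headers_lc:
        return 401, "missing Authorization"
    return 200, "ok"


print(validate_request("GET", "/api/v1/query/run", "", {"Authorization": "Bearer x"}))
```

Pattern-based SQL injection filtering at the gateway is a coarse first filter; backend services still need parameterized queries.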
Routing Configuration
The gateway routes requests to backend services based on URL path prefixes:
Control Plane Routes
| Path Prefix | Upstream Service | Port | Authentication |
|---|---|---|---|
| /api/v1/auth/login | iam-service | 8081 | None (public) |
| /api/v1/auth/register | iam-service | 8081 | None (public) |
| /api/v1/auth/refresh | iam-service | 8081 | Refresh token |
| /api/v1/auth/* | iam-service | 8081 | JWT required |
| /api/v1/users/* | iam-service | 8081 | JWT required |
| /api/v1/tenants/* | tenant-service | 8082 | JWT required |
| /api/v1/config/* | config-service | 8888 | JWT required |
| /api/v1/notifications/* | notification-service | 8085 | JWT required |
| /api/v1/audit/* | audit-service | 8086 | JWT required |
| /api/v1/billing/* | billing-service | 8087 | JWT required |
| /api/v1/observability/* | observability-api | 8088 | JWT required |
| /api/v1/infrastructure/* | infrastructure-service | 8089 | JWT required |
| /api/v1/registry/* | platform-registry | 8084 | JWT required |
Data Plane Routes
| Path Prefix | Upstream Service | Port | Authentication |
|---|---|---|---|
| /api/v1/query/* | query-engine | 8080 | JWT required |
| /api/v1/catalog/* | catalog-service | 8086 | JWT required |
| /api/v1/semantic/* | semantic-layer | 8086 | JWT required |
| /api/v1/bi/* | bi-service | 8084 | JWT required |
| /api/v1/ai/* | ai-service | 8000 | JWT required |
| /api/v1/ai/chat/stream | ai-service | 8000 | JWT required (SSE) |
| /api/v1/ml/* | ml-service | 8000 | JWT required |
| /api/v1/pipelines/* | pipeline-service | 8092 | JWT required |
| /api/v1/quality/* | data-quality-service | 8000 | JWT required |
| /api/v1/ontology/* | ontology-service | 8101 | JWT required |
| /api/v1/governance/* | governance-service | 8080 | JWT required |
| /api/v1/render/* | render-service | 8098 | JWT required |
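Several entries in the route tables overlap: /api/v1/auth/login sits under /api/v1/auth/*, and /api/v1/ai/chat/stream sits under /api/v1/ai/*. The more specific entry has to win for the authentication column to make sense. The sketch below shows that idea as a longest-prefix match over a handful of the routes listed above. Kong's router applies its own priority rules, so this is an illustration of the principle rather than a description of Kong's matching; `match_route` and the short authentication labels are hypothetical.

```python
# (path prefix, upstream service, port, authentication); a subset of the tables above.
ROUTES = [
    ("/api/v1/auth/login", "iam-service", 8081, "none"),
    ("/api/v1/auth", "iam-service", 8081, "jwt"),
    ("/api/v1/ai/chat/stream", "ai-service", 8000, "jwt-sse"),
    ("/api/v1/ai", "ai-service", 8000, "jwt"),
    ("/api/v1/query", "query-engine", 8080, "jwt"),
]


def match_route(path):
    """Return the route with the longest matching prefix, or None."""
    candidates = [
        route for route in ROUTES
        # Match on whole path segments so /api/v1/aix does not hit /api/v1/ai.
        if path == route[0] or path.startswith(route[0] + "/")
    ]
    return max(candidates, key=lambda route: len(route[0]), default=None)


print(match_route("/api/v1/auth/login"))
print(match_route("/api/v1/ai/models"))
```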
Public Routes (No Authentication)
A small set of routes is accessible without a JWT access token (the refresh endpoint expects a refresh token instead):
| Path | Service | Purpose |
|---|---|---|
| /api/v1/auth/login | iam-service | User login |
| /api/v1/auth/register | iam-service | User registration |
| /api/v1/auth/refresh | iam-service | Token refresh |
| /health | Kong | Gateway health check |
| /status | Kong | Gateway status |
Upstream Service Discovery
The gateway resolves upstream services using Kubernetes DNS:
# Service resolution pattern
{service-name}.{namespace}.svc.cluster.local:{port}
# Examples
iam-service.matih-control-plane.svc.cluster.local:8081
ai-service.matih-data-plane.svc.cluster.local:8000
query-engine.matih-data-plane.svc.cluster.local:8080
Kong's upstream health checks verify that backend services are responsive:
| Parameter | Value |
|---|---|
| Health check interval | 10 seconds |
| Healthy threshold | 2 consecutive successes |
| Unhealthy threshold | 3 consecutive failures |
| Timeout | 5 seconds |
| Health check path | Service-specific health endpoint |
When a service is marked unhealthy, Kong removes it from the load balancer pool and stops routing traffic to it until it recovers.
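The threshold behavior can be modeled as a small state machine: a target flips to unhealthy after 3 consecutive failed probes and back to healthy after 2 consecutive successful ones. The sketch below is a simplified model of that rule using the values from the table; the `TargetHealth` class is hypothetical and does not reproduce Kong's health checker.

```python
from dataclasses import dataclass

HEALTHY_THRESHOLD = 2    # consecutive successes needed to recover
UNHEALTHY_THRESHOLD = 3  # consecutive failures needed to be removed


@dataclass
class TargetHealth:
    """Tracks one upstream target's health from a stream of probe results."""
    healthy: bool = True
    successes: int = 0
    failures: int = 0

    def record(self, ok: bool) -> bool:
        """Record one probe result and return the resulting health state."""
        if ok:
            self.successes += 1
            self.failures = 0  # a success breaks any failure streak
            if not self.healthy and self.successes >= HEALTHY_THRESHOLD:
                self.healthy = True
        else:
            self.failures += 1
            self.successes = 0  # a failure breaks any success streak
            if self.healthy and self.failures >= UNHEALTHY_THRESHOLD:
                self.healthy = False
        return self.healthy


target = TargetHealth()
states = [target.record(ok) for ok in (False, False, False, True, True)]
print(states)
```

Requiring consecutive results in both directions keeps a single slow or failed probe from flapping a service in and out of the pool.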
Security Headers
The gateway adds security headers to all responses:
X-Content-Type-Options: nosniff
X-Frame-Options: DENY
X-XSS-Protection: 1; mode=block
Strict-Transport-Security: max-age=31536000; includeSubDomains
Content-Security-Policy: default-src 'self'
Referrer-Policy: strict-origin-when-cross-origin
These headers are set at the gateway level as the first line of defense. Backend services in the Control Plane add complementary headers via the SecurityFilter, providing defense-in-depth.
CORS Configuration
The gateway handles Cross-Origin Resource Sharing (CORS) for all browser-based clients:
Access-Control-Allow-Origin: {configured origins}
Access-Control-Allow-Methods: GET, POST, PUT, DELETE, PATCH, OPTIONS
Access-Control-Allow-Headers: Authorization, Content-Type, X-Request-ID,
X-Correlation-ID, X-Tenant-ID
Access-Control-Max-Age: 3600
Access-Control-Allow-Credentials: true
CORS origins are environment-specific:
- Development: http://localhost:3000 through http://localhost:3005
- Production: Configured per-tenant domains (e.g., https://acme.matih.ai)
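Because Access-Control-Allow-Credentials is true, browsers reject a wildcard origin, so the gateway has to echo back the specific origin that matched the allow list. A minimal sketch of that decision is shown below. The function name, the `environment` values, and the `tenant_origins` parameter are hypothetical; the header values come from the configuration above.

```python
import re

# Development origins: http://localhost:3000 through http://localhost:3005
DEV_ORIGIN = re.compile(r"http://localhost:300[0-5]")


def cors_headers(origin, environment, tenant_origins=frozenset()):
    """Return CORS response headers for an allowed origin, or {} otherwise."""
    if environment == "development":
        allowed = DEV_ORIGIN.fullmatch(origin) is not None
    else:
        allowed = origin in tenant_origins
    if not allowed:
        return {}  # no CORS headers: the browser blocks the cross-origin read
    return {
        # Echo the exact origin; "*" is not permitted alongside credentials.
        "Access-Control-Allow-Origin": origin,
        "Access-Control-Allow-Credentials": "true",
        "Access-Control-Allow-Methods": "GET, POST, PUT, DELETE, PATCH, OPTIONS",
        "Access-Control-Allow-Headers": (
            "Authorization, Content-Type, X-Request-ID, X-Correlation-ID, X-Tenant-ID"
        ),
        "Access-Control-Max-Age": "3600",
        # Not in the listing above; added so caches keep per-origin responses apart.
        "Vary": "Origin",
    }


print(cors_headers("http://localhost:3003", "development")["Access-Control-Allow-Origin"])
```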
WebSocket and SSE Support
The gateway supports two streaming protocols:
Server-Sent Events (SSE)
Used by the AI service for streaming chat responses:
Client --> GET /api/v1/ai/chat/stream
Accept: text/event-stream
Gateway:
- Validates JWT
- Extracts tenant context
- Proxies to ai-service:8000 with streaming enabled
- Disables response buffering
- Sets proxy_read_timeout to 300s for long conversations
WebSocket Upgrade
Used for real-time dashboard updates:
Client --> GET /api/v1/bi/ws
Connection: Upgrade
Upgrade: websocket
Gateway:
- Validates JWT before upgrade
- Upgrades connection
- Proxies WebSocket frames bidirectionally
- Maintains connection until client disconnects or timeout
Request Lifecycle Through the Gateway
A typical authenticated request flows through the gateway in the following order:
1. Request arrives at Kong (port 8080)
2. [Request Validation Plugin] Validate input, check for attacks
3. [JWT Claims Extraction Plugin] Validate JWT, extract tenant context
4. [Tenant Rate Limiting Plugin] Check rate limits for tenant
5. [Route Matching] Find upstream service based on URL path
6. [Upstream Health Check] Verify target service is healthy
7. [Proxy] Forward request with injected headers
8. [Response Processing] Add security headers, strip internal headers
9. Return response to client
Average latency added by the gateway: 2-5ms for authenticated requests, 1-2ms for public routes.
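One consequence of this ordering is that an early stage can reject a request before later stages run: a request that fails validation never reaches JWT processing or rate limiting. The sketch below models the lifecycle as an ordered chain of stage functions that short-circuits on the first response. The stage bodies are stubs and all names are hypothetical; the sketch only illustrates the control flow, not the plugins themselves.

```python
def request_validation(req):
    # Stub for the validation plugin: reject an obvious traversal attempt.
    return {"status": 400} if "../" in req["path"] else None


def jwt_claims_extraction(req):
    # Stub: a real plugin would verify the token before trusting it.
    if "Authorization" not in req["headers"]:
        return {"status": 401}
    req["headers"]["X-Tenant-ID"] = req.get("tenant", "acme-corp")
    return None


def tenant_rate_limiting(req):
    # Stub: 'over_limit' stands in for a real counter lookup.
    return {"status": 429} if req.get("over_limit") else None


def proxy(req):
    # Final stage: forward with the injected headers and return the response.
    return {"status": 200, "forwarded_headers": dict(req["headers"])}


STAGES = [
    ("request_validation", request_validation),
    ("jwt_claims_extraction", jwt_claims_extraction),
    ("tenant_rate_limiting", tenant_rate_limiting),
    ("proxy", proxy),
]


def run_pipeline(req, stages=STAGES):
    """Run stages in order; the first stage that returns a response ends the chain."""
    trace = []
    for name, stage in stages:
        trace.append(name)
        response = stage(req)
        if response is not None:
            return response, trace
    return None, trace


response, trace = run_pipeline(
    {"path": "/api/v1/query/run", "headers": {"Authorization": "Bearer x"}}
)
print(response["status"], trace)
```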
Monitoring and Observability
The gateway exports metrics via Prometheus:
| Metric | Description |
|---|---|
| kong_http_requests_total | Total request count by service, method, status |
| kong_request_latency_ms | Request processing time (histogram) |
| kong_upstream_latency_ms | Upstream response time (histogram) |
| kong_bandwidth_bytes | Request/response bandwidth |
| kong_rate_limiting_total | Rate limit hits by tenant |
These metrics power Grafana dashboards that show:
- Request volume and error rates per service
- Gateway latency breakdown (plugin processing vs. upstream response)
- Rate limit utilization by tenant tier
- Top endpoints by request volume
Related Sections
- Request Lifecycle -- Full end-to-end request flow through the gateway
- Security -- Authentication and authorization architecture
- Control Plane -- The api-gateway service details
- Kubernetes -- Gateway deployment and Helm configuration