MATIH Platform is in active MVP development. Documentation reflects current implementation status.

Rate Limiting

The MATIH Platform enforces per-tenant rate limiting to protect shared infrastructure from abuse and to distribute resources fairly across tenants. Rate limits are enforced at the Kong API Gateway, and the limits that apply to each tenant are determined by its tier in the billing service configuration.


Rate Limiting Strategy

Rate limits are applied at two levels:

| Level | Enforcement Point | Scope |
|---|---|---|
| API Gateway | Kong rate-limit plugin | Per-tenant, per-endpoint |
| Service level | Application middleware | Per-tenant, per-resource |

Rate Limits by Tier

| Tier | Requests per Minute | Requests per Hour | Burst Limit |
|---|---|---|---|
| Free | 60 | 1,000 | 10 |
| Professional | 300 | 10,000 | 50 |
| Enterprise | 1,000 | 100,000 | 200 |

Rate Limit Headers

Every API response includes rate limit headers:

| Header | Description |
|---|---|
| X-RateLimit-Limit | Maximum requests allowed in the current window |
| X-RateLimit-Remaining | Remaining requests in the current window |
| X-RateLimit-Reset | Unix timestamp when the current window resets |
| Retry-After | Seconds to wait before retrying (only on 429 responses) |
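A client can use these headers to pace itself instead of retrying blindly. The following is a minimal sketch of that idea, not part of any MATIH SDK: the class `RateLimitHeaders` and the method `suggestedWait` are hypothetical names, and the header map is assumed to use exactly the casing shown in the table.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Map;

// Hypothetical client-side helper; names are illustrative only.
final class RateLimitHeaders {
    private RateLimitHeaders() {}

    /**
     * Returns how long a client should pause before its next request.
     * Assumes header names are keyed with the exact casing from the table.
     */
    static Duration suggestedWait(int status, Map<String, String> headers, Instant now) {
        // On a 429, Retry-After (in seconds) is the most direct signal.
        String retryAfter = headers.get("Retry-After");
        if (status == 429 && retryAfter != null) {
            return Duration.ofSeconds(Long.parseLong(retryAfter.trim()));
        }
        // Otherwise wait only if the current window is already exhausted.
        String remaining = headers.get("X-RateLimit-Remaining");
        String reset = headers.get("X-RateLimit-Reset");
        if (remaining != null && reset != null && Long.parseLong(remaining.trim()) <= 0) {
            Instant resetAt = Instant.ofEpochSecond(Long.parseLong(reset.trim()));
            Duration untilReset = Duration.between(now, resetAt);
            return untilReset.isNegative() ? Duration.ZERO : untilReset;
        }
        return Duration.ZERO;
    }
}
```

Passing `now` as a parameter keeps the helper deterministic, which makes it easy to unit test without touching the system clock.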

Rate Limit Response

When a rate limit is exceeded, the API returns a 429 Too Many Requests status with the following body:

{
  "success": false,
  "error": {
    "code": "RATE_LIMITED",
    "category": "CLIENT_ERROR",
    "message": "Rate limit exceeded. Try again in 30 seconds.",
    "details": {
      "limit": 300,
      "remaining": 0,
      "resetAt": "2026-02-12T10:31:00Z",
      "retryAfter": 30
    }
  }
}
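In the example above, `retryAfter` is the number of seconds between the time of the request and `resetAt`. The sketch below shows one way the `details` object could be derived; `RateLimitErrors` and its `details` method are hypothetical, and rounding up to whole seconds is an assumption, not documented behaviour.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper; the real service may build this payload differently.
final class RateLimitErrors {
    private RateLimitErrors() {}

    /** Builds the "details" object of a RATE_LIMITED error. */
    static Map<String, Object> details(int limit, Instant resetAt, Instant now) {
        // Assumption: round up so a client that waits retryAfter seconds never arrives early.
        long millis = Math.max(0, Duration.between(now, resetAt).toMillis());
        long retryAfter = (millis + 999) / 1000;

        Map<String, Object> details = new LinkedHashMap<>();
        details.put("limit", limit);
        details.put("remaining", 0);                // always 0 once the limit is exceeded
        details.put("resetAt", resetAt.toString()); // ISO-8601 timestamp in UTC
        details.put("retryAfter", retryAfter);
        return details;
    }
}
```

With `now` at 2026-02-12T10:30:30Z and `resetAt` at 2026-02-12T10:31:00Z, this yields `retryAfter` of 30, consistent with the sample response.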

Per-Endpoint Rate Limits

Some endpoints have specific rate limits in addition to the global tenant limit:

| Endpoint Category | Additional Limit | Rationale |
|---|---|---|
| Authentication (/auth/login) | 10 per minute per IP | Brute-force prevention |
| AI chat (/ai/chat) | 30 per minute | LLM resource protection |
| Query execution (/query/execute) | 100 per minute | Database load protection |
| Export (/bi/export) | 10 per minute | Rendering resource protection |
| Bulk operations (/*/bulk) | 5 per minute | Resource-intensive operations |
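Because these limits apply on top of the tenant limit, a request has to fit within both. The sketch below illustrates one way to look up the additional limit for a path and combine it with a tier limit. `EndpointLimits` is a hypothetical class, the `*` wildcard is assumed to match exactly one path segment, paths are assumed to have any `/api/v1` prefix already stripped, and the per-IP scope of the login limit is not modelled.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.OptionalInt;
import java.util.regex.Pattern;

// Hypothetical lookup table mirroring the per-endpoint limits above.
final class EndpointLimits {
    private static final Map<Pattern, Integer> LIMITS = new LinkedHashMap<>();

    static {
        register("/auth/login", 10);    // documented as per IP; scope not modelled here
        register("/ai/chat", 30);
        register("/query/execute", 100);
        register("/bi/export", 10);
        register("/*/bulk", 5);
    }

    private EndpointLimits() {}

    private static void register(String glob, int perMinute) {
        // Quote literal parts; assume "*" matches exactly one path segment.
        String[] parts = glob.split("\\*", -1);
        StringBuilder regex = new StringBuilder();
        for (int i = 0; i < parts.length; i++) {
            if (i > 0) {
                regex.append("[^/]+");
            }
            regex.append(Pattern.quote(parts[i]));
        }
        LIMITS.put(Pattern.compile(regex.toString()), perMinute);
    }

    /** Additional per-minute limit for a path, if the path matches any category. */
    static OptionalInt perMinuteLimit(String path) {
        for (Map.Entry<Pattern, Integer> entry : LIMITS.entrySet()) {
            if (entry.getKey().matcher(path).matches()) {
                return OptionalInt.of(entry.getValue());
            }
        }
        return OptionalInt.empty();
    }

    /** Upper bound on calls per minute to one endpoint: the stricter of the two limits. */
    static int ceilingPerMinute(int tenantPerMinute, String path) {
        OptionalInt extra = perMinuteLimit(path);
        return extra.isPresent() ? Math.min(tenantPerMinute, extra.getAsInt()) : tenantPerMinute;
    }
}
```

For example, a Free tenant (60 per minute) calling `/query/execute` is still capped at 60, while a Professional tenant (300 per minute) calling `/bi/export` is capped at 10.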

Implementation

Gateway Level (Kong)

The Kong rate-limit plugin uses Redis as a backing store for rate limit counters:

Request arrives at Kong Gateway
  |
  v
Extract tenant_id from JWT claims
  |
  v
Check Redis counter: rate_limit:{tenant_id}:{window}
  |
  +-- Counter < limit --> Increment counter, forward request
  |
  +-- Counter >= limit --> Return 429 Too Many Requests
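The flow above is a fixed-window counter. As a rough illustration of the technique, the sketch below keeps the counters in memory instead of Redis. It is not Kong's plugin code: `FixedWindowLimiter` is a hypothetical class, and it increments before comparing (rather than checking first, as in the diagram) so that concurrent requests cannot both pass a stale check.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// In-memory stand-in for the Redis counters; illustrative only.
final class FixedWindowLimiter {
    private final Map<String, AtomicLong> counters = new ConcurrentHashMap<>();
    private final long limit;
    private final long windowSeconds;

    FixedWindowLimiter(long limit, long windowSeconds) {
        this.limit = limit;
        this.windowSeconds = windowSeconds;
    }

    /** Returns true if the request may be forwarded, false if it should receive a 429. */
    boolean tryAcquire(String tenantId, long epochSeconds) {
        // All requests in the same window share one counter key.
        long window = epochSeconds / windowSeconds;
        String key = "rate_limit:" + tenantId + ":" + window;
        AtomicLong counter = counters.computeIfAbsent(key, k -> new AtomicLong());
        // Atomic increment, then compare against the limit.
        return counter.incrementAndGet() <= limit;
    }
}
```

This sketch never evicts counters for past windows, so a long-running process would need to expire old keys. A fixed window also allows up to twice the limit across a window boundary, which is a known trade-off of the technique.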

Service Level

Services can enforce additional rate limits using the RateLimiter from commons-java, applied here through the @RateLimited annotation:

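// Allow at most 30 requests per 60-second window for each tenant,
// matching the AI chat limit in the per-endpoint table.
@RateLimited(
    key = "tenant_id",
    limit = 30,
    window = 60,
    unit = TimeUnit.SECONDS
)
@PostMapping("/api/v1/ai/chat")
public ResponseEntity<ChatResponse> chat(@RequestBody ChatRequest request) {
    // Handler code
}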

Quota Management

The billing service tracks cumulative usage against monthly quotas:

| Metric | Free | Professional | Enterprise |
|---|---|---|---|
| AI tokens per month | 10,000 | 500,000 | Unlimited |
| Queries per month | 1,000 | 50,000 | Unlimited |
| Storage (GB) | 1 | 50 | Custom |
| API calls per month | 10,000 | 100,000 | Unlimited |

When a quota is reached, the service returns a 402 Payment Required response with instructions to upgrade.
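Unlike a rate limit, a quota does not reset after a short window, which is why it maps to 402 rather than 429. The sketch below shows the check for the AI token quota only. `QuotaCheck` and its nested `Tier` enum are hypothetical names, not billing service code, and "Unlimited" is modelled as an empty `OptionalLong`.

```java
import java.util.OptionalLong;

// Hypothetical quota check using the AI token row of the table above.
final class QuotaCheck {
    enum Tier {
        FREE(OptionalLong.of(10_000)),
        PROFESSIONAL(OptionalLong.of(500_000)),
        ENTERPRISE(OptionalLong.empty()); // empty means "Unlimited"

        final OptionalLong aiTokensPerMonth;

        Tier(OptionalLong aiTokensPerMonth) {
            this.aiTokensPerMonth = aiTokensPerMonth;
        }
    }

    private QuotaCheck() {}

    /** Returns 402 once cumulative usage has reached the monthly quota, otherwise 200. */
    static int statusFor(Tier tier, long tokensUsedThisMonth) {
        OptionalLong quota = tier.aiTokensPerMonth;
        if (quota.isPresent() && tokensUsedThisMonth >= quota.getAsLong()) {
            return 402; // Payment Required: tenant must upgrade
        }
        return 200;
    }
}
```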


Monitoring

Rate limiting metrics are exposed through Prometheus:

| Metric | Type | Description |
|---|---|---|
| kong_rate_limit_total | Counter | Total rate limit checks |
| kong_rate_limit_exceeded_total | Counter | Total rate limit rejections |
| matih_tenant_api_calls_total | Counter | Per-tenant API call count |

Alerts are configured for:

  • Sustained rate limit rejections (indicates a tenant may need a higher tier)
  • Unusual spike in API calls (potential abuse or misconfigured client)
