Per-Tenant Ingress

Each tenant in the MATIH platform receives a dedicated NGINX ingress controller with its own LoadBalancer IP address. This architecture provides complete network isolation between tenants at the ingress layer, enables per-tenant rate limiting, and supports tenant-specific TLS certificates and custom domains.

Ingress Architecture

Internet
    |
    v
Azure Load Balancer (20.85.123.45)
    |
    v
NGINX Ingress Controller (matih-acme namespace)
    |
    +--- Host: acme.matih.ai/api/* ---------> api-gateway:8080
    +--- Host: bi.acme.matih.ai/* -----------> bi-workbench:3000
    +--- Host: acme.matih.ai/ws/* -----------> ai-service:8000
    +--- Host: data.acme.matih.ai/* ---------> data-workbench:3002

Key Components

Component	Location	Description
`TenantIngressService`	`control-plane/tenant-service/.../service/TenantIngressService.java`	Ingress controller deployment and management
`IngressProvisioner`	`control-plane/tenant-service/.../provisioning/IngressProvisioner.java`	Provisioning phase integration
`RetryableHelmService`	`control-plane/tenant-service/.../service/helm/RetryableHelmService.java`	Helm operations with retry logic
NGINX values template	`infrastructure/helm/ingress-nginx/values-tenant.yaml`	Per-tenant NGINX configuration

Dedicated Ingress Controller Deployment

The `TenantIngressService.deployIngressController()` method deploys a dedicated NGINX ingress controller per tenant using Helm.

Deployment Configuration

# Per-tenant NGINX ingress controller values
controller:
  ingressClassResource:
    name: nginx-acme                              # Unique IngressClass per tenant
    controllerValue: k8s.io/ingress-nginx-acme    # Unique controller identifier
  ingressClass: nginx-acme
  replicaCount: 2                                  # Based on tenant tier
  service:
    type: LoadBalancer
    annotations:
      service.beta.kubernetes.io/azure-load-balancer-health-probe-request-path: /healthz
  admissionWebhooks:
    enabled: false                                 # Disabled for tenant controllers
  resources:
    requests:
      cpu: 100m
      memory: 128Mi
    limits:
      cpu: 500m
      memory: 256Mi

IngressClass Isolation

Each tenant gets a unique IngressClass name following the pattern `nginx-{tenant-slug}`. This prevents one tenant's ingress controller from processing another tenant's Ingress resources:

Tenant	IngressClass	Controller Value
acme	`nginx-acme`	`k8s.io/ingress-nginx-acme`
beta	`nginx-beta`	`k8s.io/ingress-nginx-beta`
gamma	`nginx-gamma`	`k8s.io/ingress-nginx-gamma`
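
The naming pattern above can be sketched as a small helper. This is an illustration only: `TenantIngressNames` and its slug validation are hypothetical and are not taken from `TenantIngressService`.

```java
import java.util.regex.Pattern;

/** Hypothetical helper: derives per-tenant ingress names from a tenant slug. */
final class TenantIngressNames {
    // Lowercase alphanumerics and '-', starting and ending with an alphanumeric.
    private static final Pattern SLUG =
            Pattern.compile("[a-z0-9]([a-z0-9-]*[a-z0-9])?");

    private TenantIngressNames() {}

    /** IngressClass name, e.g. "nginx-acme". */
    static String ingressClass(String slug) {
        return "nginx-" + requireValid(slug);
    }

    /** Controller identifier, e.g. "k8s.io/ingress-nginx-acme". */
    static String controllerValue(String slug) {
        return "k8s.io/ingress-nginx-" + requireValid(slug);
    }

    private static String requireValid(String slug) {
        if (slug == null || !SLUG.matcher(slug).matches()) {
            throw new IllegalArgumentException("Invalid tenant slug: " + slug);
        }
        return slug;
    }

    public static void main(String[] args) {
        System.out.println(ingressClass("acme"));    // nginx-acme
        System.out.println(controllerValue("acme")); // k8s.io/ingress-nginx-acme
    }
}
```

Rejecting slugs with uppercase letters or underscores up front keeps an invalid name from reaching Helm, where the failure would be harder to trace.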

Replica Scaling by Tier

The number of ingress controller replicas is determined by the tenant tier:

Tier	Replicas	Rationale
Free	1	Cost optimization, acceptable single-point risk
Professional	2	High availability with rolling updates
Enterprise	3	Full redundancy across availability zones

The method `getTenantIngressReplicas(tenant)` in `TenantIngressService` determines the replica count based on the tenant's tier.
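
A minimal sketch of the tier-to-replica mapping in the table above. The `TenantTier` enum and `IngressReplicaPolicy` class are hypothetical stand-ins, since the real tenant and tier types are not shown here. Falling back to a single replica for a missing tier is an assumption, not documented behavior.

```java
/** Hypothetical tier model mirroring the replica table. */
enum TenantTier {
    FREE(1), PROFESSIONAL(2), ENTERPRISE(3);

    private final int ingressReplicas;

    TenantTier(int ingressReplicas) {
        this.ingressReplicas = ingressReplicas;
    }

    int ingressReplicas() {
        return ingressReplicas;
    }
}

final class IngressReplicaPolicy {
    private IngressReplicaPolicy() {}

    /** Returns the replica count for a tier; a null tier is treated as FREE (assumption). */
    static int replicasFor(TenantTier tier) {
        return tier == null ? TenantTier.FREE.ingressReplicas() : tier.ingressReplicas();
    }

    public static void main(String[] args) {
        for (TenantTier tier : TenantTier.values()) {
            System.out.println(tier + " -> " + replicasFor(tier));
        }
    }
}
```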

LoadBalancer IP Assignment

After deploying the ingress controller, the `TenantIngressService` polls the Kubernetes API until the LoadBalancer Service reports an external IP.

Polling Logic

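// Assumes the fabric8 Kubernetes client, whose fluent API these calls match.
import io.fabric8.kubernetes.api.model.LoadBalancerIngress;
import io.fabric8.kubernetes.api.model.Service;
import io.fabric8.kubernetes.api.model.ServiceList;
import java.util.List;

public String waitForLoadBalancerIp(String namespace, int maxWaitSeconds) {
    int pollInterval = properties.getIngress()
            .getLoadBalancerPollIntervalSeconds();
    int maxAttempts = maxWaitSeconds / pollInterval;

    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
        // List the ingress controller Services in the tenant namespace.
        ServiceList services = kubernetesClient.services()
                .inNamespace(namespace)
                .withLabel("app.kubernetes.io/name", "ingress-nginx")
                .list();

        for (Service svc : services.getItems()) {
            // Status can be empty until the load balancer is provisioned.
            if (svc.getStatus() == null
                    || svc.getStatus().getLoadBalancer() == null) {
                continue;
            }
            List<LoadBalancerIngress> ingresses =
                svc.getStatus().getLoadBalancer().getIngress();
            if (ingresses != null && !ingresses.isEmpty()) {
                String ip = ingresses.get(0).getIp();
                if (ip != null && !ip.isBlank()) {
                    return ip;
                }
            }
        }

        try {
            Thread.sleep(pollInterval * 1000L);
        } catch (InterruptedException e) {
            // Restore the interrupt flag and stop waiting.
            Thread.currentThread().interrupt();
            throw new RuntimeException(
                "Interrupted while waiting for LoadBalancer IP", e);
        }
    }

    throw new RuntimeException(
        "LoadBalancer IP not assigned within " + maxWaitSeconds + " seconds");
}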

Timeout Configuration

Property	Default	Description
`matih.azure.ingress.load-balancer-poll-interval-seconds`	10	Polling interval
`matih.azure.ingress.load-balancer-max-wait-seconds`	600	Maximum wait time
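
With these defaults, `waitForLoadBalancerIp` makes 600 / 10 = 60 attempts. Because the attempt count comes from integer division, a maximum wait shorter than the poll interval yields zero attempts, and a poll interval of zero fails with a division error. The sketch below shows one way to guard against both. `PollBudget` is a hypothetical helper, and the floor of one attempt is an added assumption rather than existing behavior.

```java
/** Hypothetical helper that turns the two timeout properties into an attempt count. */
final class PollBudget {
    private PollBudget() {}

    /**
     * Number of polling attempts for the given budget.
     * Always at least 1 so the Service is checked once even with a tiny budget.
     */
    static int maxAttempts(int maxWaitSeconds, int pollIntervalSeconds) {
        if (pollIntervalSeconds <= 0) {
            throw new IllegalArgumentException("poll interval must be positive");
        }
        if (maxWaitSeconds < 0) {
            throw new IllegalArgumentException("max wait must not be negative");
        }
        return Math.max(1, maxWaitSeconds / pollIntervalSeconds);
    }

    public static void main(String[] args) {
        System.out.println(maxAttempts(600, 10)); // 60, the documented defaults
        System.out.println(maxAttempts(25, 10));  // 2, the remainder is dropped
        System.out.println(maxAttempts(5, 10));   // 1, floor applied
    }
}
```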

Ingress Resource Creation

After the ingress controller is running and has an external IP, `TenantIngressService` creates the Kubernetes Ingress resources that define the tenant's routing rules.

Standard Routing Rules

Host Pattern	Path	Backend Service	Port
`acme.matih.ai`	`/api/*`	api-gateway	8080
`acme.matih.ai`	`/ws/*`	ai-service	8000
`bi.acme.matih.ai`	`/*`	bi-workbench	3000
`data.acme.matih.ai`	`/*`	data-workbench	3002
`ml.acme.matih.ai`	`/*`	ml-workbench	3001
`agentic.acme.matih.ai`	`/*`	agentic-workbench	3003
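
The routing table can be expressed as data derived from the tenant slug. The following is a sketch under assumptions: `TenantRoute` and `StandardRoutes` are hypothetical names, and the path strings are copied from the table as written rather than converted to Kubernetes `pathType` rules.

```java
import java.util.List;

/** One routing rule: requests for host + path are sent to service:port. */
record TenantRoute(String host, String path, String service, int port) {}

final class StandardRoutes {
    private StandardRoutes() {}

    /** Builds the standard rules for a tenant, e.g. slug "acme" and base domain "matih.ai". */
    static List<TenantRoute> forTenant(String slug, String baseDomain) {
        String root = slug + "." + baseDomain;
        return List.of(
            new TenantRoute(root, "/api/*", "api-gateway", 8080),
            new TenantRoute(root, "/ws/*", "ai-service", 8000),
            new TenantRoute("bi." + root, "/*", "bi-workbench", 3000),
            new TenantRoute("data." + root, "/*", "data-workbench", 3002),
            new TenantRoute("ml." + root, "/*", "ml-workbench", 3001),
            new TenantRoute("agentic." + root, "/*", "agentic-workbench", 3003));
    }

    public static void main(String[] args) {
        forTenant("acme", "matih.ai").forEach(r ->
            System.out.println(r.host() + r.path() + " -> " + r.service() + ":" + r.port()));
    }
}
```

Keeping the rules in one list makes it straightforward to check that every tenant receives the same set of hosts and backends.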

TLS Configuration

TLS is terminated at the ingress controller using certificates issued by cert-manager:

spec:
  tls:
    - hosts:
        - acme.matih.ai
        - "*.acme.matih.ai"
      secretName: acme-tls-certificate

The TLS certificate lists the root domain and a wildcard SAN, so it covers `acme.matih.ai` and every first-level subdomain such as `bi.acme.matih.ai`.
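
Both entries are needed because a wildcard name matches exactly one DNS label: `*.acme.matih.ai` matches `bi.acme.matih.ai` but not the bare `acme.matih.ai`. The sketch below illustrates that matching rule. `TlsHostCoverage` is a hypothetical helper for checking a host list, not part of cert-manager or the platform code.

```java
import java.util.List;
import java.util.Locale;

/** Hypothetical helper: checks whether a hostname is covered by a list of certificate hosts. */
final class TlsHostCoverage {
    private TlsHostCoverage() {}

    static boolean covers(List<String> certHosts, String host) {
        String h = host.toLowerCase(Locale.ROOT);
        for (String pattern : certHosts) {
            String p = pattern.toLowerCase(Locale.ROOT);
            if (p.equals(h)) {
                return true; // exact match
            }
            if (p.startsWith("*.")) {
                String suffix = p.substring(1); // e.g. ".acme.matih.ai"
                if (h.endsWith(suffix)) {
                    String label = h.substring(0, h.length() - suffix.length());
                    // The wildcard stands for exactly one non-empty label.
                    if (!label.isEmpty() && !label.contains(".")) {
                        return true;
                    }
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> hosts = List.of("acme.matih.ai", "*.acme.matih.ai");
        System.out.println(covers(hosts, "acme.matih.ai"));     // true
        System.out.println(covers(hosts, "bi.acme.matih.ai"));  // true
        System.out.println(covers(hosts, "a.b.acme.matih.ai")); // false
    }
}
```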

NGINX Configuration Tuning

The per-tenant NGINX ingress controller is tuned for long-running queries, large payloads, and streaming connections:

Timeouts and Buffers

Setting	Value	Description
`proxy-read-timeout`	300s	For long-running queries
`proxy-send-timeout`	300s	For large result sets
`proxy-body-size`	50m	For file uploads
`proxy-buffer-size`	16k	For large headers (JWT tokens)
`keepalive-timeout`	75s	Connection reuse

WebSocket Support

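The AI service streams responses over WebSocket connections. NGINX proxies WebSocket upgrades without extra configuration, so the annotations below only raise the proxy timeouts to one hour, which keeps long-lived connections from being closed, and hash on the client address so that a client's connections reach the same backend pod: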

annotations:
  nginx.ingress.kubernetes.io/proxy-read-timeout: "3600"
  nginx.ingress.kubernetes.io/proxy-send-timeout: "3600"
  nginx.ingress.kubernetes.io/upstream-hash-by: "$remote_addr"

Rate Limiting

Per-tenant rate limiting is configured at the ingress level:

Tier	Rate Limit	Burst
Free	100 req/s	200
Professional	500 req/s	1000
Enterprise	Custom	Custom
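
One way to express these limits is with the ingress-nginx annotations `limit-rps` and `limit-burst-multiplier`, where the burst equals the rate multiplied by the multiplier. This is an assumption: the mechanism the platform actually uses is not stated here, and these annotations apply per client IP rather than per tenant in aggregate. `RateLimitAnnotations` is a hypothetical helper.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: maps a rate and burst to ingress-nginx rate-limit annotations. */
final class RateLimitAnnotations {
    private static final String PREFIX = "nginx.ingress.kubernetes.io/";

    private RateLimitAnnotations() {}

    static Map<String, String> forLimits(int requestsPerSecond, int burst) {
        if (requestsPerSecond <= 0) {
            throw new IllegalArgumentException("rate must be positive");
        }
        // The burst is expressed as a whole multiple of the rate.
        if (burst < requestsPerSecond || burst % requestsPerSecond != 0) {
            throw new IllegalArgumentException("burst must be a whole multiple of the rate");
        }
        Map<String, String> annotations = new LinkedHashMap<>();
        annotations.put(PREFIX + "limit-rps", String.valueOf(requestsPerSecond));
        annotations.put(PREFIX + "limit-burst-multiplier",
                String.valueOf(burst / requestsPerSecond));
        return annotations;
    }

    public static void main(String[] args) {
        // Free tier from the table: 100 req/s with a burst of 200.
        forLimits(100, 200).forEach((k, v) -> System.out.println(k + ": \"" + v + "\""));
    }
}
```

Both the Free and Professional rows work out to a multiplier of 2 under this mapping.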

Ingress Health Monitoring

The ingress controller exposes health endpoints that are monitored by the tenant monitoring stack:

Endpoint	Purpose
`/healthz`	Liveness probe
`/metrics`	Prometheus metrics
`/nginx_status`	NGINX stub status

Key Metrics

Metric	Description
`nginx_ingress_controller_requests`	Total requests by status code
`nginx_ingress_controller_response_duration_seconds`	Request latency histogram
`nginx_ingress_controller_nginx_process_connections`	Active connections
`nginx_ingress_controller_ssl_expire_time_seconds`	Certificate expiry time (Unix timestamp)

Cleanup on Tenant Deletion

When a tenant is deprovisioned, the ingress resources are cleaned up in order:

Delete Ingress resources (routing rules)
Delete TLS certificate and secret
Uninstall NGINX ingress controller Helm release
Azure releases the LoadBalancer IP
Delete IngressClass resource

Dev Environment Configuration

In development environments, per-tenant ingress is typically disabled to reduce resource consumption:

Aspect	Dev	Production
Dedicated ingress	Disabled	Enabled
Access method	`kubectl port-forward` or shared ingress	Dedicated LoadBalancer
TLS	Self-signed or disabled	Let's Encrypt production
IngressClass	Default `nginx`	Per-tenant `nginx-{tenant-slug}`

Troubleshooting

Issue	Diagnostic	Resolution
No external IP	Check Azure LB quota	Request quota increase
502 Bad Gateway	Check backend pod readiness	Verify service health
TLS certificate error	Check cert-manager logs	Verify DNS-01 challenge resolution
Routing 404	Check IngressClass match	Verify `ingressClassName` matches controller
Connection timeout	Check NGINX timeout settings	Increase proxy timeouts

Validation Script

./scripts/tools/validate-tenant-ingress.sh --tenant acme

This script validates:

Ingress controller pods are running
LoadBalancer IP is assigned
Ingress resources exist with correct rules
TLS certificate is valid and not close to expiry
Backend services are reachable through the ingress

Next Steps

DNS Zone Management -- DNS records that point to the ingress
Billing Integration -- how ingress usage affects billing
API Reference -- ingress management endpoints
