Quotas
The QuotaController and QuotaManagementService enforce resource quotas for tenants. Quotas define upper limits for compute, storage, API, ML/AI, and other resource categories. Quotas can be enforced as hard limits (rejecting requests when exceeded) or soft limits (warning only).
Resource Types
Compute
| Resource Type | Description |
|---|---|
| CPU_CORES | CPU core allocation |
| MEMORY_GB | Memory allocation in GB |
| GPU_UNITS | GPU unit allocation |
Storage
| Resource Type | Description |
|---|---|
| STORAGE_GB | Primary data storage |
| OBJECT_STORAGE_GB | Object/blob storage |
Data
| Resource Type | Description |
|---|---|
| DATA_TRANSFER_GB | Network data transfer |
| QUERY_EXECUTIONS | Query execution count |
| ROWS_SCANNED | Number of rows scanned |
API
| Resource Type | Description |
|---|---|
| API_CALLS | Total API call count |
| CONCURRENT_CONNECTIONS | Simultaneous connections |
ML/AI
| Resource Type | Description |
|---|---|
| ML_TRAINING_HOURS | ML model training time |
| ML_INFERENCE_REQUESTS | ML inference request count |
| AI_TOKENS | AI/LLM token consumption |
Pipeline and BI
| Resource Type | Description |
|---|---|
| PIPELINE_RUNS | Data pipeline execution count |
| CONNECTOR_SYNCS | Data connector sync count |
| DASHBOARD_COUNT | Number of dashboards |
| REPORT_EXECUTIONS | Report generation count |
| ACTIVE_USERS | Concurrent active users |
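The resource types listed above could be modeled as a single enum. The following is a minimal sketch, not the service's actual definition: the `ResourceCategory` enum, the `category()` accessor, and the `inCategory` helper are hypothetical names introduced here to mirror the table headings.

```java
import java.util.ArrayList;
import java.util.List;

class ResourceTypeDemo {
    public static void main(String[] args) {
        // Lists the ML/AI resource types in declaration order.
        System.out.println(ResourceType.inCategory(ResourceCategory.ML_AI));
    }
}

// Hypothetical grouping that mirrors the section headings; not confirmed by the source.
enum ResourceCategory { COMPUTE, STORAGE, DATA, API, ML_AI, PIPELINE_AND_BI }

enum ResourceType {
    CPU_CORES(ResourceCategory.COMPUTE),
    MEMORY_GB(ResourceCategory.COMPUTE),
    GPU_UNITS(ResourceCategory.COMPUTE),
    STORAGE_GB(ResourceCategory.STORAGE),
    OBJECT_STORAGE_GB(ResourceCategory.STORAGE),
    DATA_TRANSFER_GB(ResourceCategory.DATA),
    QUERY_EXECUTIONS(ResourceCategory.DATA),
    ROWS_SCANNED(ResourceCategory.DATA),
    API_CALLS(ResourceCategory.API),
    CONCURRENT_CONNECTIONS(ResourceCategory.API),
    ML_TRAINING_HOURS(ResourceCategory.ML_AI),
    ML_INFERENCE_REQUESTS(ResourceCategory.ML_AI),
    AI_TOKENS(ResourceCategory.ML_AI),
    PIPELINE_RUNS(ResourceCategory.PIPELINE_AND_BI),
    CONNECTOR_SYNCS(ResourceCategory.PIPELINE_AND_BI),
    DASHBOARD_COUNT(ResourceCategory.PIPELINE_AND_BI),
    REPORT_EXECUTIONS(ResourceCategory.PIPELINE_AND_BI),
    ACTIVE_USERS(ResourceCategory.PIPELINE_AND_BI);

    private final ResourceCategory category;

    ResourceType(ResourceCategory category) {
        this.category = category;
    }

    ResourceCategory category() {
        return category;
    }

    // Returns every resource type in the given category, in declaration order.
    static List<ResourceType> inCategory(ResourceCategory category) {
        List<ResourceType> result = new ArrayList<>();
        for (ResourceType type : values()) {
            if (type.category == category) {
                result.add(type);
            }
        }
        return result;
    }
}
```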
ResourceQuota Entity
| Field | Type | Description |
|---|---|---|
| id | UUID | Quota identifier |
| tenantId | UUID | Owning tenant |
| resourceType | ResourceType | Type of resource being limited |
| resourceName | String | Human-readable resource name |
| quotaLimit | BigDecimal | Maximum allowed usage |
| currentUsage | BigDecimal | Current usage level |
| unit | String | Unit of measurement |
| warningThresholdPercent | BigDecimal | Warning at this utilization percentage (default: 80%) |
| enforceHardLimit | boolean | Reject requests when over quota (default: true) |
| alertRecipients | JSON | List of email addresses for alerts |
| isActive | boolean | Whether the quota is enforced |
Quota Endpoints
Create Quota
Endpoint: POST /api/v1/billing/quotas
curl -X POST http://localhost:8087/api/v1/billing/quotas \
-H "Content-Type: application/json" \
-H "Authorization: Bearer ${TOKEN}" \
-d '{
"tenantId": "550e8400-e29b-41d4-a716-446655440000",
"resourceType": "API_CALLS",
"resourceName": "Monthly API Call Limit",
"quotaLimit": 1000000,
"unit": "count",
"warningThresholdPercent": 80,
"enforceHardLimit": true,
"alertRecipients": ["admin@acme.com"]
}'
Check Quota
Endpoint: GET /api/v1/billing/quotas/tenants/:tenantId/check
Returns the utilization status for all quotas:
- getUtilizationPercent() -- current usage as a percentage of the limit
- isOverQuota() -- whether current usage exceeds the limit
- isWarningThresholdExceeded() -- whether the warning threshold has been reached
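The three status checks can be derived from `quotaLimit`, `currentUsage`, and `warningThresholdPercent`. The sketch below shows one plausible way to compute them. The `QuotaSnapshot` class is a hypothetical stand-in for the entity, and the rounding to two decimal places and the handling of a non-positive limit are assumptions, not documented behavior.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

class QuotaStatusDemo {
    public static void main(String[] args) {
        QuotaSnapshot quota = new QuotaSnapshot(
                new BigDecimal("1000000"), new BigDecimal("850000"), new BigDecimal("80"));
        System.out.println(quota.getUtilizationPercent());      // prints 85.00
        System.out.println(quota.isOverQuota());                // prints false
        System.out.println(quota.isWarningThresholdExceeded()); // prints true
    }
}

// Hypothetical stand-in holding only the fields the checks depend on.
final class QuotaSnapshot {
    private final BigDecimal quotaLimit;
    private final BigDecimal currentUsage;
    private final BigDecimal warningThresholdPercent;

    QuotaSnapshot(BigDecimal quotaLimit, BigDecimal currentUsage, BigDecimal warningThresholdPercent) {
        this.quotaLimit = quotaLimit;
        this.currentUsage = currentUsage;
        this.warningThresholdPercent = warningThresholdPercent;
    }

    // Current usage as a percentage of the limit, rounded to two decimals (assumed precision).
    BigDecimal getUtilizationPercent() {
        if (quotaLimit.signum() <= 0) {
            return BigDecimal.ZERO; // assumption: avoid division by zero for an unset limit
        }
        return currentUsage.multiply(BigDecimal.valueOf(100))
                .divide(quotaLimit, 2, RoundingMode.HALF_UP);
    }

    // True only when usage is strictly greater than the limit.
    boolean isOverQuota() {
        return currentUsage.compareTo(quotaLimit) > 0;
    }

    // True once utilization reaches or passes the warning threshold.
    boolean isWarningThresholdExceeded() {
        return getUtilizationPercent().compareTo(warningThresholdPercent) >= 0;
    }
}
```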
List Tenant Quotas
Endpoint: GET /api/v1/billing/quotas/tenants/:tenantId
Update Quota
Endpoint: PUT /api/v1/billing/quotas/:quotaId
Delete Quota
Endpoint: DELETE /api/v1/billing/quotas/:quotaId
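For callers on the JVM, the check, list, update, and delete endpoints above can be addressed with the standard `java.net.http` API. This sketch only builds the requests and does not send them. The `QuotaRequests` helper is hypothetical, the base URL is taken from the create example, and the update body format is an assumption since the source does not specify it.

```java
import java.net.URI;
import java.net.http.HttpRequest;

class QuotaRequestsDemo {
    public static void main(String[] args) {
        QuotaRequests api = new QuotaRequests("http://localhost:8087", "example-token");
        HttpRequest check = api.check("550e8400-e29b-41d4-a716-446655440000");
        System.out.println(check.method() + " " + check.uri());
    }
}

// Hypothetical helper that builds requests for the quota endpoints.
final class QuotaRequests {
    private static final String QUOTAS_PATH = "/api/v1/billing/quotas";

    private final String baseUrl;
    private final String token;

    QuotaRequests(String baseUrl, String token) {
        this.baseUrl = baseUrl;
        this.token = token;
    }

    // Shared builder: resolves the path and attaches the bearer token.
    private HttpRequest.Builder builder(String path) {
        return HttpRequest.newBuilder(URI.create(baseUrl + QUOTAS_PATH + path))
                .header("Authorization", "Bearer " + token);
    }

    HttpRequest check(String tenantId) {
        return builder("/tenants/" + tenantId + "/check").GET().build();
    }

    HttpRequest list(String tenantId) {
        return builder("/tenants/" + tenantId).GET().build();
    }

    // Assumption: the update endpoint accepts a JSON body similar to the create payload.
    HttpRequest update(String quotaId, String jsonBody) {
        return builder("/" + quotaId)
                .header("Content-Type", "application/json")
                .PUT(HttpRequest.BodyPublishers.ofString(jsonBody))
                .build();
    }

    HttpRequest delete(String quotaId) {
        return builder("/" + quotaId).DELETE().build();
    }
}
```

A request built this way can be passed to `HttpClient.send` with a body handler such as `HttpResponse.BodyHandlers.ofString()`.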
Enforcement Behavior
| Mode | Behavior |
|---|---|
| Hard limit (enforceHardLimit: true) | Requests that would exceed the quota are rejected with 429 Too Many Requests |
| Soft limit (enforceHardLimit: false) | Requests are allowed but alerts are generated when thresholds are exceeded |
When utilization reaches warningThresholdPercent, alert notifications are sent to the configured alertRecipients. The default warning threshold is 80%.
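The hard and soft limit modes can be summarized as a single decision function. The following is an illustrative sketch rather than the service's implementation: `QuotaEnforcer` and `QuotaDecision` are hypothetical names, and evaluating the warning threshold against the projected usage (current usage plus the requested amount) is an assumption.

```java
import java.math.BigDecimal;

class EnforcementDemo {
    public static void main(String[] args) {
        QuotaDecision decision = QuotaEnforcer.evaluate(
                new BigDecimal("100"), new BigDecimal("95"), new BigDecimal("10"),
                new BigDecimal("80"), true);
        System.out.println(decision); // prints REJECT
    }
}

// Hypothetical outcome of a quota check; REJECT corresponds to 429 Too Many Requests.
enum QuotaDecision { ALLOW, ALLOW_WITH_WARNING, REJECT }

final class QuotaEnforcer {
    private QuotaEnforcer() {
    }

    static QuotaDecision evaluate(BigDecimal quotaLimit,
                                  BigDecimal currentUsage,
                                  BigDecimal requested,
                                  BigDecimal warningThresholdPercent,
                                  boolean enforceHardLimit) {
        // Assumption: the check is made against the usage the request would produce.
        BigDecimal projected = currentUsage.add(requested);

        // Hard limit: reject any request that would push usage past the limit.
        if (enforceHardLimit && projected.compareTo(quotaLimit) > 0) {
            return QuotaDecision.REJECT;
        }

        // Warning when projected / limit * 100 >= threshold, rewritten without division.
        boolean warning = projected.multiply(BigDecimal.valueOf(100))
                .compareTo(quotaLimit.multiply(warningThresholdPercent)) >= 0;
        return warning ? QuotaDecision.ALLOW_WITH_WARNING : QuotaDecision.ALLOW;
    }
}
```

Comparing `projected * 100` with `quotaLimit * warningThresholdPercent` avoids a division and therefore any rounding choice in the threshold check.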