Embeddings Service Chart
The Embeddings Service provides centralized vector embedding operations over Qdrant with per-user RBAC, tenant isolation, and audit logging. All Python AI services connect through this Java service rather than directly to Qdrant.
Chart Configuration
```yaml
# From infrastructure/helm/data-plane/embeddings-service/values.yaml
billing:
  costCenter: "CC-DATA-PLANE"
  application: "data-plane"
  team: "platform"
  workloadType: "api"
  costType: "dynamic"
replicaCount: 1
image:
  registry: matihlabsacr.azurecr.io
  repository: matih/embeddings-service
  tag: "latest"
  pullPolicy: IfNotPresent
service:
  type: ClusterIP
  port: 8213
resources:
  requests:
    cpu: 250m
    memory: 512Mi
  limits:
    cpu: 1000m
    memory: 1024Mi
```
Dependencies
| Dependency | Connection | Purpose |
|---|---|---|
| PostgreSQL | JDBC (5432) | Embedding collection metadata, records, usage tracking |
| Qdrant | REST (6333) | Vector storage and similarity search |
| Redis | TCP (6379) | Response caching |
| Kafka | TCP (9092/9093) | Event publishing (matih.embeddings.events) |
Secrets
| Secret Name | Keys | Source |
|---|---|---|
| embeddings-service-db-secret | username, password | PostgreSQL credentials |
| redis-credentials | redis-password | Shared Redis password |
| qdrant-credentials | api-key | Qdrant API key (optional in dev) |
| matih-jwt-secret | jwt-secret | JWT validation (shared) |
Environment Variables
Injected automatically by the matih.deployment.spring base template:
| Variable | Source | Description |
|---|---|---|
| SPRING_DATASOURCE_URL | Helm values | JDBC connection to PostgreSQL |
| DB_HOST, DB_PORT, DB_NAME | Helm values | Database connection details |
| REDIS_HOST, REDIS_PORT | Helm values | Redis connection |
| JWT_SECRET | Secret ref | JWT validation key |
| SPRING_KAFKA_BOOTSTRAP_SERVERS | Auto | Kafka connection |
Additional service-specific variables are set via extraEnv:
| Variable | Source | Description |
|---|---|---|
| QDRANT_URL | Helm values | Qdrant REST endpoint |
| QDRANT_API_KEY | Secret ref | Qdrant authentication (optional) |
| QDRANT_TIMEOUT_MS | Helm values | Request timeout in milliseconds (default: 30000) |
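As a rough sketch, the Qdrant variables might be declared under extraEnv along these lines. This assumes extraEnv accepts a list of standard Kubernetes container env entries, which the chart may or may not do; the Qdrant hostname is a hypothetical in-cluster address and is not taken from the chart.

```yaml
# Hypothetical extraEnv block; assumes entries follow the Kubernetes EnvVar schema.
extraEnv:
  - name: QDRANT_URL
    # Assumed hostname; 6333 is the Qdrant REST port listed under Dependencies.
    value: "http://qdrant.matih-data-plane.svc.cluster.local:6333"
  - name: QDRANT_API_KEY
    valueFrom:
      secretKeyRef:
        name: qdrant-credentials
        key: api-key
        optional: true  # pod can still start in dev when the secret is absent
  - name: QDRANT_TIMEOUT_MS
    value: "30000"      # env values are strings, so the number is quoted
```

Marking the secret reference as optional is one way to match the "optional in dev" note on the qdrant-credentials secret.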
Network Policy
Ingress allows traffic from:
- Data plane services (matih-data-plane namespace)
- NGINX Ingress Controller (matih-ingress namespace)
- Prometheus scraping (matih-monitoring namespace)
Egress allows connections to:
- DNS (53)
- PostgreSQL (5432)
- Redis (6379)
- Qdrant (6333, 6334)
- Kafka (9092, 9093)
- OpenTelemetry Collector (4317)
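The ingress and egress rules listed above could be expressed as a NetworkPolicy roughly like the following. This is a sketch, not the chart's rendered template: the policy name, the pod label, and the use of port 8213 for all ingress sources are assumptions. The namespace selectors rely on the kubernetes.io/metadata.name label that Kubernetes sets on every namespace.

```yaml
# Illustrative only; metadata.name and the podSelector label are assumed.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: embeddings-service
spec:
  podSelector:
    matchLabels:
      app.kubernetes.io/name: embeddings-service
  policyTypes: [Ingress, Egress]
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: matih-data-plane
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: matih-ingress
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: matih-monitoring
      ports:
        - { protocol: TCP, port: 8213 }  # service port from values.yaml
  egress:
    # No "to" clause: these ports are allowed toward any destination.
    - ports:
        - { protocol: UDP, port: 53 }    # DNS
        - { protocol: TCP, port: 53 }    # DNS over TCP
        - { protocol: TCP, port: 5432 }  # PostgreSQL
        - { protocol: TCP, port: 6379 }  # Redis
        - { protocol: TCP, port: 6333 }  # Qdrant REST
        - { protocol: TCP, port: 6334 }  # Qdrant gRPC
        - { protocol: TCP, port: 9092 }  # Kafka
        - { protocol: TCP, port: 9093 }  # Kafka
        - { protocol: TCP, port: 4317 }  # OpenTelemetry Collector (OTLP gRPC)
```

A stricter variant would add a "to" selector per dependency so that each port is only reachable at its intended target.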
Health Probes
| Probe | Path | Initial Delay | Period |
|---|---|---|---|
| Startup | /actuator/health/liveness | 10s | 10s (30 retries) |
| Liveness | /actuator/health/liveness | 60s | 30s |
| Readiness | /actuator/health/readiness | 30s | 10s |
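For reference, the probe table maps onto container probe fields approximately as shown here. The sketch assumes the container listens on the service port (8213) and reads "30 retries" as failureThreshold: 30; neither is confirmed by the chart.

```yaml
# Sketch of the container probe settings; port 8213 is assumed to be the container port.
startupProbe:
  httpGet:
    path: /actuator/health/liveness
    port: 8213
  initialDelaySeconds: 10
  periodSeconds: 10
  failureThreshold: 30  # "30 retries" interpreted as failureThreshold
livenessProbe:
  httpGet:
    path: /actuator/health/liveness
    port: 8213
  initialDelaySeconds: 60
  periodSeconds: 30
readinessProbe:
  httpGet:
    path: /actuator/health/readiness
    port: 8213
  initialDelaySeconds: 30
  periodSeconds: 10
```

Under that reading, the service gets up to 10s + 30 x 10s = 310s to start before the container is restarted. Kubernetes holds off liveness and readiness checks until the startup probe has succeeded.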
Monitoring
ServiceMonitor scrapes /actuator/prometheus every 30 seconds.
Key metrics:
- embeddings_search_duration_seconds -- Search latency histogram
- embeddings_upsert_count_total -- Vectors upserted counter
- embeddings_collection_count -- Active collections gauge
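The scrape settings described in this section correspond to a Prometheus Operator ServiceMonitor along these lines. The resource name, the selector label, and the port name http are assumptions made for illustration; only the path and interval come from this page.

```yaml
# Illustrative ServiceMonitor; name, selector label, and port name are assumed.
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: embeddings-service
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: embeddings-service
  endpoints:
    - port: http                  # must match a named port on the Service
      path: /actuator/prometheus
      interval: 30s
```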
Dev Overrides
```yaml
# From values-dev.yaml
replicaCount: 1
resources:
  requests:
    cpu: 100m
    memory: 256Mi
  limits:
    cpu: 500m
    memory: 1Gi
autoscaling:
  enabled: false
podDisruptionBudget:
  enabled: false
```