Base Docker Images
MATIH maintains a set of curated base Docker images that provide consistent, secure, and optimized foundations for all service containers. These base images are built and managed through scripts/tools/build-base-images.sh and stored in the Azure Container Registry at matihlabsacr.azurecr.io.
Base Image Inventory
| Image | Tag | Base | Purpose | Approximate Size |
|---|---|---|---|---|
| matih/base-java | 1.0.0 | eclipse-temurin:21-jre-alpine | Java Spring Boot services | ~200 MB |
| matih/base-python-ml | 1.0.0 | python:3.11-slim-bookworm | Python AI/ML services | ~600 MB |
| matih/base-node | 1.0.0 | node:20-alpine | Node.js build + runtime | ~180 MB |
| matih/base-nginx | 1.25-alpine | nginx:1.25-alpine | Frontend static file serving | ~40 MB |
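As a rough illustration, the inventory above can be captured in a small shell helper that maps each image name to its tag and composes the full registry reference. The function names `image_tag` and `image_ref` are hypothetical and are not taken from build-base-images.sh; only the registry host, image names, and tags come from this page.

```shell
#!/bin/sh
# Hypothetical helpers; the real build-base-images.sh may be organised differently.
REGISTRY="matihlabsacr.azurecr.io"

# Print the tag for a base image listed in the inventory table.
image_tag() {
  case "$1" in
    base-java|base-python-ml|base-node) echo "1.0.0" ;;
    base-nginx) echo "1.25-alpine" ;;
    *) echo "unknown base image: $1" >&2; return 1 ;;
  esac
}

# Print the fully qualified reference: <registry>/matih/<name>:<tag>.
image_ref() {
  tag=$(image_tag "$1") || return 1
  echo "${REGISTRY}/matih/$1:${tag}"
}
```

A service build could then pull a base with `docker pull "$(image_ref base-java)"` rather than hard-coding the reference.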
Build Process
Building Base Images
# Build all base images
./scripts/tools/build-base-images.sh
# Build specific base image
./scripts/tools/build-base-images.sh --image base-java
# Build and push to registry
./scripts/tools/build-base-images.sh --push
Build Script Flow
scripts/tools/build-base-images.sh
|
+-- Build base-java:1.0.0
| FROM eclipse-temurin:21-jre-alpine
| + security patches + CA certificates + non-root user
|
+-- Build base-python-ml:1.0.0
| FROM python:3.11-slim-bookworm
| + ML libraries + C extensions + non-root user
|
+-- Build base-node:1.0.0
| FROM node:20-alpine
| + build tools + non-root user
|
+-- Build base-nginx:1.25-alpine
| FROM nginx:1.25-alpine
| + security headers + non-root config
|
+-- Push all to matihlabsacr.azurecr.io/matih/
base-java:1.0.0
The Java base image provides a minimal JRE runtime for Spring Boot services.
Dockerfile
FROM eclipse-temurin:21-jre-alpine
# Install security updates
RUN apk update && apk upgrade --no-cache && \
    apk add --no-cache \
        ca-certificates \
        tzdata \
        curl \
        tini
# Create non-root user
RUN addgroup -g 1000 matih && \
    adduser -u 1000 -G matih -D -h /app matih
# Set timezone
ENV TZ=UTC
# JVM defaults (overridable via JAVA_OPTS)
ENV JAVA_OPTS="-XX:+UseG1GC \
-XX:MaxGCPauseMillis=200 \
-XX:+UseStringDeduplication \
-XX:+OptimizeStringConcat \
-Djava.security.egd=file:/dev/./urandom \
-Dspring.output.ansi.enabled=NEVER"
# Health check
HEALTHCHECK --interval=30s --timeout=5s --retries=3 \
    CMD curl -f http://localhost:${SERVER_PORT:-8080}/actuator/health || exit 1
WORKDIR /app
USER matih
# Use tini as init process for proper signal handling
ENTRYPOINT ["/sbin/tini", "--"]
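# Expand JAVA_OPTS at startup (the JVM does not read this variable itself);
# exec replaces the shell so java runs directly under tini
CMD ["sh", "-c", "exec java $JAVA_OPTS -jar app.jar"]
Key Design Decisions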
| Decision | Rationale |
|---|---|
| Alpine-based | Minimal attack surface (~5 MB base OS) |
| JRE only (not JDK) | No compiler needed at runtime |
| G1GC as default | Best balance of throughput and latency |
| tini as init | Proper PID 1 signal handling, zombie process reaping |
| Non-root user (UID 1000) | Security: matches Kubernetes securityContext |
| urandom entropy source | Avoids JVM startup delay on entropy-starved containers |
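The `java` binary does not read `JAVA_OPTS` on its own, so whatever starts the JVM has to expand the variable into the command line. A minimal sketch of that composition follows; `java_command` is a hypothetical helper used only to make the expansion visible, not part of the base image.

```shell
#!/bin/sh
# Hypothetical launcher sketch; not taken from the MATIH repository.
# Print the command line a launcher would run for the given jar.
java_command() {
  jar="${1:-app.jar}"
  echo "java ${JAVA_OPTS:-} -jar ${jar}"
}

# In a real entrypoint the last line would replace the shell with the JVM,
# leaving JAVA_OPTS unquoted so it splits into separate flags:
#   exec java $JAVA_OPTS -jar app.jar
```

Using `exec` keeps the JVM as the direct child of tini, so SIGTERM from the container runtime reaches it without an intermediate shell.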
JVM Tuning Profiles
Service charts override JAVA_OPTS based on workload profile:
# API service (low latency)
env:
  - name: JAVA_OPTS
    value: "-Xms512m -Xmx768m -XX:+UseG1GC -XX:MaxGCPauseMillis=100"
# Batch processing (high throughput)
env:
  - name: JAVA_OPTS
    value: "-Xms1g -Xmx2g -XX:+UseG1GC -XX:MaxGCPauseMillis=500 -XX:ParallelGCThreads=4"
# Config service (minimal footprint)
env:
  - name: JAVA_OPTS
    value: "-Xms256m -Xmx512m -XX:+UseSerialGC"
base-python-ml:1.0.0
The Python ML base image provides a runtime with pre-installed system dependencies for machine learning libraries.
Dockerfile
FROM python:3.11-slim-bookworm
# Install system dependencies for ML libraries
RUN apt-get update && apt-get install -y --no-install-recommends \
        build-essential \
        gcc \
        g++ \
        libpq-dev \
        libffi-dev \
        libssl-dev \
        curl \
        ca-certificates \
    && rm -rf /var/lib/apt/lists/*
# Install common ML/data dependencies
RUN pip install --no-cache-dir \
        numpy==1.26.3 \
        pandas==2.1.4 \
        scipy==1.11.4 \
        scikit-learn==1.3.2 \
        tokenizers==0.15.0
# Create non-root user
RUN groupadd -g 1000 matih && \
    useradd -u 1000 -g matih -m -d /app matih
# Set environment
ENV PYTHONDONTWRITEBYTECODE=1 \
    PYTHONUNBUFFERED=1 \
    PYTHONFAULTHANDLER=1 \
    PIP_NO_CACHE_DIR=1 \
    PIP_DISABLE_PIP_VERSION_CHECK=1
WORKDIR /app
USER matih
HEALTHCHECK --interval=30s --timeout=5s --retries=3 \
    CMD curl -f http://localhost:${PORT:-8000}/api/v1/health || exit 1
CMD ["python", "-m", "uvicorn", "src.main:app", "--host", "0.0.0.0", "--port", "8000"]
Pre-installed Dependencies
| Package | Version | Rationale |
|---|---|---|
| numpy | 1.26.3 | Numerical computing foundation |
| pandas | 2.1.4 | DataFrame operations |
| scipy | 1.11.4 | Scientific computing |
| scikit-learn | 1.3.2 | ML utilities (preprocessing, metrics) |
| tokenizers | 0.15.0 | Fast text tokenization (HuggingFace) |
These packages are installed in the base image because:
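- They ship compiled extensions (C, plus Rust for tokenizers), which are slow to build when no prebuilt wheel is available
- They are used by multiple Python services
- Pre-installing them lets per-service builds reuse the cached base layers instead of reinstalling the packages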
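One way to keep per-service installs from replacing the pre-installed versions is a pip constraints file that repeats the pins from the table above. This is only a sketch: `write_base_constraints` and the file paths in the comments are hypothetical, and the page does not say whether MATIH services use constraints.

```shell
#!/bin/sh
# Hypothetical helper: emit pins matching the packages pre-installed in the base image.
write_base_constraints() {
  cat <<'EOF'
numpy==1.26.3
pandas==2.1.4
scipy==1.11.4
scikit-learn==1.3.2
tokenizers==0.15.0
EOF
}

# Example use in a service build (assumed file names):
#   write_base_constraints > /tmp/base-constraints.txt
#   pip install -c /tmp/base-constraints.txt -r requirements.txt
```

With `-c`, pip does not install anything extra; it only restricts which versions may be selected, so a service that asks for an incompatible numpy fails at build time instead of silently replacing the base copy.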
Python Environment Variables
| Variable | Value | Purpose |
|---|---|---|
| PYTHONDONTWRITEBYTECODE | 1 | Skip .pyc file generation (read-only filesystem) |
| PYTHONUNBUFFERED | 1 | Real-time log output (no buffering) |
| PYTHONFAULTHANDLER | 1 | Print traceback on segfault |
| PIP_NO_CACHE_DIR | 1 | Reduce image size |
| PIP_DISABLE_PIP_VERSION_CHECK | 1 | Faster pip operations |
base-node:1.0.0
The Node.js base image provides a build environment for React/Vite frontend applications.
Dockerfile
FROM node:20-alpine
# Install build dependencies
RUN apk add --no-cache \
        python3 \
        make \
        g++ \
        curl \
        ca-certificates
# Create non-root user
RUN addgroup -g 1000 matih && \
    adduser -u 1000 -G matih -D -h /app matih
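# Configure npm globally so the settings also apply to the non-root user
# (a plain "npm config set" run as root only writes root's per-user npmrc)
RUN npm config set fund false --location=global && \
    npm config set audit false --location=global && \
    npm config set update-notifier false --location=global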
WORKDIR /app
# Use non-root for builds
USER matih
CMD ["node"]
Usage Pattern
The Node.js base image is primarily used as a build stage. The final runtime stage uses the NGINX base:
# Build stage
FROM matihlabsacr.azurecr.io/matih/base-node:1.0.0 AS builder
WORKDIR /build
COPY package.json package-lock.json ./
RUN npm ci
COPY . .
RUN npm run build
# Runtime stage
FROM matihlabsacr.azurecr.io/matih/base-nginx:1.25-alpine
COPY --from=builder /build/dist /usr/share/nginx/html
base-nginx:1.25-alpine
The NGINX base image serves static frontend assets with security hardening.
Dockerfile
FROM nginx:1.25-alpine
# Install ca-certificates
RUN apk add --no-cache ca-certificates
# Remove default nginx config
RUN rm /etc/nginx/conf.d/default.conf
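# Security: hide the nginx version via a conf.d snippet, which does not depend
# on a commented-out server_tokens line being present in the stock nginx.conf
RUN echo 'server_tokens off;' > /etc/nginx/conf.d/server-tokens.conf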
# Default security headers config
COPY security-headers.conf /etc/nginx/conf.d/security-headers.conf
# Create non-root user
RUN addgroup -g 1000 matih && \
    adduser -u 1000 -G matih -D matih && \
    chown -R matih:matih /var/cache/nginx /var/log/nginx /etc/nginx/conf.d && \
    touch /var/run/nginx.pid && \
    chown matih:matih /var/run/nginx.pid
USER matih
EXPOSE 80
CMD ["nginx", "-g", "daemon off;"]
Security Headers
# security-headers.conf
add_header X-Frame-Options "SAMEORIGIN" always;
add_header X-Content-Type-Options "nosniff" always;
add_header X-XSS-Protection "1; mode=block" always;
add_header Referrer-Policy "strict-origin-when-cross-origin" always;
add_header Content-Security-Policy "default-src 'self'; script-src 'self' 'unsafe-inline'; style-src 'self' 'unsafe-inline'; img-src 'self' data: https:; connect-src 'self' https://*.matih.ai wss://*.matih.ai" always;
Multi-Stage Build Pattern
Every MATIH service follows a multi-stage Docker build pattern:
+-------------------+ +-------------------+ +-------------------+
| Stage 1: Deps | | Stage 2: Build | | Stage 3: Runtime |
| | | | | |
| FROM base-image |--->| FROM base-image |--->| FROM base-image |
| COPY manifest | | COPY --from=deps | | COPY --from=build |
| RUN install | | COPY source | | USER non-root |
| | | RUN compile | | CMD start |
+-------------------+ +-------------------+ +-------------------+
      (Cached)                (Rebuilt on           (Minimal runtime)
                              code change)
Benefits
| Benefit | Description |
|---|---|
| Smaller images | Runtime image contains only compiled artifacts, not build tools |
| Faster builds | Dependency layer is cached; only source code changes trigger rebuild |
| Security | Build tools, compilers, and source code excluded from runtime |
| Consistency | Same base image across all services of the same type |
Image Security Scanning
All base images and service images are scanned for vulnerabilities:
# Scan with Trivy
trivy image matihlabsacr.azurecr.io/matih/base-java:1.0.0
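# List image tags and metadata in ACR (this command does not run a scan itself;
# with Azure Defender enabled, scan findings are reviewed in Azure)
az acr repository show-tags \
    --name matihlabsacr \
    --repository matih/base-java \
    --detail
Vulnerability Policy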
| Severity | Policy | Action |
|---|---|---|
| Critical | Block deployment | Must be fixed before merge |
| High | Block deployment | Must be fixed within 7 days |
| Medium | Warning | Fix in next release cycle |
| Low | Informational | Track in backlog |
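The policy table can be turned into a pipeline gate. The sketch below is an assumption about how that might look, not the MATIH pipeline itself: `policy_for_severity` and `scan_blocking` are hypothetical names, while `--severity` and `--exit-code` are standard Trivy options.

```shell
#!/bin/sh
# Hypothetical helper mirroring the policy table; not part of any MATIH script.
policy_for_severity() {
  case "$1" in
    CRITICAL|HIGH) echo "block" ;;
    MEDIUM)        echo "warn" ;;
    LOW)           echo "info" ;;
    *) echo "unknown severity: $1" >&2; return 1 ;;
  esac
}

# Return non-zero when Trivy reports Critical or High findings for the image.
# Runs trivy only when called; requires trivy on PATH.
scan_blocking() {
  trivy image --severity CRITICAL,HIGH --exit-code 1 "$1"
}
```

For example, a CI step could call `scan_blocking matihlabsacr.azurecr.io/matih/base-java:1.0.0` and stop the deployment on a non-zero exit status.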
Base Image Update Process
Scheduled Updates
Base images are rebuilt monthly to incorporate security patches:
- Pull latest upstream image (e.g., eclipse-temurin:21-jre-alpine)
- Apply MATIH customizations (user, packages, config)
- Run vulnerability scan
- Push with new patch version (e.g., 1.0.1)
- Update service Dockerfiles to reference new base
- Rebuild and test all services
- Deploy via CD pipeline
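The "new patch version" step above amounts to incrementing the last component of a MAJOR.MINOR.PATCH tag. A minimal sketch, assuming plain numeric tags such as 1.0.0 (the helper name `next_patch_version` is hypothetical):

```shell
#!/bin/sh
# Hypothetical helper: bump the patch component of a MAJOR.MINOR.PATCH tag.
# Assumes three plain decimal components without leading zeros.
next_patch_version() {
  major=${1%%.*}      # text before the first dot
  rest=${1#*.}        # text after the first dot
  minor=${rest%%.*}
  patch=${rest#*.}
  echo "${major}.${minor}.$((patch + 1))"
}
```

Tags that follow the upstream version instead, such as 1.25-alpine for base-nginx, do not fit this scheme and would need their own rule.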
Emergency Security Patch
For critical CVEs affecting base images:
- Identify affected base image
- Rebuild immediately with patched upstream
- Push with incremented patch version
- Trigger rebuild of all affected service images
- Deploy hotfix via service-build-deploy.sh
Troubleshooting
Common Base Image Issues
| Issue | Cause | Resolution |
|---|---|---|
| "permission denied" at runtime | UID mismatch between base and chart | Ensure securityContext.runAsUser matches Dockerfile USER |
| C extension build fails | Missing system libraries in base | Add required -dev packages to base Dockerfile |
| Image too large | Unnecessary files included | Add .dockerignore; verify multi-stage COPY |
| "exec format error" | Architecture mismatch (arm64 vs amd64) | Build multi-arch images with buildx |
| Health check fails | Wrong port or path | Verify HEALTHCHECK CMD matches service configuration |
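For the "exec format error" row, a quick check is to compare the host architecture with the one recorded in the image. The helpers below are a hypothetical sketch: `docker_arch` and `image_matches_host` are not MATIH tooling, but `docker image inspect --format '{{.Architecture}}'` is a standard Docker CLI call.

```shell
#!/bin/sh
# Hypothetical helper: translate `uname -m` output to Docker's architecture names.
docker_arch() {
  case "$1" in
    x86_64|amd64)  echo "amd64" ;;
    aarch64|arm64) echo "arm64" ;;
    *) echo "unsupported machine type: $1" >&2; return 1 ;;
  esac
}

# Succeed only if the image was built for the host's architecture.
# Runs docker only when called; the image must already be present locally.
image_matches_host() {
  host=$(docker_arch "$(uname -m)") || return 1
  image=$(docker image inspect --format '{{.Architecture}}' "$1") || return 1
  [ "$host" = "$image" ]
}
```

When the check fails, rebuilding with `docker buildx build --platform linux/amd64,linux/arm64` produces an image that covers both architectures.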
Next Steps
- Next: Service Build and Deploy
- Previous: CD Pipeline