Structured Logging

MATIH uses structured logging to produce machine-parseable log output with consistent field names across all services. Python services use structlog for JSON-formatted logs, while Java services use Spring Boot's built-in structured logging with Logback.

Python Services (structlog)

Configuration

import structlog

structlog.configure(
    processors=[
        structlog.contextvars.merge_contextvars,
        structlog.stdlib.add_log_level,
        structlog.stdlib.add_logger_name,
        structlog.processors.TimeStamper(fmt="iso"),
        structlog.processors.StackInfoRenderer(),
        add_trace_context,          # Custom: adds trace_id, span_id
        add_tenant_context,         # Custom: adds tenant_id
        structlog.processors.JSONRenderer(),
    ],
    wrapper_class=structlog.stdlib.BoundLogger,
    context_class=dict,
    logger_factory=structlog.stdlib.LoggerFactory(),
)
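
The two custom processors, add_trace_context and add_tenant_context, are not shown on this page. A minimal sketch of what they might look like, assuming the trace and tenant values are held in context variables. The names current_trace and current_tenant are hypothetical, and a real service would more likely read trace data from its tracing library:

```python
from contextvars import ContextVar
from typing import Any, MutableMapping, Optional, Tuple

EventDict = MutableMapping[str, Any]

# Hypothetical context variables, set by request middleware (not part of structlog).
current_trace: ContextVar[Optional[Tuple[str, str]]] = ContextVar("current_trace", default=None)
current_tenant: ContextVar[Optional[str]] = ContextVar("current_tenant", default=None)


def add_trace_context(logger: Any, method_name: str, event_dict: EventDict) -> EventDict:
    """Add trace_id and span_id when a trace is active; otherwise leave the event as is."""
    trace = current_trace.get()
    if trace is not None:
        event_dict["trace_id"], event_dict["span_id"] = trace
    return event_dict


def add_tenant_context(logger: Any, method_name: str, event_dict: EventDict) -> EventDict:
    """Add tenant_id when a tenant is set, without overwriting an explicitly bound value."""
    tenant_id = current_tenant.get()
    if tenant_id is not None:
        event_dict.setdefault("tenant_id", tenant_id)
    return event_dict
```

A structlog processor is any callable that accepts (logger, method_name, event_dict) and returns the event dict, so both functions can be listed in processors ahead of JSONRenderer as in the configuration above.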

Output Format

{
  "event": "Processing search request",
  "level": "info",
  "logger": "context_graph.services.semantic_search_service",
  "timestamp": "2025-06-15T10:30:00.123Z",
  "trace_id": "abc123def456",
  "span_id": "789012345678",
  "tenant_id": "acme",
  "query": "customer churn models",
  "mode": "hybrid",
  "results_count": 15,
  "duration_ms": 125.4
}

Binding Context

import structlog

logger = structlog.get_logger()

# bind() returns a new logger that carries these fields on every
# subsequent call; the original `logger` is left unchanged.
log = logger.bind(
    tenant_id=tenant_id,
    session_id=session_id,
    request_id=request_id,
)

log.info("Search started", query=query, mode=mode)
log.info("Search completed", results_count=len(results), duration_ms=duration)

Java Services (Spring Boot)

Logback Configuration

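<configuration>
  <!-- JSON to stdout. LogstashTcpSocketAppender would also require a <destination> element. -->
  <appender name="JSON" class="ch.qos.logback.core.ConsoleAppender">
    <encoder class="net.logstash.logback.encoder.LogstashEncoder">
      <includeMdcKeyName>traceId</includeMdcKeyName>
      <includeMdcKeyName>spanId</includeMdcKeyName>
      <includeMdcKeyName>tenantId</includeMdcKeyName>
    </encoder>
  </appender>
  <!-- Attach the appender; without this reference it is never used. -->
  <root level="INFO">
    <appender-ref ref="JSON"/>
  </root>
</configuration>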

MDC Context

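String tenantId = tenantContext.getTenantId();
MDC.put("tenantId", tenantId);
MDC.put("userId", userContext.getUserId());
log.info("Provisioning started for tenant {}", tenantId);
// Clear the MDC when the request ends (for example in a filter's finally block)
// so values do not leak to the next request on a pooled thread.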

Standardized MDC Log Pattern

All 10 control-plane services use a unified console log pattern that includes tenantId and correlationId from MDC:

logging:
  pattern:
    console: "%d{ISO8601} [%thread] %-5level [%X{tenantId:-}] [%X{correlationId:-}] %logger{36} - %msg%n"

This pattern is configured in every service's application.yml. Each service populates the following MDC fields:

Service	MDC Fields
tenant-service	tenantId, correlationId, provisioningJobId, stepType
iam-service	tenantId, correlationId
billing-service	tenantId, correlationId
notification-service	tenantId, correlationId
audit-service	tenantId, correlationId
config-service	tenantId, correlationId
api-gateway	tenantId, correlationId
observability-api	tenantId, correlationId
infrastructure-service	tenantId, correlationId
platform-registry	tenantId, correlationId

The :- syntax in %X{tenantId:-} supplies a default value for the MDC key. With nothing after the :-, the default is an empty string, so an unset key renders as [] rather than as the text "null".

Example Log Output

2026-02-20 14:23:15,432 [http-nio-8082-exec-3] INFO  [acme-corp] [req-abc123] c.m.t.s.ProvisioningService - Provisioning step CREATE_NAMESPACE completed in 2341ms

Standard Log Fields

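Log entries use the following fields. Those marked Yes are required on every entry; the others are included when the corresponding context exists: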

Field	Description	Required
`timestamp`	ISO 8601 timestamp	Yes
`level`	Log level (info, warning, error)	Yes
`event` / `message`	Human-readable description	Yes
`logger`	Logger name / module path	Yes
`trace_id`	Distributed trace ID	When available
`span_id`	Current span ID	When available
`tenant_id`	Tenant identifier	When in tenant context
`service`	Service name	Yes
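
The required fields in the table can be checked mechanically, for example in a test that captures a service's JSON log output. A minimal sketch, assuming one JSON object per line; the helper name missing_required_fields is hypothetical:

```python
import json
from typing import List

# Fields marked "Yes" in the table above; event/message is handled separately
# because either name satisfies the requirement.
REQUIRED_FIELDS = ("timestamp", "level", "logger", "service")


def missing_required_fields(line: str) -> List[str]:
    """Return the names of required fields absent from one JSON log line."""
    entry = json.loads(line)
    missing = [field for field in REQUIRED_FIELDS if field not in entry]
    if "event" not in entry and "message" not in entry:
        missing.append("event/message")
    return missing


sample = json.dumps({
    "timestamp": "2025-06-15T10:30:00.123Z",
    "level": "info",
    "event": "Processing search request",
    "logger": "example.module",
})
print(missing_required_fields(sample))  # the sample omits "service"
```

The check only looks at field presence. It does not validate the timestamp format or the set of allowed log levels.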

Sensitive Data

Logs must not contain:

Passwords or API keys
Personally Identifiable Information (PII) unless masked
Full SQL queries with parameter values
JWT tokens or session secrets

Use structlog processors to filter sensitive fields before output.
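
One way to do this is a processor that masks known sensitive keys. A minimal sketch, assuming a deny-list of field names; SENSITIVE_KEYS and redact_sensitive_fields are hypothetical names, and the list would need to match the fields your services actually log:

```python
from typing import Any, MutableMapping

# Hypothetical deny-list of field names, compared case-insensitively.
SENSITIVE_KEYS = frozenset(
    {"password", "api_key", "authorization", "token", "jwt", "session_secret"}
)
REDACTED = "[REDACTED]"


def redact_sensitive_fields(
    logger: Any, method_name: str, event_dict: MutableMapping[str, Any]
) -> MutableMapping[str, Any]:
    """Replace the values of sensitive top-level fields before rendering."""
    for key in list(event_dict):
        if key.lower() in SENSITIVE_KEYS:
            event_dict[key] = REDACTED
    return event_dict
```

The processor has to appear before JSONRenderer in the processors list so that it runs before the event is serialized. It only inspects top-level keys, so secrets interpolated into the event message or nested inside other values are not caught and still need to be kept out of log calls.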
