Glossary
This glossary defines over 150 terms used throughout the MATIH Enterprise Platform documentation. Terms are organized alphabetically within thematic categories to make it easier to find related concepts. When a term appears in other chapters of this documentation, it carries the meaning defined here.
Platform Concepts
| Term | Definition |
|---|---|
| Agent | A specialized AI component that performs a specific task within the multi-agent orchestration pipeline, such as intent classification, SQL generation, or visualization recommendation. |
| Agent Trace | A detailed record of every step an agent took during execution, including inputs, outputs, duration, token usage, and decisions made at each node in the LangGraph. |
| Agentic Workbench | The frontend application (port 3003) that provides a conversational interface for AI-driven data exploration, text-to-SQL, and DNN architecture building. |
| API Gateway | The Control Plane service (port 8080) that serves as the single entry point for all external API requests, handling routing, rate limiting, and CORS. |
| BI Workbench | The frontend application (port 3000) for creating, viewing, and sharing dashboards, charts, and reports. |
| Billing Unit | The standard unit of measurement for platform usage: compute-hours for processing, storage-GB for data, and token-count for LLM usage. |
| Checkpoint | A saved snapshot of LangGraph agent state that enables conversation persistence, replay, and recovery after failures. |
| Connector | A configured integration between the MATIH Platform and an external data source (database, cloud storage, SaaS application), managed through the Tenant Service. |
| Context Graph | A knowledge graph structure that captures relationships between data entities, semantic concepts, and business rules to improve AI agent reasoning. |
| Control Plane | The set of 10 microservices that manage platform-wide concerns: identity, tenants, configuration, auditing, billing, notifications, observability, infrastructure, registry, and API gateway. |
| Conversation | A sequence of user messages and AI responses within the Agentic Workbench, persisted with full agent trace history for audit and improvement. |
| Data Plane | The set of 14 microservices that handle tenant-specific data workloads, including AI, ML, queries, BI, catalog, semantic layer, pipelines, data quality, governance, ontology, ops agents, rendering, and coordination. |
| Data Plane Agent | A lightweight Java service (port 8085) that coordinates data plane operations and acts as a sidecar for service discovery and health aggregation within tenant namespaces. |
| Data Workbench | The frontend application (port 3002) for data engineering tasks: pipeline management, data quality monitoring, and catalog browsing. |
| DNN Builder | A feature within the AI Service that uses LangGraph agents to help users design, validate, and generate code for deep neural network architectures via natural language conversation. |
| Domain Event | An immutable record of something significant that happened in the platform (e.g., tenant provisioned, query executed, model deployed), published to Kafka for consumption by other services. |
| Feature Flag | A configuration toggle managed by the Config Service that enables or disables platform features without code deployment. |
| Feedback Loop | The mechanism by which user feedback on AI responses (thumbs up/down, corrections) is captured and used to improve model performance over time. |
| Guardrail | A safety check within the AI agent pipeline that prevents harmful, biased, or inappropriate responses and enforces data access policies. |
| Intent | The classified purpose of a user's natural language query (e.g., data_query, visualization, explanation, general_chat), determined by the Intent Classifier agent. |
| Marketplace | The platform's catalog of templates, connectors, and extensions that can be installed into a tenant's workspace. |
| ML Workbench | The frontend application (port 3001) for machine learning tasks: experiment tracking, model training, deployment, and monitoring. |
| Module | A logically grouped set of functionality within the AI Service (core, biPlatform, mlPlatform, dataPlatform, contextGraph, enterprise, supplementary) that can be enabled or disabled via feature flags. |
| Multi-Agent Orchestration | The process by which multiple specialized AI agents are coordinated through a LangGraph directed graph to collectively answer a user query. |
| Ontology | A formal representation of domain concepts, their properties, and relationships, managed by the Ontology Service and used by AI agents for semantic understanding. |
| Ops Agent Service | An AI-powered operations management service (port 8080) with specialized agents for incident detection, root cause analysis, and automated remediation. |
| Platform Registry | The Control Plane service (port 8084) that maintains a registry of all services, their endpoints, health status, and dependency topology. |
| Provisioning | The multi-phase process of setting up a new tenant, including namespace creation, secret generation, service deployment, DNS configuration, and ingress setup. |
| Render Service | A Node.js service (port 8098) that performs server-side rendering of charts and dashboards for PDF/PNG export and scheduled report delivery. |
| Schema Embeddings | Vector representations of database schemas stored in Qdrant, enabling the AI Service to retrieve relevant schema context for SQL generation. |
| Semantic Layer | A service (port 8086) that provides a business-friendly abstraction over raw database tables, defining metrics, dimensions, and entities in business terms. |
| Session | A time-bounded interaction context for studio-mode conversations that tracks architecture state, preferences, and intermediate results. |
| Studio Mode | An extended interaction mode in the Agentic Workbench that maintains persistent state for complex, multi-step tasks like DNN architecture design. |
| Template | A pre-built, parameterized starting point for common platform tasks (dashboard layouts, pipeline definitions, ML experiments, agent configurations). |
| Tenant | An isolated organizational unit within the MATIH Platform, representing a customer with its own users, data, configurations, and resource quotas. |
| TenantContext | A thread-local (Java) or context-variable (Python) object that carries the current tenant's identity through every layer of a service, ensuring all operations are scoped to the correct tenant. |
| Text-to-SQL | The AI capability that converts natural language questions into executable SQL queries against the user's data sources. |
| Workbench | One of the purpose-built frontend applications (BI, ML, Data, Agentic, Control Plane UI, Data Plane UI) that provides a specialized user experience for a specific discipline. |
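
The TenantContext entry describes a context-variable object in Python services. The following is a minimal sketch of that idea using the standard-library `contextvars` module. The names `tenant_scope`, `current_tenant`, and `scoped_table` are hypothetical and chosen for illustration; they are not taken from the MATIH codebase.

```python
from contextvars import ContextVar
from typing import Optional

# Hypothetical context variable; the name is illustrative only.
_current_tenant: ContextVar[Optional[str]] = ContextVar("current_tenant", default=None)


class tenant_scope:
    """Context manager that sets the tenant ID for the enclosed block."""

    def __init__(self, tenant_id: str) -> None:
        self.tenant_id = tenant_id
        self._token = None

    def __enter__(self) -> "tenant_scope":
        self._token = _current_tenant.set(self.tenant_id)
        return self

    def __exit__(self, exc_type, exc, tb) -> None:
        # Restore whatever value was active before this scope was entered.
        _current_tenant.reset(self._token)


def current_tenant() -> str:
    """Return the active tenant ID, or fail loudly if none is set."""
    tenant_id = _current_tenant.get()
    if tenant_id is None:
        raise RuntimeError("no tenant in context")
    return tenant_id


def scoped_table(table: str) -> str:
    """A lower layer reads the tenant without it being passed as an argument."""
    return f"{current_tenant()}.{table}"


with tenant_scope("acme"):
    print(scoped_table("orders"))  # prints "acme.orders"
```

Failing when no tenant is set, rather than falling back to a default, is one way to keep an unscoped operation from silently touching another tenant's data.
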
Kubernetes and Infrastructure Terms
| Term | Definition |
|---|---|
| AKS | Azure Kubernetes Service; the managed Kubernetes offering on Microsoft Azure used by MATIH for production deployments. |
| cert-manager | A Kubernetes add-on that automates the management and issuance of TLS certificates from various issuing sources, including Let's Encrypt. |
| ClusterIssuer | A cert-manager resource that defines a certificate authority (e.g., Let's Encrypt) available cluster-wide for issuing TLS certificates. |
| ConfigMap | A Kubernetes object used to store non-confidential configuration data in key-value pairs, consumed by pods as environment variables or mounted files. |
| Container | A lightweight, standalone executable package that includes everything needed to run a piece of software: code, runtime, system tools, and libraries. |
| CRD (Custom Resource Definition) | A Kubernetes extension mechanism that allows the platform to define custom resource types (e.g., KafkaTopic, SparkApplication, FlinkDeployment). |
| DaemonSet | A Kubernetes workload that ensures a pod runs on every node (or a subset of nodes), used for node-level agents like log collectors and monitoring agents. |
| Deployment | A Kubernetes workload object that manages a set of identical pods, handling rolling updates, rollbacks, and desired replica count. |
| EKS | Elastic Kubernetes Service; the managed Kubernetes offering on Amazon Web Services. |
| GKE | Google Kubernetes Engine; the managed Kubernetes offering on Google Cloud Platform. |
| Helm | The package manager for Kubernetes that uses charts (templated YAML) to define, install, and upgrade applications. |
| Helm Chart | A collection of Kubernetes resource templates, values files, and metadata that defines how an application is deployed on Kubernetes. |
| HPA (Horizontal Pod Autoscaler) | A Kubernetes controller that automatically scales the number of pod replicas based on CPU, memory, or custom metrics. |
| Ingress | A Kubernetes resource that manages external HTTP(S) access to services within the cluster, typically implemented by an NGINX Ingress Controller. |
| Init Container | A specialized container that runs before app containers in a pod, used for setup tasks like database migration, secret retrieval, or dependency checking. |
| Job | A Kubernetes resource that creates one or more pods and ensures that a specified number of them successfully terminate, used for batch and one-off tasks. |
| Kubelet | The primary node agent that runs on each Kubernetes node, responsible for managing pods and their containers. |
| Namespace | A Kubernetes resource that provides a mechanism for isolating groups of resources within a single cluster. MATIH uses dedicated namespaces for control plane, data plane, monitoring, and per-tenant workloads. |
| Network Policy | A Kubernetes resource that specifies how groups of pods are allowed to communicate with each other and other network endpoints. |
| Node | A physical or virtual machine in a Kubernetes cluster that runs pods. |
| Node Pool | A group of nodes within a Kubernetes cluster that share the same configuration (VM size, OS, labels, taints). MATIH uses separate pools for system, compute, and GPU workloads. |
| Operator | A Kubernetes design pattern that uses custom controllers and CRDs to automate the management of complex stateful applications (e.g., Strimzi for Kafka, KubeRay for Ray). |
| PDB (Pod Disruption Budget) | A Kubernetes resource that limits the number of pods that can be down simultaneously during voluntary disruptions, ensuring availability during upgrades. |
| Persistent Volume (PV) | A piece of storage in the cluster that has been provisioned by an administrator or dynamically via a StorageClass. |
| Persistent Volume Claim (PVC) | A request for storage by a user, specifying size and access modes; binds to a PV. |
| Pod | The smallest deployable unit in Kubernetes, consisting of one or more containers that share network and storage. |
| RBAC (Kubernetes) | Kubernetes Role-Based Access Control that regulates access to Kubernetes API resources based on roles bound to users or service accounts. |
| Secret | A Kubernetes object that stores sensitive data (passwords, tokens, certificates) in base64-encoded form, consumed by pods as environment variables or mounted files. Base64 is an encoding, not encryption, so access to Secrets must still be restricted. |
| Service | A Kubernetes resource that provides stable networking for a set of pods, enabling load balancing and service discovery via DNS. |
| ServiceAccount | A Kubernetes identity assigned to pods, used for authenticating to the Kubernetes API and external services (via Workload Identity). |
| ServiceMonitor | A Prometheus Operator CRD that defines how Prometheus should discover and scrape metrics from Kubernetes services. |
| Sidecar | A container that runs alongside the main application container in a pod, providing cross-cutting functionality like logging, monitoring, or security (e.g., OPA sidecar). |
| StatefulSet | A Kubernetes workload for managing stateful applications (like databases), providing stable network identities, persistent storage, and ordered deployment. |
| Strimzi | A Kubernetes Operator for running Apache Kafka clusters, used by MATIH for all event streaming infrastructure. |
| Taint | A Kubernetes node property that repels pods unless those pods have a matching toleration, used to dedicate nodes to specific workloads (e.g., GPU nodes). |
| Toleration | A pod specification that allows the pod to be scheduled on nodes with matching taints. |
| VPA (Vertical Pod Autoscaler) | A Kubernetes controller that automatically adjusts CPU and memory resource requests for pods based on observed usage. |
| Workload Identity | A cloud-provider feature (Azure Workload Identity, GCP Workload Identity, AWS IRSA) that enables pods to authenticate to cloud APIs using Kubernetes service accounts without static credentials. |
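
The Secret entry mentions base64-encoded storage. As a small illustration of why that encoding offers no confidentiality on its own, the sketch below encodes and decodes a value with the standard library. The secret value is a made-up placeholder.

```python
import base64

# Hypothetical secret value, used only for illustration.
plaintext = "s3cr3t-password"

# This is the form a value takes in the `data` field of a Secret manifest.
encoded = base64.b64encode(plaintext.encode("utf-8")).decode("ascii")

# Anyone who can read the manifest can reverse the encoding without a key.
decoded = base64.b64decode(encoded).decode("utf-8")

print(encoded)
print(decoded == plaintext)  # prints True
```
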
Data Engineering Terms
| Term | Definition |
|---|---|
| Apache Airflow | An open-source workflow orchestration platform used by MATIH for scheduling and monitoring data pipelines. |
| Apache Flink | A distributed stream processing framework used by MATIH for real-time data processing, CDC, and streaming analytics. |
| Apache Iceberg | An open table format for large analytic datasets, providing ACID transactions, schema evolution, and time travel. MATIH uses Iceberg as the primary lakehouse format. |
| Apache Kafka | A distributed event streaming platform used by MATIH for inter-service communication, domain events, and data ingestion pipelines. |
| Apache Spark | A unified analytics engine for large-scale data processing, used by MATIH for batch ETL, feature engineering, and complex analytics. |
| Avro | A row-oriented data serialization format used for Kafka message serialization with Schema Registry-based schema evolution. |
| CDC (Change Data Capture) | A pattern for tracking changes in a source database and propagating them to downstream systems, implemented via Flink SQL in MATIH. |
| ClickHouse | A column-oriented OLAP database used by MATIH for high-performance analytical queries on pre-aggregated data. |
| Connector | A component that bridges the MATIH Platform with external data sources (databases, cloud storage, SaaS applications). |
| DAG (Directed Acyclic Graph) | A graph structure with directed edges and no cycles, used to define pipeline task dependencies in Airflow and Temporal. |
| Data Lakehouse | An architecture that combines the benefits of data lakes (cheap storage, schema-on-read) with data warehouses (ACID transactions, governance). MATIH uses Iceberg on object storage as its lakehouse layer. |
| Data Lineage | The end-to-end tracking of data from its origin through transformations to its final consumption, captured via OpenLineage. |
| Data Profiling | The process of examining a dataset to collect statistics (null counts, distinct values, distributions) for data quality assessment. |
| dbt (data build tool) | A transformation framework that enables analysts and engineers to write SQL-based transformations in a version-controlled project. |
| ELT (Extract, Load, Transform) | A data integration pattern where raw data is first loaded into the target system and then transformed in place, preferred over ETL for cloud-native analytics. |
| Hive Metastore | A metadata service that stores schema and partition information for tables, used by Trino for catalog resolution. |
| OpenLineage | An open standard for data lineage metadata collection and exchange, integrated with the MATIH Catalog Service. |
| OpenMetadata | An open-source metadata management platform used by MATIH for data discovery, governance, and collaboration. |
| Parquet | A columnar storage file format optimized for analytical workloads, used as the physical storage format for Iceberg tables. |
| Polaris | An open-source Iceberg catalog service that provides REST-based catalog management, credential vending, and access control for Iceberg tables. |
| Schema Registry | A service that manages schemas for Kafka messages, enabling schema evolution and compatibility checks. |
| StarRocks | A high-performance analytical database used by MATIH as an alternative OLAP engine for sub-second query responses on large datasets. |
| Temporal | A workflow orchestration platform used by MATIH for long-running, stateful data pipeline workflows with built-in retry and compensation. |
| Trino | A distributed SQL query engine (formerly PrestoSQL) used by MATIH as the federated query engine across all data sources. |
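
The DAG entry describes task dependencies with directed edges and no cycles. The sketch below shows how such a structure yields a valid execution order, using Python's standard-library `graphlib` (Python 3.9+). The task names are hypothetical, and this is not how Airflow itself schedules tasks; it only illustrates the underlying graph property.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical pipeline: each task maps to the set of tasks it depends on.
pipeline = {
    "extract_orders": set(),
    "extract_customers": set(),
    "join": {"extract_orders", "extract_customers"},
    "aggregate": {"join"},
    "publish": {"aggregate"},
}


def execution_order(dag):
    """Return the tasks in an order that respects every dependency."""
    return list(TopologicalSorter(dag).static_order())


def has_cycle(dag):
    """A graph with a cycle is not a DAG and has no valid execution order."""
    try:
        execution_order(dag)
    except CycleError:
        return True
    return False


order = execution_order(pipeline)
print(order)
print(has_cycle({"a": {"b"}, "b": {"a"}}))  # prints True
```

The two extract tasks have no dependency on each other, so either may come first; only the relative order of dependent tasks is guaranteed.
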
Machine Learning and AI Terms
| Term | Definition |
|---|---|
| A/B Testing (ML) | The practice of comparing two or more model versions by routing a percentage of traffic to each version and measuring performance differences. |
| Batch Inference | Running predictions on a large dataset at once, as opposed to real-time inference on individual requests. |
| Data Drift | A change in the statistical distribution of input data compared to the training data, which can degrade model performance. |
| Embedding | A dense vector representation of data (text, images, schemas) in a continuous vector space, used for similarity search and retrieval. |
| Experiment | A named collection of ML training runs that share a common objective, tracked in MLflow for comparison and reproducibility. |
| Feature Engineering | The process of creating, transforming, and selecting input features for machine learning models. |
| Feature Store | A centralized repository (Feast) for storing and serving feature values, ensuring consistency between training and serving. |
| Fine-Tuning | Adapting a pre-trained model to a specific domain or task by training on a smaller, domain-specific dataset. |
| GPU (Graphics Processing Unit) | A specialized processor used for parallel computation, essential for training and running large AI/ML models. |
| Guardrail | A safety mechanism that checks AI outputs for harmful content, hallucinations, data leakage, or policy violations before delivering to users. |
| Hallucination | An AI model generating information that is plausible-sounding but factually incorrect or unsupported by the input data. |
| Inference | The process of running a trained model on new input data to generate predictions or outputs. |
| LangGraph | A framework for building stateful, multi-agent AI applications using directed graphs, used by MATIH for agent orchestration. |
| LLM (Large Language Model) | A neural network model (e.g., GPT-4o, Claude, Gemini) trained on large text corpora, used by MATIH for natural language understanding and generation. |
| MLflow | An open-source platform for managing the ML lifecycle, including experiment tracking, model packaging, and model registry. |
| Model Registry | A centralized store for versioned ML models with metadata, stage management (staging, production, archived), and lineage. |
| Model Serving | The infrastructure for hosting trained models and making them available for real-time or batch inference. |
| Prompt Engineering | The practice of designing and optimizing input prompts to elicit desired outputs from large language models. |
| RAG (Retrieval-Augmented Generation) | A technique that enhances LLM responses by first retrieving relevant documents or data, then including them in the prompt context. |
| Ray | A distributed computing framework used by MATIH for scalable ML training, hyperparameter tuning, and model serving. |
| RLHF (Reinforcement Learning from Human Feedback) | A technique for aligning AI model behavior with human preferences by training a reward model from human feedback and then optimizing the model against that reward with reinforcement learning. |
| SSE (Server-Sent Events) | A server push technology for streaming data from server to client over HTTP, used for streaming AI responses. |
| Text-to-SQL | The AI capability of translating natural language questions into executable SQL queries. |
| Token | The smallest unit of text processed by an LLM; a word is typically 1-3 tokens. Token counts determine LLM API costs. |
| TOTP (Time-based One-Time Password) | An algorithm that generates short-lived, time-synchronized passwords for multi-factor authentication. |
| Transfer Learning | Applying knowledge from a pre-trained model to a new, related task, reducing the need for large training datasets. |
| Triton Inference Server | NVIDIA's model serving platform used by MATIH for high-performance GPU-accelerated inference. |
| Vector Database | A database optimized for storing and querying high-dimensional vectors, used for similarity search in RAG and semantic retrieval (e.g., Qdrant, ChromaDB). |
| vLLM | A high-throughput LLM serving engine used by MATIH for hosting open-source language models on GPU infrastructure. |
| WebSocket | A persistent, bidirectional communication protocol used by MATIH for real-time chat streaming between the Agentic Workbench and AI Service. |
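
The Embedding, RAG, and Vector Database entries all rely on similarity search over vectors. The sketch below illustrates the retrieval step with cosine similarity in plain Python. The three-dimensional vectors and table names are toy values invented for this example; real embeddings have hundreds or thousands of dimensions, and a vector database would use an approximate index rather than a full scan.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query, index, k=2):
    """Return the names of the k entries most similar to the query vector."""
    scored = [(cosine_similarity(query, vec), name) for name, vec in index.items()]
    scored.sort(reverse=True)
    return [name for _, name in scored[:k]]


# Toy "schema embeddings" keyed by table name (hypothetical values).
index = {
    "orders": [0.9, 0.1, 0.0],
    "customers": [0.1, 0.9, 0.0],
    "web_logs": [0.0, 0.1, 0.9],
}

# Hypothetical embedding of a question such as "total revenue by customer".
query = [0.8, 0.3, 0.0]

print(top_k(query, index))  # prints ['orders', 'customers']
```

In a RAG flow, the retrieved items (here, table names) would then be added to the LLM prompt as context.
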
Business Intelligence Terms
| Term | Definition |
|---|---|
| Dashboard | A visual layout containing one or more widgets (charts, tables, KPIs) that present data insights at a glance. |
| Dimension | A categorical attribute used for grouping and filtering data in BI queries (e.g., region, product category, date). |
| Drill-Down | The ability to navigate from a summary view to more detailed data by clicking on a chart element. |
| Filter | A condition applied to data to narrow the result set (e.g., date range, region selection, status). |
| KPI (Key Performance Indicator) | A quantifiable measure used to evaluate the success of an organization or process, displayed as a prominent metric on dashboards. |
| Measure | A quantitative value that can be aggregated (e.g., revenue, count of orders, average response time). Equivalent to "metric" in the Semantic Layer. |
| Metric | A named, reusable calculation defined in the Semantic Layer (e.g., total_revenue = SUM(orders.amount)). |
| Pivot Table | A data summarization tool that aggregates data by rows and columns, with measures computed at each intersection. |
| Report | A formatted, printable/exportable document containing data visualizations and narrative text. |
| Schedule | A recurring time-based trigger for report delivery via email, Slack, or webhook. |
| Semantic Model | A business-friendly abstraction defined in the Semantic Layer that maps business concepts (metrics, dimensions, entities) to underlying database tables. |
| Slice | A predefined filter combination that can be applied to a dashboard to view data from a specific perspective. |
| Widget | An individual visual component on a dashboard: bar chart, line chart, pie chart, table, KPI card, text, filter control, or map. |
| Workspace | A named container within a tenant for organizing dashboards, queries, pipelines, and other artifacts by project or team. |
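
The Metric and Dimension entries can be made concrete with a small query. The sketch below defines the `total_revenue` metric from the table above as a SQL aggregate and groups it by a `region` dimension, using an in-memory SQLite database. The table contents and the `query_metric` helper are assumptions for illustration and do not reflect the Semantic Layer's actual API.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("EMEA", 100.0), ("EMEA", 50.0), ("APAC", 70.0)],
)

# Metric: a named, reusable calculation.
METRICS = {"total_revenue": "SUM(orders.amount)"}
# Dimensions: categorical attributes allowed for grouping.
DIMENSIONS = {"region"}


def query_metric(metric, dimension):
    """Aggregate a named metric, grouped by a whitelisted dimension."""
    if dimension not in DIMENSIONS:
        raise ValueError(f"unknown dimension: {dimension}")
    # Identifiers come from the fixed definitions above, never from raw user input.
    sql = (
        f"SELECT {dimension}, {METRICS[metric]} FROM orders "
        f"GROUP BY {dimension} ORDER BY {dimension}"
    )
    return conn.execute(sql).fetchall()


print(query_metric("total_revenue", "region"))
# prints [('APAC', 70.0), ('EMEA', 150.0)]
```
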
Data Governance and Quality Terms
| Term | Definition |
|---|---|
| Access Request | A formal request submitted through the Governance Service to gain access to a protected dataset or resource, subject to approval workflows. |
| Anomaly Detection (Data Quality) | The automated identification of data values that deviate significantly from expected patterns, used to flag data quality issues. |
| Business Glossary | A curated dictionary of business terms and their definitions, maintained in the Catalog Service to provide shared understanding across teams. |
| Classification | A label applied to data columns or tables indicating the sensitivity level (e.g., Public, Internal, Confidential, Restricted, PII) used by the Governance Service for access control. |
| Column Masking | A governance policy that replaces sensitive column values with masked or redacted versions (e.g., replacing an SSN with `***-**-1234`) based on the requesting user's role. |
| Compliance Report | A generated document from the Audit Service that summarizes access patterns, policy violations, and data handling practices for regulatory compliance purposes. |
| Data Contract | A formal agreement between data producers and consumers that specifies schema, quality expectations, SLAs, and ownership for a dataset. |
| Data Freshness | A quality metric that measures how recently a dataset was updated, used to ensure downstream consumers are working with current data. |
| Data Owner | The person or team responsible for the accuracy, privacy, and lifecycle management of a specific dataset within the platform. |
| Data Quality Rule | A configurable validation check that evaluates data against expected criteria (not null, within range, referential integrity, custom SQL predicate). |
| Data Quality Score | A numeric score (0-100) computed by the Data Quality Service that summarizes the overall quality of a table based on the results of all applicable quality rules. |
| Data Steward | A role responsible for ensuring that data governance policies are implemented and followed within a business domain. |
| Metadata | Descriptive information about data (schemas, column types, tags, lineage, statistics) stored in the Catalog Service and used by AI agents for context. |
| Policy | A declarative rule managed by the Governance Service that controls access to data, defines masking behavior, or enforces retention periods. |
| Retention Policy | A governance rule that specifies how long data or audit events must be retained before they can be archived or deleted. |
| Row Filtering | A governance policy that limits the rows a user can see in a query result based on their attributes (e.g., a regional manager only sees data for their region). |
| Schema Drift | An unexpected change in the structure of a data source (added columns, changed types, renamed fields) that may break downstream pipelines or queries. |
| SLA (Data Quality) | A Service Level Agreement that defines the expected freshness, completeness, and accuracy thresholds for a dataset, with automated alerting when thresholds are breached. |
| Tag | A metadata label applied to tables, columns, or other catalog entities for categorization, search, and governance (e.g., pii, financial, deprecated). |
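
Column masking and row filtering are often applied together to a query result. The sketch below shows one way the two policies could combine. The rule that only `TENANT_ADMIN` sees unmasked values, the `apply_policies` helper, and the sample rows are assumptions made for illustration, not the Governance Service's actual behavior.

```python
def mask_ssn(ssn: str) -> str:
    """Keep the last four digits and mask the rest."""
    return "***-**-" + ssn[-4:]


# Assumption: roles in this set may see unmasked values.
UNMASKED_ROLES = {"TENANT_ADMIN"}


def apply_policies(rows, user_roles, user_region):
    """Apply row filtering first, then column masking, to a result set."""
    visible = []
    for row in rows:
        # Row filtering: the user only sees rows for their own region.
        if row["region"] != user_region:
            continue
        out = dict(row)
        # Column masking: redact the SSN unless the user holds a privileged role.
        if not (set(user_roles) & UNMASKED_ROLES):
            out["ssn"] = mask_ssn(row["ssn"])
        visible.append(out)
    return visible


rows = [
    {"name": "Ada", "region": "EMEA", "ssn": "123-45-1234"},
    {"name": "Bo", "region": "APAC", "ssn": "987-65-4321"},
]

print(apply_policies(rows, ["DATA_ENGINEER"], "EMEA"))
# prints [{'name': 'Ada', 'region': 'EMEA', 'ssn': '***-**-1234'}]
```
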
Networking and Protocol Terms
| Term | Definition |
|---|---|
| CIDR | Classless Inter-Domain Routing; a notation for specifying IP address ranges (e.g., 10.0.0.0/8) used in Kubernetes network policies and cloud networking. |
| DNS Zone | A portion of the DNS namespace delegated to a specific authority; MATIH creates per-tenant DNS zones for custom domain support. |
| FQDN | Fully Qualified Domain Name; the complete domain name for a Kubernetes service (e.g., ai-service.matih-data-plane.svc.cluster.local). |
| gRPC | A high-performance RPC framework using Protocol Buffers and HTTP/2, used by some MATIH internal services for low-latency communication. |
| HTTP/2 | The second major version of the HTTP protocol, supporting multiplexing, header compression, and server push; used by NGINX ingress for improved performance. |
| LoadBalancer | A Kubernetes service type that provisions an external load balancer (cloud-provider specific) to expose a service to the internet. |
| mTLS | Mutual TLS; a security protocol where both client and server authenticate each other via certificates, used for Kafka client-broker communication. |
| NGINX Ingress Controller | The Kubernetes ingress controller used by MATIH to route external HTTP/HTTPS traffic to internal services, with support for WebSocket, rate limiting, and TLS termination. |
| OTLP | OpenTelemetry Protocol; the standard protocol for transmitting telemetry data (traces, metrics, logs) from services to the OpenTelemetry Collector. |
| REST | Representational State Transfer; the architectural style used for all MATIH public APIs, based on HTTP methods and JSON payloads. |
| Service Mesh | A dedicated infrastructure layer for managing service-to-service communication, providing observability, security, and traffic management (considered for future adoption). |
| WebSocket | A persistent, full-duplex communication protocol over a single TCP connection, used for real-time AI chat streaming and collaborative dashboard editing. |
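
CIDR notation is easiest to understand by testing addresses against a range. The sketch below uses Python's standard-library `ipaddress` module with the `10.0.0.0/8` example from the table above. The `in_range` helper and the sample addresses are illustrative only.

```python
import ipaddress


def in_range(ip: str, cidr: str) -> bool:
    """Return True if the IP address falls inside the CIDR block."""
    return ipaddress.ip_address(ip) in ipaddress.ip_network(cidr)


network = ipaddress.ip_network("10.0.0.0/8")

# A /8 prefix fixes the first 8 bits, leaving 24 bits for hosts.
print(network.num_addresses)                   # prints 16777216
print(network.netmask)                         # prints 255.0.0.0
print(in_range("10.42.7.9", "10.0.0.0/8"))     # prints True
print(in_range("192.168.1.1", "10.0.0.0/8"))   # prints False
```
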
Security Terms
| Term | Definition |
|---|---|
| ABAC (Attribute-Based Access Control) | An authorization model that evaluates policies based on attributes of the user, resource, action, and environment. Implemented via OPA in MATIH. |
| Bearer Token | An HTTP authentication scheme where the client sends a token (typically JWT) in the Authorization: Bearer {token} header. |
| CORS (Cross-Origin Resource Sharing) | An HTTP mechanism that allows a web application running at one origin to access resources from a different origin. |
| CSRF (Cross-Site Request Forgery) | An attack that tricks a user into performing unintended actions on a web application where they are authenticated. Prevented via state parameters in OAuth2 flows. |
| Defense in Depth | A security strategy that employs multiple layers of defense (network policies, RBAC, RLS, encryption) so that if one layer fails, others still protect the system. |
| Encryption at Rest | Protecting stored data by encrypting it on disk, implemented via cloud-provider managed encryption keys for all MATIH storage. |
| Encryption in Transit | Protecting data during transmission by encrypting network traffic, implemented via TLS for all MATIH service-to-service and client-to-server communication. |
| GDPR | General Data Protection Regulation; European Union regulation governing the collection, storage, and processing of personal data. |
| HIPAA | Health Insurance Portability and Accountability Act; U.S. regulation governing the protection of health information. |
| JWT (JSON Web Token) | A compact, URL-safe token format used by MATIH for authentication, containing signed claims about the user's identity, tenant, and permissions. |
| MFA (Multi-Factor Authentication) | An authentication method requiring two or more verification factors (password + TOTP code, biometric, etc.). |
| Network Policy | A Kubernetes resource that controls network traffic between pods, implementing micro-segmentation for zero-trust security. |
| OAuth2 | An authorization framework that enables third-party applications to obtain limited access to a web service, used for SSO integration. |
| OPA (Open Policy Agent) | A general-purpose policy engine used by MATIH for fine-grained authorization decisions beyond simple RBAC. |
| PII (Personally Identifiable Information) | Data that can identify an individual (name, email, SSN), subject to special handling and masking policies in the Governance Service. |
| RBAC (Application) | Role-Based Access Control at the application level, where users are assigned roles (TENANT_ADMIN, DATA_ENGINEER) that grant specific permissions. |
| Rego | The declarative policy language used by OPA for writing authorization policies. |
| RLS (Row-Level Security) | A PostgreSQL feature that restricts which rows a user can access in a table, used as defense-in-depth for multi-tenant data isolation. |
| SAML | Security Assertion Markup Language; an XML-based standard for exchanging authentication and authorization data between identity providers and service providers. |
| SCIM (System for Cross-domain Identity Management) | A standard protocol for automating the exchange of user identity information between identity domains and IT systems. |
| SOC 2 | System and Organization Controls 2 (formerly Service Organization Control 2); an AICPA auditing framework for technology companies that evaluates how organizations manage customer data. |
| TLS (Transport Layer Security) | A cryptographic protocol that provides secure communication over a network, used for all MATIH HTTPS endpoints and inter-service communication. |
| Workload Identity | A cloud-native mechanism for assigning cloud IAM identities to Kubernetes pods without static credentials. |
| Zero Trust | A security model that assumes no implicit trust and requires verification for every access request, regardless of network location. |
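
The JWT entry describes a token containing signed claims. The sketch below builds and verifies a token with the HS256 algorithm using only the standard library, to show the `header.payload.signature` structure. The choice of HS256, the key, and the claim names (`sub`, `tenant_id`) are assumptions for illustration; the source does not state which signing algorithm or claims MATIH uses, and a production service should rely on a maintained JWT library that also checks expiry and audience.

```python
import base64
import hashlib
import hmac
import json


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWT requires."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    """Restore the stripped padding before decoding."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign(claims: dict, key: bytes) -> str:
    """Return a compact JWT: header.payload.signature."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )
    signature = hmac.new(key, signing_input.encode("ascii"), hashlib.sha256).digest()
    return signing_input + "." + _b64url(signature)


def verify(token: str, key: bytes) -> dict:
    """Check the signature and return the claims; raise ValueError if invalid."""
    signing_input, _, signature_part = token.rpartition(".")
    expected = hmac.new(key, signing_input.encode("ascii"), hashlib.sha256).digest()
    # Constant-time comparison avoids leaking how many bytes matched.
    if not hmac.compare_digest(expected, _b64url_decode(signature_part)):
        raise ValueError("invalid signature")
    return json.loads(_b64url_decode(signing_input.split(".")[1]))


key = b"demo-key-not-for-production"  # hypothetical shared secret
token = sign({"sub": "user-1", "tenant_id": "acme"}, key)
print(verify(token, key)["tenant_id"])  # prints "acme"
```

The payload is only encoded, not encrypted, so anyone holding the token can read its claims; the signature protects integrity, not confidentiality.
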
Acronyms
| Acronym | Expansion |
|---|---|
| ACR | Azure Container Registry |
| ADR | Architecture Decision Record |
| AKS | Azure Kubernetes Service |
| API | Application Programming Interface |
| ARB | Architecture Review Board |
| BI | Business Intelligence |
| CDC | Change Data Capture |
| CI/CD | Continuous Integration / Continuous Deployment |
| CLI | Command Line Interface |
| CNCF | Cloud Native Computing Foundation |
| CRD | Custom Resource Definition |
| DAG | Directed Acyclic Graph |
| DNN | Deep Neural Network |
| DNS | Domain Name System |
| EKS | Elastic Kubernetes Service |
| ELT | Extract, Load, Transform |
| ESO | External Secrets Operator |
| FQDN | Fully Qualified Domain Name |
| GKE | Google Kubernetes Engine |
| GPU | Graphics Processing Unit |
| gRPC | gRPC Remote Procedure Calls |
| HPA | Horizontal Pod Autoscaler |
| HTTP | Hypertext Transfer Protocol |
| IAM | Identity and Access Management |
| IRSA | IAM Roles for Service Accounts (AWS) |
| JSON | JavaScript Object Notation |
| JWT | JSON Web Token |
| K8s | Kubernetes (abbreviation) |
| KPI | Key Performance Indicator |
| LLM | Large Language Model |
| MFA | Multi-Factor Authentication |
| ML | Machine Learning |
| NS | Nameserver |
| OLAP | Online Analytical Processing |
| OLTP | Online Transaction Processing |
| OPA | Open Policy Agent |
| PDB | Pod Disruption Budget |
| PII | Personally Identifiable Information |
| PV | Persistent Volume |
| PVC | Persistent Volume Claim |
| QoS | Quality of Service |
| RAG | Retrieval-Augmented Generation |
| RBAC | Role-Based Access Control |
| REST | Representational State Transfer |
| RLS | Row-Level Security |
| RPO | Recovery Point Objective |
| RTO | Recovery Time Objective |
| SAML | Security Assertion Markup Language |
| SCIM | System for Cross-domain Identity Management |
| SDK | Software Development Kit |
| SLA | Service Level Agreement |
| SLO | Service Level Objective |
| SOC | System and Organization Controls |
| SQL | Structured Query Language |
| SRE | Site Reliability Engineering |
| SSE | Server-Sent Events |
| SSO | Single Sign-On |
| TLS | Transport Layer Security |
| TOTP | Time-based One-Time Password |
| TTL | Time to Live |
| UI | User Interface |
| VPA | Vertical Pod Autoscaler |
| WS | WebSocket |
| YAML | YAML Ain't Markup Language |