Platform Architecture

SDL is a layered, modular platform designed for tactical and enterprise data operations. This page provides a high-level architectural overview with pointers to detailed documentation for each capability area.

Architecture Overview

The platform is organized into horizontal layers, each responsible for a distinct set of capabilities. Layers communicate through well-defined interfaces, enabling independent deployment, scaling, and evolution.

Figure 1. Platform Architecture

Architecture Layers

Ingestion Layer

The ingestion layer connects SDL to external data sources. It provides connectors for event streams, gRPC, TAK servers, TCP, PostGIS, and other sources and protocols.

The layer is format-agnostic: 14+ tactical data formats (including CoT, AIS, ADS-B, Link-16, GCCS-J, SIGINT, and others) are converted to the canonical data model through configurable transformers. This decouples the platform from source-specific formats, enabling new sources to be added without modifying downstream components.
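As an illustration of how per-format transformers can feed one canonical model, here is a minimal sketch. The `CanonicalEntity` fields, the registry, and the message field names are all hypothetical; they are not SDL's actual data model or transformer API.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class CanonicalEntity:
    """Hypothetical canonical record; the real data model is assumed to be richer."""
    entity_id: str
    lat: float
    lon: float
    source_format: str


# Registry mapping a source format name to its transformer function.
TRANSFORMERS: Dict[str, Callable[[dict], CanonicalEntity]] = {}


def transformer(fmt: str):
    """Register the decorated function as the transformer for one source format."""
    def register(fn):
        TRANSFORMERS[fmt] = fn
        return fn
    return register


@transformer("cot")
def from_cot(msg: dict) -> CanonicalEntity:
    # Illustrative field names only; a real CoT event is XML, not a dict.
    point = msg["point"]
    return CanonicalEntity(msg["uid"], float(point["lat"]), float(point["lon"]), "cot")


@transformer("ais")
def from_ais(msg: dict) -> CanonicalEntity:
    # Illustrative field names only.
    return CanonicalEntity(f"mmsi-{msg['mmsi']}", float(msg["lat"]), float(msg["lon"]), "ais")


def to_canonical(fmt: str, msg: dict) -> CanonicalEntity:
    """Dispatch to the registered transformer; an unknown format raises KeyError."""
    return TRANSFORMERS[fmt](msg)


track = to_canonical("ais", {"mmsi": 366999712, "lat": 36.85, "lon": -76.3})
```

In this sketch, supporting a new source means registering one more function; code that consumes `CanonicalEntity` does not change.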

Processing Layer

The processing layer handles data transformation, enrichment, and schema validation. It provides two integration paths:

Data Model Path: Data is converted to the canonical data model through configurable transformers, so downstream components work against a single schema.

Native Transformation Path: Data flows through transformers without being forced into the data model. This supports coalition interoperability, legacy system integration, exploratory data ingestion, and high-fidelity passthrough where downstream systems require the source format.

Both paths are processed through the transformation hub, which manages pipeline orchestration, error handling, and monitoring.
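The routing and error-handling role of the hub can be sketched as follows. This is a simplified illustration: `run_hub`, `HubResult`, and the `path` field used to pick a route are hypothetical names, not the hub's real interface.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, List, Tuple


@dataclass
class HubResult:
    delivered: List[Tuple[str, Any]] = field(default_factory=list)
    errors: List[Tuple[Any, str]] = field(default_factory=list)


def run_hub(records, to_model: Callable[[dict], dict], use_model: Callable[[dict], bool]) -> HubResult:
    """Route each record down one path; a failing record is recorded, not fatal."""
    result = HubResult()
    for rec in records:
        try:
            if use_model(rec):
                result.delivered.append(("model", to_model(rec)))
            else:
                # Native path: the record is passed through in its source form.
                result.delivered.append(("native", rec))
        except Exception as exc:
            result.errors.append((rec, f"{type(exc).__name__}: {exc}"))
    return result


def to_model(rec: dict) -> dict:
    # Hypothetical conversion; raises KeyError when a required field is missing.
    return {"id": rec["uid"], "lat": float(rec["lat"])}


records = [
    {"path": "model", "uid": "a1", "lat": "10.0"},
    {"path": "native", "raw": "<event/>"},
    {"path": "model", "uid": "a2"},  # missing "lat", so conversion fails
]
result = run_hub(records, to_model, lambda r: r["path"] == "model")
```

Collecting failures alongside successes keeps one bad record from stalling the pipeline and gives monitoring something concrete to count.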

Query and Analytics Layer

The query and analytics layer provides multiple access patterns for consuming data:

Federated SQL — A distributed SQL engine that queries across heterogeneous data sources (relational databases, object storage, event streams) through a single SQL interface.
Virtual Knowledge Graph (VKG) — Ontology-Based Data Access (OBDA) maps data to BFO/CCO ontology terms, enabling SPARQL queries across data from both the Data Model Path and Native Transformation Path.
Real-time analytics — Low-latency analytics on streaming data for operational dashboards and alerts.
Data visualization — Interactive dashboards for operational monitoring, data exploration, and reporting.

Storage Layer

The storage layer implements a tiered architecture for cost-effective, performance-optimized data retention:

Hot tier — Low-latency access for active operational data. Recent entities, tasks, and streaming updates.
Warm tier — Balanced latency and cost for recent historical data. Queryable through federated SQL.
Cold tier — Object storage (S3-compatible) for long-term retention and archival.
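One plausible way to assign data to a tier is by age. The sketch below assumes age-based placement and uses made-up thresholds (`HOT_WINDOW`, `WARM_WINDOW`); the platform's actual tiering policy and cutoffs are not stated here.

```python
from datetime import datetime, timedelta, timezone

# Assumed thresholds for illustration only.
HOT_WINDOW = timedelta(hours=24)
WARM_WINDOW = timedelta(days=30)


def select_tier(last_updated: datetime, now: datetime) -> str:
    """Return the storage tier for a record based on how long ago it was updated."""
    age = now - last_updated
    if age <= HOT_WINDOW:
        return "hot"
    if age <= WARM_WINDOW:
        return "warm"
    return "cold"


now = datetime(2024, 1, 31, tzinfo=timezone.utc)
tiers = [
    select_tier(now - timedelta(hours=1), now),
    select_tier(now - timedelta(days=7), now),
    select_tier(now - timedelta(days=90), now),
]
```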

The metadata lake follows a Bronze-Silver-Gold pattern:

Bronze — Raw ingested data in source format.
Silver — Cleaned, validated, and schema-conformant data.
Gold — Enriched, correlated, and analytics-ready data products.
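The Bronze-to-Silver-to-Gold progression can be sketched with two small functions. The record fields and the validation rules below are assumptions chosen for illustration, not the platform's schema.

```python
from collections import defaultdict
from typing import Dict, List, Optional


def to_silver(raw: dict) -> Optional[dict]:
    """Validate and normalize one Bronze record; return None if it fails validation."""
    try:
        lat, lon = float(raw["lat"]), float(raw["lon"])
    except (KeyError, TypeError, ValueError):
        return None
    if not (-90 <= lat <= 90 and -180 <= lon <= 180):
        return None
    return {"entity_id": str(raw["id"]).strip(), "lat": lat, "lon": lon}


def to_gold(silver: List[dict]) -> Dict[str, dict]:
    """Correlate Silver records by entity into a per-entity summary product."""
    grouped = defaultdict(list)
    for rec in silver:
        grouped[rec["entity_id"]].append(rec)
    return {
        eid: {"observations": len(recs), "last_position": (recs[-1]["lat"], recs[-1]["lon"])}
        for eid, recs in grouped.items()
    }


bronze = [
    {"id": " t1 ", "lat": "10.0", "lon": "20.0"},
    {"id": "t1", "lat": "10.5", "lon": "20.5"},
    {"id": "t2", "lat": "999", "lon": "0"},  # out of range, dropped at Silver
    {"id": "t3", "lon": "5"},                # missing lat, dropped at Silver
]
silver = [rec for rec in map(to_silver, bronze) if rec is not None]
gold = to_gold(silver)
```

Bronze is left untouched in this sketch, so a stricter or corrected Silver rule can be replayed over the raw records later.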

Security and Governance Layer

Security and governance are enforced across all other layers:

Policy engine — Attribute-based access control (ABAC) with fine-grained policy evaluation.
Classification markings — ISM-format security markings on every entity and task.
Obfuscation — Row-level and column-level data obfuscation based on the requester’s clearance and need-to-know.
Audit logging — Complete data lineage and access tracking for compliance and after-action review.
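To make the interaction between attribute-based checks and obfuscation concrete, here is a deliberately simplified sketch. The `LEVELS` ordering, the `Requester` attributes, and the per-column markings are assumptions; real ISM markings and the policy engine's rules are considerably richer.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional

# Simplified linear ordering, assumed for illustration.
LEVELS = {"U": 0, "C": 1, "S": 2, "TS": 3}


@dataclass(frozen=True)
class Requester:
    clearance: str
    compartments: FrozenSet[str] = frozenset()


def can_read(req: Requester, classification: str,
             required: FrozenSet[str] = frozenset()) -> bool:
    """Clearance must dominate the marking, and need-to-know must cover `required`."""
    return LEVELS[req.clearance] >= LEVELS[classification] and required <= req.compartments


def obfuscate(row: dict, column_marks: dict, req: Requester) -> Optional[dict]:
    """Row level: hide the whole row. Column level: mask individual fields."""
    if not can_read(req, row["classification"]):
        return None
    return {
        key: (value if can_read(req, column_marks.get(key, "U")) else "[REDACTED]")
        for key, value in row.items()
    }


row = {"classification": "C", "callsign": "ALPHA-1", "source": "sensor-7"}
marks = {"source": "S"}  # hypothetical: this column is marked higher than the row

view_u = obfuscate(row, marks, Requester("U"))
view_c = obfuscate(row, marks, Requester("C"))
view_s = obfuscate(row, marks, Requester("S"))
```

Here the uncleared requester gets no row at all, the `C` requester gets the row with `source` masked, and the `S` requester gets the row unchanged.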

API Layer

The API layer provides unified access to all platform capabilities:

REST API — JSON over HTTP for web integrations, scripting, and ad-hoc queries.
gRPC API — High-performance protobuf-based access with bidirectional streaming.
SDKs — Client libraries in Go, Java, and Python with built-in connection management, retry logic, and TLS configuration.

Capability-to-Documentation Mapping

Use the following table to navigate from a capability area to its detailed documentation:

Capability	Documentation
Event Streaming	Event Streaming
Object Storage	Object Storage
Federated SQL	Federated SQL Engine
Operational Monitoring	Operational Monitoring
Data Science Notebooks	JupyterHub
Identity and Access Management	Identity Management
Data Catalog	Data Catalog
Data Pipelines	Transformation Pipelines
GeoServer	GeoServer

Hierarchical Mesh-Federation Architecture

SDL deployments typically follow a hierarchical architecture that mirrors military echelon structures.

Figure 2. Hierarchical Architecture

Echelon Hierarchy

The architecture supports multi-echelon deployments where each level operates as an autonomous SDL instance:

Platoon: Edge-deployed nodes with constrained compute and bandwidth. Nodes within a platoon sync bidirectionally to maintain a shared operational picture. Platoon nodes push upward to the Company echelon through unidirectional edges.
Company: Aggregates data from multiple platoon nodes. Company-level nodes maintain the combined picture from all subordinate platoons. Company nodes push upward to Battalion through unidirectional edges.
Battalion and Above: Enterprise-level aggregation with full compute and bandwidth capacity. Receives data from all subordinate echelons and provides the widest operational picture. May also push data downward (e.g., tasking, intelligence products) through directed edges.

Mesh Within, Federation Between

Each echelon level runs its own internal mesh:

Within an echelon: Nodes sync bidirectionally, forming a fully connected mesh. Any node’s update propagates to all other nodes at the same echelon level.
Between echelons: Directed (typically unidirectional) edges connect echelons. Data flows upward for aggregation and selectively downward for tasking and dissemination.
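The propagation rules above can be modeled as a directed graph: a full mesh inside each echelon plus one-way uplinks between them. This is a toy model with hypothetical node names and helper functions, not the platform's topology configuration.

```python
from collections import deque
from typing import Dict, List, Set, Tuple


def build_topology(echelons: Dict[str, List[str]],
                   uplinks: List[Tuple[str, str]]) -> Dict[str, Set[str]]:
    """Directed adjacency map: full mesh within each echelon, one-way edges between."""
    graph: Dict[str, Set[str]] = {n: set() for nodes in echelons.values() for n in nodes}
    for nodes in echelons.values():
        for a in nodes:
            graph[a].update(b for b in nodes if b != a)  # bidirectional peers
    for src, dst in uplinks:
        graph[src].add(dst)  # unidirectional push up
    return graph


def reachable(graph: Dict[str, Set[str]], origin: str) -> Set[str]:
    """Every node that eventually receives an update written at `origin`."""
    seen = {origin}
    queue = deque([origin])
    while queue:
        for nxt in graph[queue.popleft()]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


echelons = {"platoon": ["p1", "p2", "p3"], "company": ["c1", "c2"], "battalion": ["b1"]}
graph = build_topology(echelons, uplinks=[("p1", "c1"), ("c1", "b1")])

from_platoon = reachable(graph, "p2")
from_company = reachable(graph, "c2")
from_battalion = reachable(graph, "b1")
```

An update written at `p2` reaches every node, one written at `c2` reaches only the company and battalion nodes, and one written at `b1` stays at battalion because no push-down edge is configured in this example.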

This architecture provides resilience at each level — if the inter-echelon link goes down, nodes within an echelon continue to operate and sync with each other. When the link recovers, delta sync brings the higher echelon back up to date.
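Delta sync after a link outage can be sketched with a per-link cursor over an append-only change log. The mechanism shown (monotonic sequence numbers, `Node`, `Uplink`) is an assumption for illustration; how the platform actually computes deltas is not described here.

```python
from typing import Dict, List, Tuple


class Node:
    """A node that records every local write in an append-only log."""

    def __init__(self) -> None:
        self.seq = 0
        self.log: List[Tuple[int, str, dict]] = []
        self.state: Dict[str, dict] = {}

    def write(self, key: str, value: dict) -> None:
        self.seq += 1
        self.log.append((self.seq, key, value))
        self.state[key] = value

    def delta_since(self, cursor: int) -> List[Tuple[int, str, dict]]:
        return [entry for entry in self.log if entry[0] > cursor]


class Uplink:
    """One-way link that remembers the last sequence number the parent has applied."""

    def __init__(self, child: Node, parent_state: Dict[str, dict]) -> None:
        self.child = child
        self.parent_state = parent_state
        self.cursor = 0

    def sync(self) -> int:
        """Send only the entries the parent has not seen; return how many were sent."""
        delta = self.child.delta_since(self.cursor)
        for seq, key, value in delta:
            self.parent_state[key] = value
            self.cursor = seq
        return len(delta)


platoon = Node()
company_state: Dict[str, dict] = {}
link = Uplink(platoon, company_state)

platoon.write("t1", {"lat": 1.0})
first_sync = link.sync()

# Link is down: the platoon keeps writing locally.
platoon.write("t1", {"lat": 2.0})
platoon.write("t2", {"lat": 3.0})

# Link recovers: only the two missed entries are sent.
catch_up = link.sync()
```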

Bidirectional vs. Unidirectional Edges

Edge Type	Usage
Bidirectional	Within an echelon, between peers at the same classification level. Full convergence — both nodes have the same data.
Unidirectional (push up)	From lower echelon to higher echelon. Data flows from edge to enterprise for aggregation.
Unidirectional (push down)	From higher echelon to lower echelon. Tasking, intelligence products, and reference data flow from enterprise to edge.
Cross-domain	Between classification levels. Data flows through cross-domain guards with XSD validation. Typically unidirectional from lower to higher classification.
