Federated Query & Virtual Knowledge Graph Demo
This guide walks through a step-by-step demonstration of SDL federated query capabilities. It showcases "zero ETL": querying data where it lives, without moving or duplicating it.
Prerequisites
- SDL deployment with federated SQL and VKG engine enabled
- Sample data loaded in at least two different data sources (for example, a relational database and a streaming topic)
- SPARQL endpoint access
- curl or an SDK client installed
Demo Overview
This demo illustrates the following capabilities:
- Federated SQL: query across multiple heterogeneous data sources with a single SQL statement.
- Cross-source joins: join data from different source types without pre-staging or ETL.
- Virtual Knowledge Graph (VKG): query an ontology-mapped knowledge graph using SPARQL.
- Natural language query: ask questions in plain English and receive structured results.
- Asynchronous analytics: submit long-running queries and retrieve results when ready.
Step 1: Identify Data Sources
List the data sources registered with the federated query engine.
curl -s https://sdl.example.com/api/v1/query/sources | jq .
Example response:
{
  "sources": [
    {
      "name": "operations_db",
      "type": "relational",
      "description": "Operational relational database"
    },
    {
      "name": "sensor_stream",
      "type": "streaming",
      "description": "Real-time sensor data topic"
    },
    {
      "name": "reports_store",
      "type": "object-storage",
      "description": "Archived reports in object storage"
    }
  ]
}
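The prerequisites call for sample data in at least two data sources, so it can help to check the listing before moving on. The following is a minimal Python sketch that assumes the response has exactly the shape shown above; `group_sources_by_type` is a hypothetical helper written for this guide, not part of any SDL SDK.

```python
import json
from collections import defaultdict

# Sample payload copied from the example response above.
SAMPLE_RESPONSE = """
{
  "sources": [
    {"name": "operations_db", "type": "relational",
     "description": "Operational relational database"},
    {"name": "sensor_stream", "type": "streaming",
     "description": "Real-time sensor data topic"},
    {"name": "reports_store", "type": "object-storage",
     "description": "Archived reports in object storage"}
  ]
}
"""


def group_sources_by_type(payload: dict) -> dict:
    """Map each source type to the list of source names of that type."""
    grouped = defaultdict(list)
    for source in payload.get("sources", []):
        grouped[source["type"]].append(source["name"])
    return dict(grouped)


payload = json.loads(SAMPLE_RESPONSE)
sources_by_type = group_sources_by_type(payload)

# The demo needs at least two registered sources for the cross-source join.
has_enough_sources = len(payload["sources"]) >= 2
print(sources_by_type, has_enough_sources)
```

Grouping by type makes it easy to see whether a relational source and a streaming source are both available for the join in Step 3.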
Step 2: Simple Federated SQL Query
Run a SQL query that targets a single data source through the federated engine.
curl -X POST https://sdl.example.com/api/v1/query/sql \
  -H "Content-Type: application/json" \
  -d '{
    "query": "SELECT sensor_id, temperature, location, timestamp FROM operations_db.sensor_readings WHERE temperature > 80.0 ORDER BY timestamp DESC LIMIT 10"
  }'
The federated engine routes the query to the appropriate data source and returns results in a unified format.
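For readers who prefer a script to curl, the same request can be assembled with the Python standard library. This is a sketch under assumptions: the endpoint path and JSON body come from the curl example above, `build_sql_request` is a hypothetical helper, and authentication (if the deployment requires any) is not covered.

```python
import json
import urllib.request

BASE_URL = "https://sdl.example.com/api/v1"  # placeholder host used in this guide


def build_sql_request(sql: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) the POST request for the federated SQL endpoint."""
    body = json.dumps({"query": sql}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/query/sql",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_sql_request(
    "SELECT sensor_id, temperature, location, timestamp "
    "FROM operations_db.sensor_readings "
    "WHERE temperature > 80.0 ORDER BY timestamp DESC LIMIT 10"
)

# To run it against a real deployment, send the request and decode the JSON body:
# with urllib.request.urlopen(request) as response:
#     result = json.load(response)
```

Building the body with `json.dumps` avoids hand-escaping quotes inside the SQL string, which is easy to get wrong in a shell one-liner.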
Step 3: Cross-Source Join
Join data from a relational database and a streaming topic in a single query.
curl -X POST https://sdl.example.com/api/v1/query/sql \
  -H "Content-Type: application/json" \
  -d '{
    "query": "SELECT r.sensor_id, r.location, s.current_temp, s.event_time FROM operations_db.sensor_registry r JOIN sensor_stream.readings s ON r.sensor_id = s.sensor_id WHERE s.current_temp > r.threshold ORDER BY s.event_time DESC LIMIT 20"
  }'
This query joins static registration data from the relational database with live readings from the streaming topic — without moving data between systems.
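To make the semantics of that join concrete, the sketch below reproduces it in plain Python over a few made-up rows. It illustrates what the query asks for, not how the federated engine executes it; the sample rows and the `join_over_threshold` helper are invented for this example.

```python
from operator import itemgetter

# Made-up rows standing in for operations_db.sensor_registry.
registry = [
    {"sensor_id": "s-1", "location": "building-7", "threshold": 80.0},
    {"sensor_id": "s-2", "location": "building-9", "threshold": 90.0},
]

# Made-up rows standing in for sensor_stream.readings.
readings = [
    {"sensor_id": "s-1", "current_temp": 82.5, "event_time": "2026-02-26T14:30:00Z"},
    {"sensor_id": "s-1", "current_temp": 79.0, "event_time": "2026-02-26T14:31:00Z"},
    {"sensor_id": "s-2", "current_temp": 91.2, "event_time": "2026-02-26T14:32:00Z"},
    {"sensor_id": "s-3", "current_temp": 99.9, "event_time": "2026-02-26T14:33:00Z"},
]


def join_over_threshold(registry, readings, limit=20):
    """Inner-join readings to registry rows on sensor_id, keeping over-threshold readings."""
    registry_by_id = {row["sensor_id"]: row for row in registry}
    joined = []
    for reading in readings:
        sensor = registry_by_id.get(reading["sensor_id"])
        if sensor is None:
            continue  # inner join: readings without a registry row are dropped
        if reading["current_temp"] > sensor["threshold"]:
            joined.append({
                "sensor_id": sensor["sensor_id"],
                "location": sensor["location"],
                "current_temp": reading["current_temp"],
                "event_time": reading["event_time"],
            })
    # ORDER BY s.event_time DESC; ISO 8601 UTC strings in one format sort chronologically.
    joined.sort(key=itemgetter("event_time"), reverse=True)
    return joined[:limit]


rows = join_over_threshold(registry, readings)
for row in rows:
    print(row)
```

Only two rows survive: the `s-1` reading at 79.0 is below its threshold, and `s-3` has no registry entry, so the inner join drops it.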
Step 4: SPARQL Ontology Query
Query the virtual knowledge graph using SPARQL.
curl -X POST https://sdl.example.com/api/v1/query/sparql \
  -H "Content-Type: application/sparql-query" \
  -d 'PREFIX sdl: <http://sdl.example.com/ontology#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?entity ?type ?location
WHERE {
  ?entity a sdl:SensorReading ;
          sdl:locatedIn ?location ;
          sdl:readingType ?type .
  FILTER (?type = sdl:Temperature)
}
LIMIT 25'
The VKG engine maps the SPARQL query to the underlying data sources through the ontology, returning results without requiring the user to know where the data physically resides.
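This guide does not show the response format for the SPARQL endpoint. If the endpoint returns the standard W3C SPARQL 1.1 Query Results JSON format (an assumption worth verifying against your deployment), the bindings can be flattened into plain rows as sketched below. The sample bindings and the `flatten_bindings` helper are made up for illustration.

```python
# Made-up response in the W3C SPARQL 1.1 Query Results JSON layout.
sample_results = {
    "head": {"vars": ["entity", "type", "location"]},
    "results": {
        "bindings": [
            {
                "entity": {"type": "uri", "value": "http://sdl.example.com/data/reading-1"},
                "type": {"type": "uri", "value": "http://sdl.example.com/ontology#Temperature"},
                "location": {"type": "literal", "value": "building-7"},
            },
            {
                # A solution may leave a variable unbound; here ?location is missing.
                "entity": {"type": "uri", "value": "http://sdl.example.com/data/reading-2"},
                "type": {"type": "uri", "value": "http://sdl.example.com/ontology#Temperature"},
            },
        ]
    },
}


def flatten_bindings(sparql_json: dict) -> list:
    """Turn SPARQL JSON bindings into a list of {variable: value} dictionaries."""
    variables = sparql_json["head"]["vars"]
    rows = []
    for binding in sparql_json["results"]["bindings"]:
        # Unbound variables become None so every row has the same keys.
        rows.append({var: binding.get(var, {}).get("value") for var in variables})
    return rows


flat_rows = flatten_bindings(sample_results)
for row in flat_rows:
    print(row)
```

Flattening discards the term type (`uri` or `literal`), so keep the original bindings if that distinction matters downstream.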
Step 5: Natural Language Query
Submit a question in plain English and let the platform translate it into the appropriate query.
curl -X POST https://sdl.example.com/api/v1/query/natural-language \
  -H "Content-Type: application/json" \
  -d '{
    "question": "Which sensors in building 7 have reported temperatures above 80 degrees in the last hour?"
  }'
Example response:
{
  "answer": "3 sensors in building 7 reported temperatures above 80\u00b0F in the last hour.",
  "generated_query": "SELECT sensor_id, temperature, timestamp FROM operations_db.sensor_readings WHERE location = 'building-7' AND temperature > 80.0 AND timestamp >= NOW() - INTERVAL '1 hour'",
  "results": [
    { "sensor_id": "sensor-alpha-01", "temperature": 82.1, "timestamp": "2026-02-26T14:32:00Z" },
    { "sensor_id": "sensor-alpha-04", "temperature": 85.7, "timestamp": "2026-02-26T14:28:00Z" },
    { "sensor_id": "sensor-alpha-07", "temperature": 80.3, "timestamp": "2026-02-26T14:15:00Z" }
  ]
}
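Because the response includes the generated query alongside the rows, a client can sanity-check the translation before relying on the answer. The sketch below assumes the response shape shown above; `review_nl_response` is a hypothetical helper, and its `SELECT` prefix test is a deliberately naive check, not a SQL parser.

```python
# Response copied from the example above.
nl_response = {
    "answer": "3 sensors in building 7 reported temperatures above 80\u00b0F in the last hour.",
    "generated_query": (
        "SELECT sensor_id, temperature, timestamp FROM operations_db.sensor_readings "
        "WHERE location = 'building-7' AND temperature > 80.0 "
        "AND timestamp >= NOW() - INTERVAL '1 hour'"
    ),
    "results": [
        {"sensor_id": "sensor-alpha-01", "temperature": 82.1, "timestamp": "2026-02-26T14:32:00Z"},
        {"sensor_id": "sensor-alpha-04", "temperature": 85.7, "timestamp": "2026-02-26T14:28:00Z"},
        {"sensor_id": "sensor-alpha-07", "temperature": 80.3, "timestamp": "2026-02-26T14:15:00Z"},
    ],
}


def review_nl_response(response: dict, min_temperature: float) -> dict:
    """Summarize a natural-language response and cross-check rows against the question."""
    rows = response.get("results", [])
    query = response.get("generated_query", "")
    hottest = max(rows, key=lambda row: row["temperature"]) if rows else None
    return {
        "row_count": len(rows),
        # Naive read-only check: the generated statement starts with SELECT.
        "query_is_select": query.lstrip().upper().startswith("SELECT"),
        # Every returned row should satisfy the threshold the question asked about.
        "rows_over_threshold": all(row["temperature"] > min_temperature for row in rows),
        "hottest_sensor": hottest["sensor_id"] if hottest else None,
    }


review = review_nl_response(nl_response, min_temperature=80.0)
print(review)
```

If `rows_over_threshold` is false or the generated query targets an unexpected table, that is a signal to rephrase the question or fall back to writing the SQL by hand.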
Step 6: Asynchronous Analytics
Submit a long-running analytical query and retrieve results asynchronously.
# Submit the query
curl -X POST https://sdl.example.com/api/v1/query/async \
  -H "Content-Type: application/json" \
  -d '{
    "query": "SELECT location, AVG(temperature) as avg_temp, MAX(temperature) as max_temp, COUNT(*) as reading_count FROM operations_db.sensor_readings GROUP BY location ORDER BY avg_temp DESC"
  }'
Example response:
{
  "query_id": "q-abc123-def456",
  "status": "RUNNING",
  "submitted_at": "2026-02-26T14:35:00Z"
}
Poll for completion and retrieve results:
# Check query status
curl -s https://sdl.example.com/api/v1/query/async/q-abc123-def456/status | jq .
# Retrieve results once status is COMPLETED
curl -s https://sdl.example.com/api/v1/query/async/q-abc123-def456/results | jq .
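In a script, the status check is usually wrapped in a polling loop with a growing delay. The sketch below is one way to do that, under stated assumptions: only the `RUNNING` and `COMPLETED` statuses appear in this guide, so treating any other status as a failure is a guess, and `wait_for_query` with its parameters is a hypothetical helper. The status fetcher is passed in as a function so the loop can be exercised without a live deployment.

```python
import time


def wait_for_query(fetch_status, query_id, interval_s=2.0, max_interval_s=30.0,
                   timeout_s=600.0, sleep=time.sleep):
    """Poll fetch_status(query_id) until the query reports COMPLETED.

    fetch_status must return a dict with a "status" key, like the
    /query/async/<query_id>/status response in this guide.
    """
    deadline = time.monotonic() + timeout_s
    delay = interval_s
    while True:
        status = fetch_status(query_id)["status"]
        if status == "COMPLETED":
            return status
        if status != "RUNNING":
            # Assumption: anything other than RUNNING or COMPLETED means failure.
            raise RuntimeError(f"query {query_id} ended with status {status}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"query {query_id} still running after {timeout_s}s")
        sleep(delay)
        delay = min(delay * 2, max_interval_s)  # exponential backoff with a cap


# Stand-in for the real status endpoint: RUNNING twice, then COMPLETED.
_fake_statuses = iter(["RUNNING", "RUNNING", "COMPLETED"])


def fake_fetch_status(query_id):
    return {"query_id": query_id, "status": next(_fake_statuses)}


waits = []  # record the requested delays instead of actually sleeping
final_status = wait_for_query(fake_fetch_status, "q-abc123-def456", sleep=waits.append)
print(final_status, waits)
```

Once the loop returns, fetch the `/results` endpoint as in the curl example. Doubling the delay keeps short queries responsive while avoiding a flood of status calls for long ones.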
Expected Results
| Step | Expected Behavior |
|---|---|
| Step 1: Identify Data Sources | All registered data sources are listed with their names, types, and descriptions. |
| Step 2: Simple Federated SQL Query | Results are returned from the target data source in a unified JSON format. |
| Step 3: Cross-Source Join | Data from the relational database and streaming topic is joined and returned as a single result set. |
| Step 4: SPARQL Ontology Query | The VKG engine resolves the SPARQL query against the ontology and returns matching entities. |
| Step 5: Natural Language Query | The platform translates the English question into a structured query, executes it, and returns both the answer and the generated query. |
| Step 6: Asynchronous Analytics | The query is accepted and assigned a |