Trino — Federated SQL Engine
SDL leverages a distributed SQL query engine to federate queries across the data lake, relational databases, streaming topics, and other data sources.
Trino is required to be deployed with HTTPS enabled as well as JWT as the authentication type. == Connecting to Trino via API 1. From the SDL UI, navigate to the SDL OpenAPI spec and select the Lakehouse link. 2. Once on the Scalar API reference page, click on Authentication 3. You have an option of providing a username and password or JWT for authentication. Authentication will be handled by Keycloak and authorization will be handled by OPA policies. 4. Execute any of the APIs listed available for Lakehouse. == Connecting to Trino via CLI . Requirements to connect via CLI is that you should be an admin and should have updated the /etc/hosts for trino.localhost or any DNS used other than localhost to correctly map the subdomain to the correct local or VM IP address. . SSH onto Trino Coordinator. . Example connecting from CLI
+
trino --server trino.localhost:443 --user=admin --insecure --access-token=<token> --catalog tpch
-
CLI connection template
trino --server trino.<DNS>:443 --user=<username> --insecure --access-token=<token> --catalog <catalog name>
Connecting to Trino via JDBC
Example using Python sqlalchemy. User will need to update DNS and catalog.
-
Pre-requirements: user must have JWT and OPA policy access.
-
Launch python 3 in a terminal.
-
Execute the following.
-
Connection example
import sqlalchemy as db from sqlalchemy.sql.expression import select, text from trino.auth import JWTAuthentication engine = db.create_engine('trino://admin@trino.localhost:443/tpch',connect_args={ "auth": JWTAuthentication("<token>"), "http_scheme":"https", "verify": False}) connection = engine.connect() rows = connection.execute(text('select * from sf1.region')).fetchall() print(rows) -
Connection template
import sqlalchemy as db from sqlalchemy.sql.expression import select, text from trino.auth import JWTAuthentication engine = db.create_engine('trino://<user>@trino.<domain>:443/<catalog name>',connect_args={ "auth": JWTAuthentication("<token>"), "http_scheme":"https", "verify": False}) connection = engine.connect() rows = connection.execute(text('<query>')).fetchall() print(rows)