Upload a parquet file to Minio and query from Trino

To follow these instructions, you need the Trino CLI installed.
  • Upload the target Parquet (.parquet) file to the corresponding S3 bucket via the UI, CLI, or Inbox.
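
Before uploading, it can help to confirm that the file really is Parquet. The sketch below relies on the fact that Parquet files start and end with the 4-byte magic string PAR1. It is only a quick sanity check, not a full validation. The helper name is_parquet, the file name orders.parquet, and the MinIO client alias myminio are assumptions for illustration.

```shell
# Hypothetical helper: return 0 if FILE starts and ends with the Parquet
# magic bytes "PAR1", non-zero otherwise. Usage: is_parquet FILE
is_parquet() (
    file="$1"
    [ -f "$file" ] || exit 1
    # Strip NUL bytes so the shell does not warn about binary data.
    head_magic="$(head -c 4 "$file" | tr -d '\000')"
    tail_magic="$(tail -c 4 "$file" | tr -d '\000')"
    [ "$head_magic" = "PAR1" ] && [ "$tail_magic" = "PAR1" ]
)

# Example, assuming the MinIO client (mc) is installed and an alias named
# "myminio" was already configured with `mc alias set`:
#   is_parquet orders.parquet && mc cp orders.parquet myminio/inbox-public/
```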

  • Log in via the Trino CLI:

    • An example command for logging in using Trino CLI:

trino --server https://trino.{ base_url_of_data-fabric } --user=admin --access-token={ access_token } --catalog df-hive
If your local instance of SDL does not have proper certificates issued, you can still connect by adding the --insecure flag to the previous command.
To find the correct token, you can use the bash script packaged with the SDL repo, located here: data-fabric/hacks/df_token.sh
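The login command can be assembled from its parts so the optional --insecure flag is easy to toggle. This is a minimal sketch: the function name trino_login_cmd is hypothetical, it assumes the trino executable is on the PATH and that the URL is passed with --server, and the token is supplied by the caller because the output format of df_token.sh is not described here.

```shell
# Hypothetical helper: print the Trino CLI login command.
# Usage: trino_login_cmd BASE_URL ACCESS_TOKEN [insecure]
trino_login_cmd() (
    base_url="$1"
    token="$2"
    insecure="${3:-}"
    cmd="trino --server https://trino.${base_url} --user=admin --access-token=${token} --catalog df-hive"
    # Append --insecure only when explicitly requested (for example, when
    # the local instance has no proper certificates).
    if [ "$insecure" = "insecure" ]; then
        cmd="$cmd --insecure"
    fi
    printf '%s\n' "$cmd"
)
```

The function only prints the command, so it can be reviewed before being pasted into a terminal.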
  • Create a table via Trino CLI:

    • Here is an example table one might use:

CREATE TABLE default.orders
(
    o_orderkey       BIGINT ,
    o_custkey        BIGINT ,
    o_orderstatus    CHAR(1) ,
    o_totalprice     DOUBLE PRECISION ,
    o_orderdate      DATE ,
    o_orderpriority  CHAR(15) ,
    o_clerk          CHAR(15) ,
    o_shippriority   INTEGER ,
    o_comment        VARCHAR(79)
)
WITH ( format = 'PARQUET', external_location = 's3a://inbox-public/' )
;
  • Verify the table was created correctly:

describe default.orders;
  • View the data with a query:

    • Here is an example query one might use:

select * from default.orders limit 5;
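
The verification statements can also be run without an interactive session. The following is a minimal sketch, assuming the table lives in the default schema as above. The helper name verify_sql is hypothetical, and its output is meant to be passed to the Trino CLI's --execute option.

```shell
# Hypothetical helper: print the DESCRIBE and SELECT statements for a table
# in the default schema. Usage: verify_sql TABLE [LIMIT]
verify_sql() (
    table="$1"
    limit="${2:-5}"
    printf 'DESCRIBE default.%s;\n' "$table"
    printf 'SELECT * FROM default.%s LIMIT %s;\n' "$table" "$limit"
)

# Example (server URL and token are placeholders, as in the login step):
#   trino --server https://trino.{ base_url_of_data-fabric } --user=admin \
#     --access-token={ access_token } --catalog df-hive \
#     --execute "$(verify_sql orders)"
```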