SQL Queries
You can use SQL and Amazon Athena queries to explore or analyze tabular data across one or more datasets in the Tetra Scientific Data and AI Cloud.
NOTE
If you're taking part in the Data Lakehouse Architecture early adopter program (EAP), you can also use SQL to query Lakehouse tables (Delta Tables), including the Tetra Data Platform (TDP) system table database. To look for specific datasets and corresponding files by using a search engine, see Search.
How SQL Queries Work in the Tetra Scientific Data and AI Cloud
The Tetra Data Platform (TDP) uses Amazon Athena to manage SQL tables and SQL queries. Athena is an interactive query service that makes it easy to analyze data directly in Amazon Simple Storage Service (Amazon S3) by using standard SQL syntax.
Files Available for SQL Queries
Only Tetra Data that goes through the following engineering and harmonization process is available through SQL queries:
- Tetra Integrations collect RAW data and upload it to the Tetra Scientific Data and AI Cloud.
- Data is standardized into Intermediate Data Schemas (IDSs) or Lakehouse tables (Delta Tables).
After your data is harmonized and converted to IDS or Lakehouse table (Delta Table) format, contents are automatically processed to populate databases and tables using a schema created and managed by TetraScience. You can then run standard SQL ad hoc queries against these tables and retrieve results.
Run SQL Queries
There are two ways to query your data stored in Amazon Athena tables or Lakehouse tables (Delta Tables):
- The SQL Search page in the TDP: With this method, no connection details or third-party tool setup is required. After the TDP has been configured for your organization, you are ready to run any standard SQL query.
- A third-party tool (for example, Tableau): This method is useful when you want to query the data and then visualize it for further analysis.
IMPORTANT
If you're using a third-party tool to query Athena tables, you must declare up front which database you want to connect to.
If you declare the database up front, use the table names exactly as they appear in the TDP. If you don't declare the database up front, you must prefix each table name in your query with the database name, in the following format:
org_slug.tablename
For example: demo_uat.lcuv_empower_v2_injection
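The two naming forms described above can be sketched as a small helper that builds a preview query for a third-party tool or script. This is a minimal illustration, not part of the TDP: the function names `qualify_table` and `preview_query` are hypothetical, and the sketch assumes that database and table names contain only lowercase letters, digits, and underscores (other names would need quoting).

```python
import re

# Assumption: identifiers use only lowercase letters, digits, and underscores.
_IDENTIFIER = re.compile(r"^[a-z_][a-z0-9_]*$")


def qualify_table(table_name, org_slug=None):
    """Return the table reference to use in a FROM clause (hypothetical helper)."""
    for name in (table_name, org_slug):
        if name is not None and not _IDENTIFIER.match(name):
            raise ValueError(f"unexpected identifier: {name!r}")
    # Database already declared in the tool's connection settings:
    # use the table name as it appears in the TDP.
    if org_slug is None:
        return table_name
    # No database declared: prefix the table with the database name.
    return f"{org_slug}.{table_name}"


def preview_query(table_name, org_slug=None, limit=10):
    """Build a small ad hoc SELECT for previewing a table's rows."""
    return f"SELECT * FROM {qualify_table(table_name, org_slug)} LIMIT {int(limit)}"


if __name__ == "__main__":
    # Database declared up front in the connection.
    print(preview_query("lcuv_empower_v2_injection"))
    # Database not declared, so it is included in the query.
    print(preview_query("lcuv_empower_v2_injection", org_slug="demo_uat"))
```

The first call yields a query with the bare table name, and the second yields one that references demo_uat.lcuv_empower_v2_injection. Either string can then be pasted into, or sent from, whichever query tool you use.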
Learn More
For more information and best practices, see Data Analytics in the TetraConnect Hub. To request access, see Access the TetraConnect Hub.