TDP v4.3.0 Release Notes
Release date: 28 May 2025
TetraScience has released its next version of the Tetra Data Platform (TDP), version 4.3.0. This release makes the new Data Lakehouse Architecture generally available. The new, customer-controlled data storage and management architecture provides customers 50% to 300% faster SQL query performance and an AI/ML-ready data storage format that operates seamlessly across all major data and cloud platform vendors.
This release also introduces lower latency pipelines, makes Snowflake data sharing generally available, adds AWS WAF rule support, and includes other functional and performance improvements.
Here are the details for what’s new in TDP v4.3.0.
New Functionality
New functionalities are features that weren’t previously available in the TDP.
GxP Impact Assessment
All new TDP functionalities go through a GxP impact assessment to determine validation needs for GxP installations.
New Functionality items marked with an asterisk (*) address usability, supportability, or infrastructure issues, and do not affect Intended Use for validation purposes, per this assessment.
Enhancements and Bug Fixes do not generally affect Intended Use for validation purposes.
Items marked as either beta release or early adopter program (EAP) are not validated for GxP by TetraScience. However, customers can use these prerelease features and components in production if they perform their own validation.
Data Access and Management New Functionality
Data Lakehouse Architecture is Now Generally Available
Previously available as part of an early adopter program (EAP), the Data Lakehouse Architecture is now generally available to all customers. The new data storage and management architecture provides 50% to 300% faster SQL query performance and an AI/ML-ready data storage format that operates seamlessly across all major data and cloud platform vendors.
Key Benefits
- Share data directly with any Databricks and Snowflake account
- Run SQL queries faster and more efficiently
- Create AI/ML-ready datasets while reducing data preparation time
- Reduce data storage costs
- Configure Tetraflow pipelines to read multiple data sources and run at specific times
What’s New in TDP v4.3.0
- Customers can convert their data to Lakehouse tables themselves: Customers can now convert their own data into Lakehouse tables by using the `ids-to-lakehouse` protocol in Tetra Data Pipelines. Previously, converting data into Lakehouse tables needed to be done in coordination with customer engineering support.
- Faster, more predictable, and lower-cost data latency at scale: With lower compute cost, customers can now consistently convert their Intermediate Data Schema (IDS) data into Lakehouse tables in about 20 minutes. Previously, converting data could take up to 45 minutes. More infrastructure improvements that will continue to reduce data latency are planned for future releases.
- Improved schema structure: Lakehouse tables are now written to nested, IDS-specific schemas. The new nested schema structure makes it so that customers no longer have to navigate through hundreds of potential tables under one schema.
- Snowflake query performance parity: Lakehouse tables are now written to be interoperable with the Iceberg format. This ensures that Snowflake data sharing provides the same query performance as running queries against Lakehouse tables in the TDP.
How It Works
A data lakehouse is an open data management architecture that combines the benefits of both data lakes (cost-efficiency and scale) and data warehouses (management and transactions) to enable analytics and AI/ML on all data. It is a highly scalable and performant data storage architecture that breaks data silos and allows seamless, secure data access to authorized users.
TetraScience has adopted the ubiquitous Delta table format to progressively refine data into cleaned and harmonized datasets, while empowering customers to create curated datasets as needed. This process is referred to as the “Medallion” architecture, which is outlined in the Databricks documentation.
For more information, see the Data Lakehouse Architecture documentation.

IMPORTANT
When upgrading to TDP v4.3.0, keep in mind the following:
- Customers taking part in the EAP version of the Data Lakehouse will need to reprocess their existing Lakehouse data to use the new GA version of the Lakehouse tables. To backfill historical data into the updated Lakehouse tables, customers should create a Bulk Pipeline Process Job that uses an `ids-to-lakehouse` pipeline and is scoped in the Date Range field by the appropriate IDSs and historical time ranges.
- For single-tenant, customer-hosted deployments to use the Data Lakehouse Architecture, customers must still work with TetraScience to enable the TetraScience Databricks integration.
For more information, contact your customer success manager (CSM).
Snowflake Data Sharing is Now Generally Available*
Previously available as part of an early adopter program (EAP), the ability to access Tetra Data through Snowflake is now generally available to all customers.
To access Tetra Data through Snowflake, data sharing must be set up in coordination with TetraScience. A Business Critical Edition Snowflake account is also required in the same AWS Region as your TDP deployment.
For more information, see Use Snowflake to Access Tetra Data.
TDP System Administration New Functionality
Optional Support for AWS WAF in Customer-Hosted Environments*
Customers hosting the TDP in their own environment now have the option to add an AWS WAF (Web Application Firewall) component in front of the public-facing Application Load Balancer (ALB). This is an optional security enhancement and requires specific rule group exceptions to allow seamless operation of the platform’s APIs.
For more information, see AWS WAF Rule Exceptions.
Enhancements
Enhancements are modifications to existing functionality that improve performance or usability, but don't alter the function or intended use of the system.
Data Integrations Enhancements
Configuration Files for Tetra File-Log Agents Are Now More Comprehensive
Tetra File-Log Agent configuration files downloaded from the TDP’s Agents page now contain all of an Agent’s configuration details. These new details include proxy information, Advanced Settings, S3 Direct Upload settings, command queue settings, path configurations, and more. Configuration files for other Agent types that support the Download Configuration feature already include this information.
For more information, see Download Agent Configuration Settings.
SQL Tables Now Include Agent Details
A new `{orgslug}_tss_system.agents` SQL table now provides the following information about Tetra Agents in Amazon Athena:
- `name` (string)
- `id` (string)
- `is_enabled` (boolean)
- `type` (string)
- `org_slug` (string)
This update makes these Agent details available to the new Health Monitoring App.
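For illustration, a minimal Athena query against the new table might look like the following sketch. The `acme` org slug is a hypothetical placeholder; substitute your organization's slug.

```sql
-- Hypothetical sketch: list enabled Tetra Agents for an organization.
-- "acme" is a placeholder org slug; replace it with your own.
SELECT
  name,
  id,
  type,
  org_slug
FROM acme_tss_system.agents
WHERE is_enabled = true
ORDER BY name;
```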
Agent Command Queues Are Now Enabled in the TDP by Default
To make it easier to communicate with on-premises Tetra Agents programmatically, command queues for all new Agents are now enabled in the TDP by default.
To start using the feature, customers now only need to enable the Receive Commands setting in the local Agent Management Console when installing an Agent. Customers can still turn off an Agent’s command queue at any time.
For more information, see Command Service.
Data Harmonization and Engineering Enhancements
Lower Latency Pipelines
To help support latency-sensitive lab data automation use cases, customers can now select a new Instant Start pipeline execution mode when configuring Python-based Tetra Data Pipelines.
This new compute type offers ~1-second startup time after the initial deployment, regardless of how recently the pipeline was run. Customers can also select from a range of instance sizes in this new compute class to handle data at the scale their use case requires. There’s little to no additional cost impact.
For more information, see Memory and Compute Settings.

Instant Start pipeline memory setting
Increased Bulk Label Operations Limit
Customers can now run up to 200 operations when editing labels in bulk. Previously, only five operations were allowed for each bulk label change job.
For more information, see Edit Labels in Bulk.
Data Access and Management Enhancements
New optimizerResult Field for /searchEql API Endpoint Responses
A new, optional `optimizerResult` field appears as part of the `/searchEql` API endpoint response. The field displays an optimized query and index string that customers can choose whether or not to use. Customers' original queries are not modified; the new field provides query optimization suggestions only.
For more information, see the Search files via Elasticsearch Query Language API documentation.
Data App Configurations Now Persist When Upgrading to New App Versions
Any changes made in a Tetra Data App configuration now persist when customers upgrade to a new app version. This enhancement is enabled by the addition of Amazon Elastic File System (Amazon EFS) storage. Amazon EFS offers persistent storage for configurations, so any changes made are not lost when upgrading Data App versions.
Amazon EFS is not enabled by default. Only data apps that explicitly request storage support now have Amazon EFS storage.
For more information, see AWS Services.
TDP System Administration Enhancements
Automatic Notifications for Service User Tokens
System administrators now have the option to create automated email notifications to inform specific users when one of their organization’s Service User JSON Web Tokens (JWTs) is about to expire.
For more information, see Add a Service User and Edit a Service User.
New Description Field for Service Users
System administrators now have the option to enter a Description when creating or editing Service Users. This field can be used to describe the Service User's purpose.
For more information, see Add a Service User and Edit a Service User.
Beta Release and EAP Feature Enhancements
New Infrastructure to Support Self-Service Connectors EAP
With the upcoming release of the TetraScience Connectors SDKs planned for June 2025, customers will be able to create their own self-service Tetra Connectors as part of an early adopter program (EAP).
TDP v4.3.0 introduces the required infrastructure to support self-service Connectors once the new Connector SDKs are available.
Improved Accuracy for Data Retention Policies
Files deleted by Data Retention Policies are now deleted based on when they became available for search in the TDP (`uploaded_at`), rather than when they were initially uploaded (`inserted_at`).
This update resolves an issue where some files were not deleted as expected because of the delay between when files are uploaded and when they’re indexed for search.
Infrastructure Updates
The following is a summary of the TDP infrastructure changes made in this release. For more information about specific resources, contact your CSM or account manager.
New Resources
- AWS services added: 0
- AWS Identity and Access Management (IAM) roles added: 12
- IAM policies added: 12 (inline policies within the roles)
- AWS managed policies added: 1 (AWSLambdaBasicExecutionRole)
Removed Resources
- IAM roles removed: 3
- IAM policies removed: 3 (inline policies within removed roles)
- AWS managed policies removed: 0
New VPC Endpoints for Customer-Hosted Deployments
For customer-hosted environments, if the private subnets where the TDP is deployed are restricted and don't have outbound access to the internet, then the VPC now needs the following AWS Interface VPC endpoints enabled:
- `com.amazonaws.<REGION>.servicecatalog`: enables AWS Service Catalog, which is used to create and manage catalogs of IT services that are approved for AWS
- `com.amazonaws.<REGION>.states`: enables AWS Step Functions workflows (State machines), which are used to automate processes and orchestrate microservices
- `com.amazonaws.<REGION>.tagging`: enables tagging for AWS resources, which hold metadata about each resource
These new endpoints must be enabled along with the other required VPC endpoints.
For more information, see VPC Endpoints.
Bug Fixes
The following bugs are now fixed.
Data Access and Management Bug Fixes
- Files with fields that end with a backslash (`\`) escape character can now be uploaded to the Data Lakehouse.
Data Harmonization and Engineering Bug Fixes
- On the Pipeline Edit page, in the Retry Behavior field, the `Always retry 3 times (default)` value no longer appears as a null value.
TDP System Administration Bug Fixes
- Error messages that appear when an Admin tries to generate SQL credentials for a TDP user that doesn’t have SQL Search permissions now indicate what’s causing the error.
Deprecated Features
There are no new deprecated features in this release.
For more information about TDP deprecations, see Tetra Product Deprecation Notices.
Known and Possible Issues
The following are known and possible issues for TDP v4.3.0.
Data Harmonization and Engineering Known Issues
- IDS files larger than 2 GB are not indexed for search.
- The Chromeleon IDS (thermofisher_chromeleon) v6 Lakehouse tables aren't accessible through Snowflake Data Sharing. There are more subcolumns in the table's `method` column than Snowflake allows, so Snowflake doesn't index the table. A fix for this issue is in development and testing and is scheduled for a future release.
- SQL queries run against the Lakehouse tables generated from Cytiva AKTA IDS (`akta`) SQL tables aren't backwards compatible with the legacy Amazon Athena SQL table queries. When `akta` IDS SQL tables are converted into Lakehouse tables, the following table and column names are updated:
  - The source `akta_v_run` Amazon Athena SQL table is replaced with an `akta_v_root` Lakehouse table.
  - The `akta_v_run.time` column in the source Amazon Athena SQL tables is renamed to `akta_v_root.run_time`.
  - The `akta_v_run.note` column in the source Amazon Athena SQL tables is renamed to `akta_v_root.run_note`.

  These updated table and column names must be added to any queries run against the new `akta` IDS Lakehouse tables (see the example query after this list). A fix for this issue is in development and testing and is scheduled for a future release.
- Empty values in Amazon Athena SQL tables display as `NULL` values in Lakehouse tables.
- File statuses on the File Processing page can sometimes display differently than the statuses shown for the same files on the Pipelines page in the Bulk Processing Job Details dialog. For example, a file with an `Awaiting Processing` status in the Bulk Processing Job Details dialog can also show a `Processing` status on the File Processing page. This discrepancy occurs because each file can have different statuses for different backend services, which can then be surfaced in the TDP at different levels of granularity. A fix for this issue is in development and testing.
- Logs don't appear for pipeline workflows that are configured with retry settings until the workflows complete.
- Files with more than 20 associated documents (high-lineage files) do not have their lineage indexed by default. To identify these high-lineage files and index their lineage, customers must contact their CSM to run a separate reconciliation job that overrides the default lineage indexing limit.
- OpenSearch index mapping conflicts can occur when a client or private namespace creates a backwards-incompatible data type change. For example, if `doc.myField` is a string in the common IDS and an object in the non-common IDS, it will cause an index mapping conflict, because the common and non-common namespace documents share an index. When these mapping conflicts occur, the files aren't searchable through the TDP UI or API endpoints. As a workaround, customers can either create distinct, non-overlapping version numbers for their non-common IDSs or update the names of those IDSs.
- File reprocessing jobs can sometimes show fewer scanned items than expected when either a health check or out-of-memory (OOM) error occurs, but not indicate any errors in the UI. These errors are still logged in Amazon CloudWatch Logs. A fix for this issue is in development and testing.
- File reprocessing jobs can sometimes incorrectly show that a job finished with failures when the job actually retried those failures and then successfully reprocessed them. A fix for this issue is in development and testing.
- File edit and update operations are not supported on metadata and label names (keys) that include special characters. Metadata, tag, and label values can include special characters, but it’s recommended that customers use the approved special characters only. For more information, see Attributes.
- The File Details page sometimes displays an Unknown status for workflows that are either in a Pending or Running status. Output files that are generated by intermediate files within a task script sometimes show an Unknown status, too.
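As a sketch of the query changes described in the Cytiva AKTA item above, a legacy Athena query and its Lakehouse equivalent might look like the following. Schema qualifiers are omitted for brevity, and only the renamed table and columns listed above are assumed.

```sql
-- Legacy query against the source Amazon Athena SQL table
SELECT time, note
FROM akta_v_run;

-- Equivalent query against the new akta IDS Lakehouse table,
-- using the renamed table and columns listed above
SELECT run_time, run_note
FROM akta_v_root;
```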
Data Access and Management Known Issues
- The Tetra Data & AI Workspace doesn’t load for users that have only Data User policy permissions. A fix for this issue is in development and testing and is scheduled for a future release.
- When customers upload a new file on the Search page by using the Upload File button, the page doesn’t automatically update to include the new file in the search results. As a workaround, customers should refresh the Search page in their web browser after selecting the Upload File button. A fix for this issue is in development and testing and is scheduled for a future TDP release.
- Values returned as empty strings when running SQL queries on SQL tables can sometimes return `Null` values when run on Lakehouse tables. As a workaround, customers taking part in the Data Lakehouse Architecture EAP should update any SQL queries that specifically look for empty strings to instead look for both empty string and `Null` values (see the example query after this list).
- The Tetra FlowJo Data App doesn't load consistently in all customer environments.
- Query DSL queries run on indices in an OpenSearch cluster can return partial search results if the query puts too much compute load on the system. This behavior occurs because the OpenSearch `search.default_allow_partial_result` setting is configured as `true` by default. To help avoid this issue, customers should use targeted search indexing best practices to reduce query compute loads. A way to improve visibility into when partial search results are returned is currently in development and testing and scheduled for a future TDP release.
- Text within the context of a RAW file that contains escape (`\`) or other special characters may not always index completely in OpenSearch. A fix for this issue is in development and testing, and is scheduled for an upcoming release.
- If a data access rule is configured as [label] exists > OR > [same label] does not exist, then no file with the defined label is accessible to the Access Group. A fix for this issue is in development and testing and scheduled for a future TDP release.
- File events aren’t created for temporary (TMP) files, so they’re not searchable. This behavior can also result in an Unknown state for Workflow and Pipeline views on the File Details page.
- When customers search for labels in the TDP UI’s search bar that include either @ symbols or some unicode character combinations, not all results are always returned.
- The File Details page displays a `404` error if a file version doesn't comply with the configured Data Access Rules for the user.
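As a sketch of the empty-string workaround noted in this list, a filter written for the Amazon Athena SQL tables can be broadened to also match values that surface as NULL in Lakehouse tables. The `results` table and `comment` column below are hypothetical placeholders.

```sql
-- Before (Athena SQL tables): matches only empty strings
SELECT *
FROM results
WHERE comment = '';

-- After (Lakehouse tables): also matches values that surface as NULL
SELECT *
FROM results
WHERE comment = '' OR comment IS NULL;
```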
TDP System Administration Known Issues
- The latest Connector versions incorrectly log the following errors in Amazon CloudWatch Logs:
Error loading organization certificates. Initialization will continue, but untrusted SSL connections will fail.
Client is not initialized - certificate array will be empty
These organization certificate errors have no impact and shouldn’t be logged as errors. A fix for this issue is currently in development and testing, and is scheduled for an upcoming release. There is no workaround to prevent Connectors from producing these log messages. To filter out these errors when viewing logs, customers can apply the following CloudWatch Logs Insights query filters when querying log groups. (Issue #2818)
CloudWatch Logs Insights Query Example for Filtering Organization Certificate Errors
    fields @timestamp, @message, @logStream, @log
    | filter message != 'Error loading organization certificates. Initialization will continue, but untrusted SSL connections will fail.'
    | filter message != 'Client is not initialized - certificate array will be empty'
    | sort @timestamp desc
    | limit 20
- If a reconciliation job, bulk edit of labels job, or bulk pipeline processing job is canceled, then the job’s ToDo, Failed, and Completed counts can sometimes display incorrectly.
Upgrade Considerations
During the upgrade, there might be a brief period of downtime during which users won't be able to access the TDP user interface and APIs.
After the upgrade, the TetraScience team verifies that the platform infrastructure is working as expected through a combination of manual and automated tests. If any failures are detected, the issues are immediately addressed, or the release can be rolled back. Customers can also verify that TDP search functionality continues to return expected results, and that their workflows continue to run as expected.
For more information about the release schedule, including the GxP release schedule and timelines, see the Product Release Schedule.
For more details on upgrade timing, customers should contact their CSM.
Upgrading from the EAP version of the Data Lakehouse
Customers taking part in the EAP version of the Data Lakehouse will need to reprocess their existing Lakehouse data to use the new GA version of the Lakehouse tables. To backfill historical data into the updated Lakehouse tables, customers should create a Bulk Pipeline Process Job that uses an `ids-to-delta` pipeline and is scoped in the Date Range field by the appropriate IDSs and historical time ranges.
Security
TetraScience continually monitors and tests the TDP codebase to identify potential security issues. Various security updates are applied to the following areas on an ongoing basis:
- Operating systems
- Third-party libraries
Quality Management
TetraScience is committed to creating quality software. Software is developed and tested following the ISO 9001-certified TetraScience Quality Management System. This system ensures the quality and reliability of TetraScience software while maintaining data integrity and confidentiality.
Other Release Notes
To view other TDP release notes, see Tetra Data Platform Release Notes.