Deployment Size Options for TDP v3.5.x

Four standard EnvironmentSize settings are available for Tetra Data Platform (TDP) version 3.5.x: Small, Medium, Large, and X-Large. Each size setting can process a different number of files. Custom environment sizes are also available on request.

The following T-Shirt Sizing for TDP v3.5.x Environments table shows the approximate number of files that each standard environment size can process. For more information, contact your customer success manager (CSM).

📘

NOTE

TDP v3.5.x includes significant performance increases from previous TDP versions. To explore optimizing your TDP deployment to improve its performance further, contact your CSM.

T-Shirt Sizing for TDP v3.5.x Environments

| Key Performance Indicator (KPI) | Definition | Small* | Medium* | Large* | X-Large |
| --- | --- | --- | --- | --- | --- |
| File capacity | The maximum file count that has been validated for each TDP environment size. File counts include all file types and versions stored in Amazon Simple Storage Service (Amazon S3). | ~5 million files | ~10 million files | ~20 million files | ~100 million files |
| Concurrent workflows | The number of workflows that can be run in parallel during a given amount of time. Note: There's a one- to two-hour ramp-up period for workflows to reach their peak state. | 240 workflows | 480 workflows | 960 workflows** | 960 workflows** |
| File registration rate per hour | The rate at which the TDP can process files each hour. | ~60K files | ~90K files | ~130K files | ~180K files |

* Custom environment sizes are available based on specific workloads. Contact your CSM to review your platform configurations.

** Your virtual private cloud (VPC) must have enough IPs to spin up the required number of containers.
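
As a rough illustration of how the sizing table can be read, the following Python sketch picks the smallest standard size whose three KPIs all cover a given workload. The KPI values are copied from the table; the `EnvironmentSize` class, the `smallest_fitting_size` function, and the sample workload are hypothetical names and numbers introduced only for this example and are not part of the TDP. Treat the result as a starting point for a conversation with your CSM, not as a sizing decision.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class EnvironmentSize:
    """One row of the sizing table (hypothetical helper type)."""
    name: str
    max_files: int             # approximate validated file capacity
    concurrent_workflows: int  # workflows that can run in parallel
    files_per_hour: int        # approximate file registration rate


# Approximate values from the T-shirt sizing table, smallest to largest.
SIZES = [
    EnvironmentSize("Small", 5_000_000, 240, 60_000),
    EnvironmentSize("Medium", 10_000_000, 480, 90_000),
    EnvironmentSize("Large", 20_000_000, 960, 130_000),
    EnvironmentSize("X-Large", 100_000_000, 960, 180_000),
]


def smallest_fitting_size(
    total_files: int,
    peak_workflows: int,
    peak_files_per_hour: int,
) -> Optional[EnvironmentSize]:
    """Return the smallest standard size that covers all three KPIs.

    Returns None when no standard size fits, which suggests asking
    about a custom environment size.
    """
    for size in SIZES:
        if (
            total_files <= size.max_files
            and peak_workflows <= size.concurrent_workflows
            and peak_files_per_hour <= size.files_per_hour
        ):
            return size
    return None


if __name__ == "__main__":
    # Hypothetical workload: 8 million files, 300 parallel workflows,
    # and 75,000 new files per hour at peak.
    choice = smallest_fitting_size(8_000_000, 300, 75_000)
    print(choice.name if choice else "No standard size fits; contact your CSM")
    # prints "Medium"
```

Because Large and X-Large share the same concurrent workflow limit, a workload that needs more than 960 parallel workflows falls outside every standard size in this sketch and returns `None`.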

Performance Testing

To create the TDP v3.5.x environment-sizing estimates, 100,000 Empower RAW JSON files (5 MB each) were processed by using the common/empower-raw-to-ids:v3.10.3 protocol.

Performance tests were designed to measure the TDP's KPIs independent of file size and the complexity of any specific task script. This approach was chosen because workflow completion time can vary based on file size and task script definitions.

We selected the Empower to IDS generation pipeline to validate performance because it's one of the most commonly used pipelines. The runtime of the test pipeline was controlled by limiting the size of the intermediate data schema (IDS) files.

📘

NOTE

KPIs for specific workflows will vary based on the number of steps in each associated task script and the runtime of each step.
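
To see why step count and step runtime matter, consider a deliberately simplified steady-state model: if every concurrent workflow slot stays busy, the number of workflows completed per hour is roughly the number of slots multiplied by 3600 and divided by the total runtime of one workflow in seconds. The Python sketch below applies that model. It is not a formula provided by the TDP. The `workflows_per_hour` function and the step runtimes are hypothetical, and the model ignores the ramp-up period, queueing, and scheduling overhead.

```python
from typing import Sequence


def workflows_per_hour(
    concurrent_workflows: int,
    step_runtimes_s: Sequence[float],
) -> float:
    """Estimate completed workflows per hour under a simple model.

    Assumes every concurrent slot is always busy and that one workflow
    takes the sum of its step runtimes (in seconds). Both assumptions
    are simplifications made for this example.
    """
    total_runtime_s = sum(step_runtimes_s)
    if total_runtime_s <= 0:
        raise ValueError("total step runtime must be positive")
    return concurrent_workflows * 3600 / total_runtime_s


if __name__ == "__main__":
    # Hypothetical task script with three steps (30 s, 60 s, 30 s)
    # on an environment that runs 240 workflows in parallel.
    print(workflows_per_hour(240, [30, 60, 30]))       # prints 7200.0

    # Adding a fourth 120 s step doubles the workflow runtime,
    # so the estimated throughput is halved.
    print(workflows_per_hour(240, [30, 60, 30, 120]))  # prints 3600.0
```

Even in this idealized form, the estimate shows that the same environment size can yield very different throughput depending on the task script, which is why the published KPIs are approximate.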