Deployment Size Options for TDP v4.0.x

Four standard EnvironmentSize settings are available for Tetra Data Platform (TDP) versions 4.0.x: Small, Medium, Large, and X-Large. Each size setting can process a different number of files. Custom environment sizes are also available on request.

The following T-Shirt Sizing for TDP v4.0.x Environments table shows the approximate file capacity and throughput that each standard environment size supports. For more information, contact your customer success manager (CSM).

📘
NOTE
To explore optimizing your TDP deployment to improve its performance further, contact your CSM.

T-Shirt Sizing for TDP v4.0.x Environments

Key Performance Indicator (KPI)	Definition	Small*	Medium*	Large*	X-Large
File capacity	The maximum file count that has been validated for each TDP environment size. File counts include all file types and versions stored in Amazon Simple Storage Service (Amazon S3).	~20 million files	~100 million files	~200 million files	~500 million files
Concurrent workflows	The number of workflows that can run in parallel at a given time. Note: There's a one- to two-hour ramp-up period for workflows to reach their peak state.	600 workflows	1,000 workflows	2,000 workflows**	2,000 workflows**
File registration rate per hour	The rate at which files can be processed by the TDP each hour.	~100,000 files	~300,000 files	~500,000 files	~700,000 files
Workflow creation rate per hour	The rate at which trigger conditions are checked and respective workflows are created by the TDP each hour.	~100,000 workflows	~200,000 workflows	~350,000 workflows	~350,000 workflows
`SearchEql` API request rate per minute	The number of API requests the `/searchEql` endpoint can handle each minute. Note: Results are based on a 5.7 MB response size.	~100 requests	~150 requests	~380 requests	~500 requests

* Custom environment sizes are available based on specific workloads. Contact your CSM to review your platform configurations.

** Your virtual private cloud (VPC) must have enough IPs to spin up the required number of containers.
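
As a rough way to read the table, the following Python sketch picks the smallest standard size whose published limits cover a set of workload requirements. It is only an illustration: the `SizeLimits` class and the `smallest_fitting_size` function are hypothetical names, not part of the TDP, and the numbers are the approximate values copied from the table above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SizeLimits:
    """Approximate limits for one standard environment size (hypothetical helper)."""
    name: str
    file_capacity: int
    concurrent_workflows: int
    registrations_per_hour: int
    workflows_created_per_hour: int
    search_eql_requests_per_minute: int


# Values copied from the sizing table, ordered from smallest to largest.
SIZES = [
    SizeLimits("Small", 20_000_000, 600, 100_000, 100_000, 100),
    SizeLimits("Medium", 100_000_000, 1_000, 300_000, 200_000, 150),
    SizeLimits("Large", 200_000_000, 2_000, 500_000, 350_000, 380),
    SizeLimits("X-Large", 500_000_000, 2_000, 700_000, 350_000, 500),
]


def smallest_fitting_size(
    files: int = 0,
    concurrent_workflows: int = 0,
    registrations_per_hour: int = 0,
    workflows_created_per_hour: int = 0,
    search_eql_requests_per_minute: int = 0,
) -> Optional[str]:
    """Return the first size whose limits cover every requirement, or None."""
    for size in SIZES:
        if (
            files <= size.file_capacity
            and concurrent_workflows <= size.concurrent_workflows
            and registrations_per_hour <= size.registrations_per_hour
            and workflows_created_per_hour <= size.workflows_created_per_hour
            and search_eql_requests_per_minute <= size.search_eql_requests_per_minute
        ):
            return size.name
    # No standard size covers the workload; a custom size would be needed.
    return None
```

For example, `smallest_fitting_size(files=50_000_000)` selects Medium, because 50 million files exceeds the ~20 million validated for Small. A result of `None` means that the workload is beyond every standard size, which is the case to discuss with your CSM.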

Performance Testing

To create the TDP v4.0.x environment-sizing estimates, 100,000 Empower RAW JSON files (5 MB each) were tested by using the `common/empower-raw-to-ids:v3.10.3` protocol.

Performance tests were designed to measure the TDP's KPIs independent of file size and the complexity of any specific task script. This approach was chosen because workflow completion time can vary based on file size and task script definitions.

We selected the Empower to IDS generation pipeline to validate performance because it's one of the most commonly used pipelines. The runtime of the test pipeline was controlled by limiting the size of the intermediate data schema (IDS) files.

📘
NOTE
KPIs for specific workflows will vary based on the number of steps in each associated task script and the runtime of each step.

`/SearchEql` API Best Practices for Increasing Performance

To help optimize performance when using the `/searchEql` endpoint, do the following:

- Don’t use a wildcard prefix (*) in searches. Queries that start with a wildcard are less efficient, take longer to run, and require more computing resources. For more information, see Wildcard Searches.
- Don’t fetch all fields when querying a large number of files.
- Keep response sizes near 5 MB each. To help reduce response sizes, do the following:
  - Fetch only the fields that you need.
  - Include the FileCategory and IDS Version in your query parameters.
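
The practices above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the documented request format: it assumes that the endpoint accepts an Elasticsearch-style JSON body, and the field names `category` and `idsVersion`, the helper names, and the byte threshold are all hypothetical. Check the API reference for the real parameter names before using anything like this.

```python
import json

# Assumed reading of the "near 5 MB" guideline, expressed in bytes.
MAX_RESPONSE_BYTES = 5 * 1024 * 1024


def build_query(file_category, ids_version, fields, size=100):
    """Build a search body that filters on category and IDS version and
    returns only the requested fields.

    "category" and "idsVersion" are hypothetical field names.
    """
    return {
        "size": size,
        "_source": list(fields),  # fetch limited fields only
        "query": {
            "bool": {
                "filter": [
                    {"term": {"category": file_category}},
                    {"term": {"idsVersion": ids_version}},
                ]
            }
        },
    }


def has_leading_wildcard(node):
    """Return True if any string value in the query body starts with '*'."""
    if isinstance(node, str):
        return node.startswith("*")
    if isinstance(node, dict):
        return any(has_leading_wildcard(value) for value in node.values())
    if isinstance(node, (list, tuple)):
        return any(has_leading_wildcard(item) for item in node)
    return False


def response_size_bytes(response):
    """Return the size of a decoded JSON response when serialized as UTF-8."""
    return len(json.dumps(response).encode("utf-8"))
```

A caller could reject any body for which `has_leading_wildcard` returns `True` before sending it, and reduce `size` or the field list whenever `response_size_bytes` of a result exceeds `MAX_RESPONSE_BYTES`.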
