AWS Services
The Tetra Data Platform (TDP) is deployed from two AWS CloudFormation stacks, each containing multiple nested stacks. The stacks are packaged, versioned, and made available to clients as AWS Service Catalog products.
Application code is containerized and runs on Amazon Elastic Container Service (Amazon ECS). Some code also runs in AWS Lambda. Data is stored in Amazon Simple Storage Service (Amazon S3). Resource-intensive auxiliary services, such as OpenSearch and PostgreSQL databases, run on dedicated clusters.
VPC Endpoints
If the private subnets where the TDP is deployed are restricted and do not have outbound access to the internet, see VPC Endpoints for important details.
Versions
You can find all software versions within the installation scripts.
Supported AWS Regions
These AWS Regions are supported for deployment:
Region Name | Location |
---|---|
ap-northeast-1 | Asia Pacific (Tokyo) |
eu-central-1 | Europe (Frankfurt) |
eu-west-1 | Europe (Ireland) |
us-east-1 | US East (N. Virginia) |
us-east-2 | US East (Ohio) |
us-west-2 | US West (Oregon) |
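
As a pre-deployment sanity check, the Region list above can be encoded in a small helper. This is a minimal sketch: the names `SUPPORTED_REGIONS` and `is_supported_region` are illustrative assumptions and are not part of the TDP installation scripts.

```python
# Supported TDP deployment Regions, copied from the table above.
SUPPORTED_REGIONS = {
    "ap-northeast-1": "Asia Pacific (Tokyo)",
    "eu-central-1": "Europe (Frankfurt)",
    "eu-west-1": "Europe (Ireland)",
    "us-east-1": "US East (N. Virginia)",
    "us-east-2": "US East (Ohio)",
    "us-west-2": "US West (Oregon)",
}


def is_supported_region(region: str) -> bool:
    """Return True if `region` is one of the supported deployment Regions."""
    # Normalize whitespace and case so " US-EAST-1 " still matches.
    return region.strip().lower() in SUPPORTED_REGIONS


if __name__ == "__main__":
    for candidate in ("us-east-1", "sa-east-1"):
        status = "supported" if is_supported_region(candidate) else "not supported"
        print(f"{candidate}: {status}")
```

Keeping the list in one dictionary means a deployment script can both validate a Region code and print its location name from the same source.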
Upgrades
TetraScience tests all of the AWS components listed in the AWS Services table together internally. Upgrading to a new TDP version also upgrades selected AWS components as needed.
AWS Services
The following AWS services are required to run the TDP.
AWS Service | Description | For More Information |
---|---|---|
Amazon Athena | An interactive query service that allows you to analyze data in Amazon S3 using SQL. In TDP, a set of parsers converts the data into numerical or tabular data useful to data scientists. | Amazon Athena |
Amazon AppStream 2.0 | (For customers activating Tetra Data & AI Workspace only) A fully managed application streaming service that provides users with instant access to their desktop applications from anywhere. | Amazon AppStream 2.0 Administration Guide |
Amazon Cognito | An identity platform for web and mobile apps. It’s a user directory, an authentication server, and an authorization service for OAuth 2.0 access tokens and AWS credentials. | Amazon Cognito Developer Guide |
AWS Auto Scaling | Monitors applications and automatically adjusts capacity to maintain steady, predictable performance. | AWS AutoScaling Page |
AWS CloudFormation | Models a collection of related AWS and third-party resources, and provisions and manages them throughout their lifecycles, by treating infrastructure as code. TDP is deployed on two CloudFormation stacks. | CloudFormation |
AWS CloudTrail | Service that enables governance, compliance, operational auditing, and risk auditing of an AWS account. Logs, continuously monitors, and retains account activity related to actions across a customer's AWS infrastructure. TDP implements active logging and monitoring using CloudTrail and CloudWatch. | CloudTrail |
Amazon CloudWatch | Collects monitoring and operational data in the form of logs, metrics, and events, providing a unified view of AWS resources, applications, and services that run on AWS and on-premises servers. TDP implements active logging and monitoring using CloudTrail and CloudWatch. Audit trail information is saved in CloudWatch. | Amazon CloudWatch |
AWS CodeBuild | Fully managed continuous integration service that compiles source code, runs tests, and produces software packages that are ready to deploy. | CodeBuild |
Amazon Cognito | Amazon Cognito lets you add user sign-up, sign-in, and access control to your web and mobile apps easily. Amazon Cognito scales to millions of users and supports sign-in with social identity providers, such as Apple, Facebook, Google, and Amazon, and enterprise identity providers via SAML 2.0 and OpenID Connect. Only required if using SSO. | Amazon Cognito |
Amazon Elastic Container Registry (ECR) | Fully managed container registry that is used to store, manage, share, and deploy container images and artifacts. Connectors are published to ECR as Alpine Docker containers. | Elastic Container Registry |
Amazon Elastic Container Service (ECS) | Fully managed container orchestration service for deploying, managing, and scaling containerized applications. ECS is used for the platform API services and the web application backend. Each service is defined by a container image in the CloudFormation template. | Elastic Container Service |
Amazon ECS discovery service | (For customers activating Tetra Data & AI Workspace only) Amazon ECS discovery service uses AWS Cloud Map API actions to manage HTTP and DNS namespaces for Amazon ECS services. | Amazon ECS discovery service |
Amazon Elastic Compute Cloud (Amazon EC2) | Secure, resizable compute capacity in the AWS Cloud. Certain parsers that require the Windows platform use ephemeral Amazon EC2 machines. These machines run the file parsers, read the raw data, and write back the parsed files. | Amazon EC2 |
AWS Database Migration Service (DMS) | A cloud service that makes it possible to migrate relational databases, data warehouses, NoSQL databases, and other types of data stores. You can use AWS DMS to migrate your data into the AWS Cloud or between combinations of cloud and on-premises setups. | AWS Database Migration Service User Guide |
Amazon ElastiCache | Makes it easy to set up, manage, and scale distributed in-memory cache environments in the AWS Cloud. It provides a high performance, resizable, and cost-effective in-memory cache, while removing complexity associated with deploying and managing a distributed cache environment. | Amazon ElastiCache Documentation |
Amazon EventBridge Pipes | Intended for point-to-point integrations between supported sources and targets, with support for advanced transformations and enrichment. | Amazon EventBridge User Guide |
AWS Fargate | Serverless compute engine for containers that works with Amazon Elastic Container Service (ECS). Fargate manages the orchestration of the data pipeline (parser and converter) container images. | AWS Fargate |
AWS Glue | Serverless data integration service that discovers, prepares, and combines data for analytics, machine learning, and application development. Each organization has a separate Glue database. Tables for the organization are created in that database. | AWS Glue |
AWS Identity and Access Management (IAM) | Manages access to AWS services and resources securely. Used to manage AWS users and groups, and the permissions that allow or deny their access to AWS resources. For example, during organization provisioning, an IAM policy is created that allows access only to that organization's Athena folder. | IAM |
AWS IoT Core | Suite of services that provides a means to connect, secure, control, and manage devices. AWS requests from the data connector are encrypted and authenticated using short-lived IAM credentials. These IAM credentials are generated and refreshed using certificates. IAM roles and policies interact with ECR, KMS, and CloudWatch, and grant permissions to upload specific S3 objects. | AWS IoT |
AWS Key Management Service | Creates and manages cryptographic keys and controls their use across a wide range of AWS services and in applications. Each organization is provisioned a KMS key. Organization data is encrypted with this key. | AWS KMS |
Amazon Kinesis Data Streams | Collects and processes large streams of data records in real time. | Amazon Kinesis Data Streams Developer Guide |
AWS Lambda | Serverless computing service that enables you to run code without provisioning or managing servers, creating workload-aware cluster scaling logic, maintaining event integrations, or managing runtimes. Lambda functions assume a role that permits reading only from a specific Amazon S3 location with your organization's AWS KMS key. Lambda functions also receive Amazon SNS messages. | AWS Lambda |
Amazon OpenSearch Service | A managed service that makes it easy to deploy, operate, and scale OpenSearch clusters in the AWS Cloud. | Amazon OpenSearch Service Developer Guide |
Amazon Relational Database Service (Amazon RDS) | Scalable, relational database in the cloud that automates time-consuming administration tasks such as hardware provisioning, database setup, patching, and backups. The trigger information, used to initiate file processing, is stored in a Postgres RDS database. Lambda, SQS, and SNS work together to evaluate triggers. | Amazon RDS |
Amazon Route 53 | Highly available and scalable cloud Domain Name System (DNS) web service that connects user requests to infrastructure running in AWS, such as Amazon EC2 instances, Elastic Load Balancing load balancers, or Amazon S3 buckets. Route 53 can also be used to route users to infrastructure outside of AWS. In TDP, Route 53 is used internally to allow platform microservices to discover and communicate with each other. Optionally, external-facing entries can be created for the platform endpoints, but this feature can be turned off. | Amazon Route53 |
AWS Security Token Service (AWS STS) | Service that provides temporary, limited-privilege credentials for AWS Identity and Access Management (IAM) users or other authenticated users. Certificates used in API calls periodically request temporary AWS credentials through AWS STS. | AWS STS |
AWS Service Catalog | Used to create and manage catalogs of IT services that are approved for use on AWS. In TDP, information about the installed platform, services, and upgrades is available in the service catalog. | AWS Service Catalog |
Amazon Simple Email Service (SES) | Cost-effective, flexible, and scalable email service that enables developers to send mail from within any application. Email notifications can be configured with SES to report when a data export to an outside destination fails or when a pipeline fails. | Amazon SES |
Amazon Simple Storage Service (Amazon S3) | Object storage service that offers scalability, data availability, security, and performance. TetraScience artifacts (RAW and processed files, along with other artifacts) are stored in S3. | Amazon S3 |
Amazon Simple Queue Service (Amazon SQS) | Fully managed message queuing service that decouples and scales microservices, distributed systems, and serverless applications. Sends, stores, and receives messages between software components at any volume, without losing messages or requiring other services to be available. In TDP, Lambda, SQS, and SNS work together to evaluate triggers. | Amazon SQS |
Amazon Simple Notification Service (Amazon SNS) | Fully managed messaging service for both application-to-application (A2A) and application-to-person (A2P) communication. In TDP, Lambda, SQS, and SNS work together to evaluate triggers. | Amazon SNS |
AWS Step Functions | (For customers activating Tetra Data & AI Workspace only) AWS Step Functions provides the ability to create workflows (state machines) to build distributed applications, automate processes, orchestrate microservices, and create data and machine learning pipelines. | AWS Step Functions |
AWS Systems Manager | Unified user interface used to view operational data from multiple AWS services and automate operational tasks across AWS resources. AWS Systems Manager is used to establish communication from the platform to the Tetra Hub or Data Hub. The credentials used for Data Lake authentication are communicated to the data connectors through AWS Systems Manager. For example, AWS Systems Manager is needed to add a new connector and to update a connector's configuration. | AWS Systems Manager |
Amazon Time Sync Service | Provided by Amazon. It is accessible from all EC2 instances and is also used by other AWS services. This service uses a fleet of satellite-connected and atomic reference clocks in each region to deliver accurate current time readings of the Coordinated Universal Time (UTC) global standard through the Network Time Protocol (NTP). | Amazon Time Sync Service |
Amazon Virtual Private Cloud (Amazon VPC) | Service that launches AWS resources in a logically isolated virtual network. | Amazon VPC |
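
The IAM row above mentions a policy, created during organization provisioning, that allows access only to an organization's Athena folder. The sketch below shows what a prefix-scoped policy of that kind could look like. It is an illustration under stated assumptions, not the policy TDP actually creates: the function name `build_org_athena_policy`, the bucket name, the `<org_slug>/athena/` prefix layout, and the chosen S3 actions are all hypothetical.

```python
import json


def build_org_athena_policy(bucket: str, org_slug: str) -> dict:
    """Build an IAM policy document limited to one organization's Athena prefix.

    Assumption: the organization's Athena folder is s3://<bucket>/<org_slug>/athena/.
    The real TDP bucket layout and policy statements may differ.
    """
    prefix = f"{org_slug}/athena/"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Allow listing the bucket, but only under this organization's prefix.
                "Sid": "ListOrgAthenaPrefix",
                "Effect": "Allow",
                "Action": "s3:ListBucket",
                "Resource": f"arn:aws:s3:::{bucket}",
                "Condition": {"StringLike": {"s3:prefix": [f"{prefix}*"]}},
            },
            {
                # Allow reading and writing objects only under the same prefix.
                "Sid": "ReadWriteOrgAthenaObjects",
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject"],
                "Resource": f"arn:aws:s3:::{bucket}/{prefix}*",
            },
        ],
    }


if __name__ == "__main__":
    # Hypothetical bucket and organization names, for illustration only.
    policy = build_org_athena_policy("example-datalake-bucket", "example-org")
    print(json.dumps(policy, indent=2))
```

Because IAM denies by default, a policy containing only these two `Allow` statements leaves every other prefix in the bucket inaccessible to the principal it is attached to, unless another attached policy grants that access.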