TetraScience

Tetra Data Platform Documentation

Welcome to the TetraScience Tetra Data Platform (TDP) documentation site. Here, you'll find product documentation, API documentation, and release notes for TDP components.


Required AWS Services

This topic lists the Amazon Web Services (AWS) services that the Tetra Data Platform currently requires.

The Tetra Data Platform (TDP) is deployed from two CloudFormation stacks, each containing multiple nested stacks. The stacks are packaged, versioned, and made available to clients as AWS Service Catalog products.

Application code is containerized and runs on Amazon Elastic Container Service (ECS). Some code also runs in AWS Lambda. Data is stored in S3. Resource-intensive auxiliary services, such as Elasticsearch and the PostgreSQL database, have dedicated clusters.

The following table provides more detail on the AWS Services currently in use.

NOTE: If the private subnets where TDP will be deployed are restricted and do not have outbound access to the internet, read VPC Endpoints for important details.

Versions

Versions of all software can be found in the installation scripts.

Upgrades

All of the AWS components listed below are tested together internally by TetraScience. Upgrading to a new Platform release will also upgrade selected AWS components as necessary.

Components

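Each entry in the table below gives the microservice name, a description of the service and how TDP uses it, and the AWS page to consult for more information.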

AWS Athena

An interactive query service that allows you to analyze data in Amazon S3 using SQL.

TDP parsers convert data into numerical or tabular form that is useful to data scientists. Athena offers a JDBC interface for querying the converted data, which is stored as Parquet. Files from the main data lake bucket are restructured and transformed into an Athena bucket for more efficient SQL queries.

AWS Athena Page
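
To illustrate how converted data might be queried with SQL, the following sketch submits a statement through the Athena API and waits for it to finish. It assumes a boto3 Athena client is passed in as `athena`. The function name, database name, and output location are hypothetical and are not TDP's actual names.

```python
import time


def run_athena_query(athena, sql, database, output_location, poll_seconds=1.0):
    """Submit a SQL query to Athena and block until it reaches a final state.

    `athena` is assumed to be a boto3 Athena client, for example
    boto3.client("athena"). All other names are illustrative.
    """
    started = athena.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": database},
        ResultConfiguration={"OutputLocation": output_location},
    )
    query_id = started["QueryExecutionId"]

    # Poll until Athena reports a terminal state for the query.
    while True:
        execution = athena.get_query_execution(QueryExecutionId=query_id)
        state = execution["QueryExecution"]["Status"]["State"]
        if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
            return query_id, state
        time.sleep(poll_seconds)
```

A call might look like `run_athena_query(client, "SELECT * FROM example_table LIMIT 10", "example_org_db", "s3://example-results-bucket/athena/")`, where every argument value is a placeholder.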

AWS AutoScaling

Monitors applications and automatically adjusts capacity to maintain steady, predictable performance.

AWS AutoScaling Page

AWS Certificate Manager

Provisions, manages, and deploys public and private Secure Sockets Layer/Transport Layer Security (SSL/TLS) certificates for use with AWS services and internal connected resources.

TDP uses AWS Certificate Manager to deploy and manage certificates.

Certificate Manager

AWS CloudFormation

Models a collection of related AWS and third-party resources, and provisions and manages them throughout their lifecycles, by treating infrastructure as code.

TDP is deployed from two CloudFormation stacks.

CloudFormation
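
As a rough picture of what a stack that contains nested stacks looks like, the sketch below builds a parent template in which each resource is an `AWS::CloudFormation::Stack` pointing at a child template. The function name, logical IDs, and template URLs are hypothetical. The actual TDP templates are not shown in this topic.

```python
def build_parent_template(nested_templates):
    """Return a parent CloudFormation template made only of nested stacks.

    `nested_templates` maps a logical ID (alphanumeric) to the S3 URL of a
    child template. Both are placeholders chosen for illustration.
    """
    resources = {
        logical_id: {
            "Type": "AWS::CloudFormation::Stack",
            "Properties": {"TemplateURL": template_url},
        }
        for logical_id, template_url in nested_templates.items()
    }
    return {"AWSTemplateFormatVersion": "2010-09-09", "Resources": resources}
```

Passing the result to `json.dumps` yields a JSON template body. A real deployment would also need parameters, outputs, and dependencies between the nested stacks, which this sketch leaves out.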

AWS CloudTrail

Service that enables governance, compliance, operational auditing, and risk auditing of an AWS account. Logs, continuously monitors, and retains account activity related to actions across a customer's AWS infrastructure.

TDP implements active logging and monitoring with CloudTrail and CloudWatch.

CloudTrail

AWS CloudWatch

Collects monitoring and operational data in the form of logs, metrics, and events, providing a unified view of AWS resources, applications, and services that run on AWS and on on-premises servers.

TDP implements active logging and monitoring with CloudTrail and CloudWatch. Audit trail information is saved in CloudWatch.

AWS CloudWatch Page

AWS CodeBuild

Fully managed continuous integration service that compiles source code, runs tests, and produces software packages that are ready to deploy.

CodeBuild

AWS Cognito

Amazon Cognito lets you add user sign-up, sign-in, and access control to your web and mobile apps quickly and easily. Amazon Cognito scales to millions of users and supports sign-in with social identity providers, such as Apple, Facebook, Google, and Amazon, and enterprise identity providers via SAML 2.0 and OpenID Connect.

Required only if you use single sign-on (SSO).

AWS Cognito

AWS Config

Service that enables the assessment, auditing, and evaluation of AWS resource configurations.

Config

AWS Elastic Container Registry (ECR)

Fully managed container registry that is used to store, manage, share, and deploy container images and artifacts.

Connectors are published to ECR as Alpine Docker containers.

Elastic Container Registry

AWS Elastic Container Service (ECS)

Fully managed container orchestration service for deploying, managing, and scaling containerized applications.

ECS runs the platform API services and the web application backend. Each service is defined by a container image in the CloudFormation template.

Elastic Container Service

AWS Elastic Compute Cloud (EC2)

Secure, resizable compute capacity in the AWS cloud.

Certain parsers that require the Windows platform run on ephemeral Amazon EC2 instances. These instances run the file parsers, read the raw data, and write back the parsed files.

Elastic Compute Cloud

AWS Elasticsearch Service

Fully managed service that deploys, secures, and runs Elasticsearch at scale.

Information about received files is indexed and stored in Amazon Elasticsearch Service.

Elasticsearch

AWS Fargate

Serverless compute engine for containers that works with Amazon Elastic Container Service (ECS).

Fargate manages orchestration of the data pipeline container images (parsers and converters).

Fargate

AWS Glue

Serverless data integration service that discovers, prepares, and combines data for analytics, machine learning, and application development.

Each organization has a separate Glue database. Tables for the organization are created in that database.

Glue

AWS Identity and Access Management (IAM)

Manages access to AWS services and resources securely.

Used to manage AWS users and groups, and the permissions that allow or deny their access to AWS resources. For example, during organization provisioning, an IAM policy is created that allows access only to that organization's Athena folder.

Identity and Access Management
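
A policy of the kind described above could be shaped like the following sketch, which limits S3 access to a single prefix of a single bucket. This is an illustration under assumptions, not the policy that TDP generates: the function name, bucket name, prefix layout, and the chosen S3 actions are all hypothetical.

```python
def athena_folder_policy(bucket, org_prefix):
    """Build an IAM policy document limited to one prefix of one bucket.

    `bucket` and `org_prefix` are placeholders. The action list is an
    assumption made for illustration.
    """
    prefix = org_prefix.strip("/")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Listing is granted on the bucket, restricted by prefix.
                "Sid": "ListOrgFolder",
                "Effect": "Allow",
                "Action": "s3:ListBucket",
                "Resource": f"arn:aws:s3:::{bucket}",
                "Condition": {"StringLike": {"s3:prefix": [f"{prefix}/*"]}},
            },
            {
                # Object access is granted only under the organization prefix.
                "Sid": "ReadWriteOrgFolder",
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject"],
                "Resource": f"arn:aws:s3:::{bucket}/{prefix}/*",
            },
        ],
    }
```

Because IAM denies anything that is not explicitly allowed, a policy with only these two statements leaves every other prefix in the bucket inaccessible to the principal it is attached to.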

AWS Internet of Things (IoT)

Suite of services that provides a means to connect, secure, control, and manage devices.

AWS requests from the data connector are encrypted and authenticated using short-lived IAM credentials. These IAM credentials are generated and refreshed using certificates. IAM roles and policies govern the connector's interactions with ECR, KMS, and CloudWatch, and grant permission to upload specific S3 objects.

IoT

AWS Key Management Service (KMS)

Creates and manages cryptographic keys and controls their use across a wide range of AWS services and in applications.

Each organization is provisioned a KMS key. Organization data is encrypted with this key.

Key Management Service
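
This topic does not say how the organization key is applied. One common approach is S3 server-side encryption with a KMS key (SSE-KMS), sketched below. The function assumes a boto3 S3 client is passed in as `s3`. The function name, bucket, object key, and KMS key identifier are hypothetical.

```python
def put_encrypted_object(s3, bucket, key, body, kms_key_id):
    """Upload an object and ask S3 to encrypt it with a specific KMS key.

    `s3` is assumed to be a boto3 S3 client. Whether TDP uses SSE-KMS in
    this way is an assumption made for illustration.
    """
    return s3.put_object(
        Bucket=bucket,
        Key=key,
        Body=body,
        ServerSideEncryption="aws:kms",
        SSEKMSKeyId=kms_key_id,
    )
```

With a separate key per organization, access to one organization's data can be revoked or audited through that key alone.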

AWS Lambda

Serverless computing service that runs code without provisioning or managing servers, creating workload-aware cluster scaling logic, maintaining event integrations, or managing runtimes.

Lambda functions assume a role that permits reading only from a specific S3 location, using your organization's KMS key. Lambda functions also receive SNS messages.

Lambda
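
For readers unfamiliar with how a Lambda function receives SNS messages, here is a minimal handler sketch. It relies on the standard SNS event shape, in which each record carries the notification under the `Sns` key. The payload contents and the returned structure are hypothetical and do not represent TDP's message format.

```python
import json


def handler(event, context):
    """Lambda entry point for invocations triggered by an SNS topic."""
    messages = []
    for record in event.get("Records", []):
        sns = record["Sns"]
        raw = sns["Message"]
        # Publishers often send JSON, but SNS messages are plain strings,
        # so fall back to the raw text when parsing fails.
        try:
            payload = json.loads(raw)
        except json.JSONDecodeError:
            payload = raw
        messages.append({"topic": sns["TopicArn"], "payload": payload})
    return messages
```

A real handler would act on each payload rather than return it. Returning the parsed list keeps the sketch easy to test locally.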

AWS Relational Database Service (RDS)

Scalable relational database service in the cloud that automates time-consuming administration tasks such as hardware provisioning, database setup, patching, and backups.

Information about the triggers that determine how a file is processed is stored in a PostgreSQL RDS database. Lambda, SQS, and SNS work together to evaluate triggers.

Relational Database Service

AWS Route 53

Highly available and scalable cloud Domain Name System (DNS) web service that connects user requests to infrastructure running in AWS – such as Amazon EC2 instances, Elastic Load Balancing load balancers, or Amazon S3 buckets. Route 53 can also be used to route users to infrastructure outside of AWS.

AWS Route 53 is used internally to allow platform microservices to discover each other and communicate.
Optionally, external-facing entries can be created for the platform endpoints, but this feature can be turned off.

Route53

AWS Secrets Manager

Protects secrets needed to access applications, services, and IT resources.

Credentials used to invoke external APIs, access RDS instances, and so on are stored securely in AWS Secrets Manager.

AWS Secrets Manager

AWS Security Token Service (STS)

Service that provides temporary, limited-privilege credentials for AWS Identity and Access Management (IAM) users or other authenticated users.

Components that authenticate API calls with certificates periodically request temporary AWS credentials from AWS STS.

Security Token Service

AWS Service Catalog

Used to create and manage catalogs of IT services that are approved for use on AWS.

In TDP, information about the installed platform, services, and upgrades is available in Service Catalog.

Service Catalog

AWS Simple Email Service (SES)

Cost-effective, flexible, and scalable email service that enables developers to send mail from within any application.

Email notifications can be configured through SES to report when a pipeline or a data export to an outside destination fails.

Simple Email Service
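
A failure notification sent through SES might be assembled as in the sketch below. It assumes a boto3 SES client is passed in as `ses` and that the sender address is already verified in SES. The function name, addresses, and message wording are hypothetical.

```python
def send_failure_email(ses, sender, recipients, pipeline_name, error):
    """Send a plain-text failure notice through SES.

    `ses` is assumed to be a boto3 SES client. The subject and body text
    are placeholders, not TDP's notification format.
    """
    return ses.send_email(
        Source=sender,
        Destination={"ToAddresses": list(recipients)},
        Message={
            "Subject": {"Data": f"Pipeline failed: {pipeline_name}"},
            "Body": {"Text": {"Data": f"Pipeline {pipeline_name} failed with error: {error}"}},
        },
    )
```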

AWS Simple Storage Service (S3)

Object storage service that offers scalability, data availability, security, and performance.

TetraScience artifacts (RAW and processed files, along with other artifacts) are stored in S3.

Simple Storage Service

AWS Simple Queue Service (SQS)

Fully managed message queuing service that decouples and scales microservices, distributed systems, and serverless applications. Sends, stores, and receives messages between software components at any volume, without losing messages or requiring other services to be available.

In TDP, Lambda, SQS, and SNS work together to evaluate triggers.

Simple Queue Service

AWS Simple Notification Service (SNS)

Fully managed messaging service for both application-to-application (A2A) and application-to-person (A2P) communication.

In TDP, Lambda, SQS, and SNS work together to evaluate triggers.

Simple Notification Service
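
One common way for these three services to cooperate is for an SNS topic to deliver to an SQS queue that in turn invokes a Lambda function. The sketch below shows how such a function could unwrap the SNS notification from each SQS record. Whether TDP wires the services this way is an assumption, and the function names and message contents are hypothetical.

```python
import json


def unwrap_body(body):
    """Return the SNS message inside an SQS record body, or the body itself.

    Without raw message delivery, SNS wraps each message in a JSON envelope
    whose "Type" is "Notification" and whose "Message" holds the text.
    """
    try:
        envelope = json.loads(body)
    except json.JSONDecodeError:
        return body
    if isinstance(envelope, dict) and envelope.get("Type") == "Notification":
        return envelope["Message"]
    return body


def extract_messages(sqs_event):
    """Collect the message text from every record of an SQS-triggered event."""
    return [unwrap_body(record["body"]) for record in sqs_event.get("Records", [])]
```

Putting a queue between the topic and the function lets messages wait while the function is busy or failing, instead of being dropped.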

AWS Systems Manager (SSM)

Unified user interface used to view operational data from multiple AWS services and automate operational tasks across AWS resources.

SSM is used to establish communication from the platform to the Tetra Datahub. The credentials used for Data Lake authentication are communicated to the data connectors through SSM. For example, SSM is needed to add a new connector and to update a connector's configuration.

Systems Manager

AWS Virtual Private Cloud (VPC)

Service that launches AWS resources in a logically isolated virtual network.

Virtual Private Cloud

Updated 5 months ago

