Data Acquisition Security

This page describes the data flow, access management policies, and encryption details for the Tetra Data Platform (TDP) and how the platform handles data acquisition security.

Data Isolation

The TDP uses several Amazon Simple Storage Service (Amazon S3) buckets to store RAW data, transformed or standardized data, and data pipeline artifacts. Each Amazon S3 bucket is shared among all organizations. However, the top-level orgSlug key partitions and isolates the data for each organization. This isolation is enforced through an AWS Identity and Access Management (IAM) policy.
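As a minimal sketch of the partitioning idea, the following Python shows how an object key can be placed under an organization's top-level prefix and how a prefix check separates organizations. The function names and the `raw/instrument-a/...` key layout are hypothetical and are not taken from the TDP. In the platform itself, isolation is enforced by IAM policy rather than by application code.

```python
def org_key(org_slug: str, relative_path: str) -> str:
    """Build an S3 object key that sits under the organization's top-level prefix."""
    if not org_slug or "/" in org_slug:
        raise ValueError("org_slug must be a single, non-empty path segment")
    return f"{org_slug}/{relative_path.lstrip('/')}"


def belongs_to_org(key: str, org_slug: str) -> bool:
    """Return True only if the key is inside the organization's partition."""
    # The trailing slash matters: without it, "exampleco" would also
    # match keys that start with "exampleco2/".
    return key.startswith(f"{org_slug}/")


# Hypothetical key layout, for illustration only.
key = org_key("exampleco", "raw/instrument-a/run-001.csv")
print(key)
print(belongs_to_org(key, "exampleco"))
print(belongs_to_org("exampleco2/raw/run-001.csv", "exampleco"))
```

The same trailing-slash concern applies to IAM resource patterns, which is why the example policies on this page scope access to `ORG_SLUG/` followed by a wildcard.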

Amazon S3 Bucket Limits

The TDP shares buckets across organizations rather than creating one bucket for each organization for the following reasons. By default, AWS only allows 100 buckets per account. Although this limit can be increased on request, Amazon S3 bucket names must also be globally unique across all AWS customers. Even if buckets are created through AWS CloudFormation or another automated mechanism, maintaining them to keep their endpoints secure and to avoid naming collisions is a burden. Additionally, the IAM policies that you would write for a specific bucket are fundamentally the same as the folder-level policies that TetraScience provides.

Encryption

Every organization on the TDP is automatically provisioned with a separate AWS Key Management Service (AWS KMS) key. AWS KMS uses the Advanced Encryption Standard (AES) algorithm in Galois/Counter Mode (GCM), known as AES-GCM, with 256-bit secret keys. Each KMS key automatically rotates yearly.

Server-Side Encryption with AWS KMS-Managed Keys (SSE-KMS) is used to encrypt all data at rest.
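For illustration, the sketch below builds the request parameters that ask Amazon S3 to apply SSE-KMS with a specific key when an object is written. The parameter names (`ServerSideEncryption`, `SSEKMSKeyId`) are those of the S3 `PutObject` API as exposed by boto3. The helper name and all example values are hypothetical, and how TDP components issue this request internally is not described on this page.

```python
def sse_kms_put_args(bucket: str, key: str, body: bytes, kms_key_id: str) -> dict:
    """Return PutObject keyword arguments that request SSE-KMS encryption."""
    return {
        "Bucket": bucket,
        "Key": key,
        "Body": body,
        # "aws:kms" selects server-side encryption with an AWS KMS key.
        "ServerSideEncryption": "aws:kms",
        # The organization-specific KMS key (key ID or key ARN).
        "SSEKMSKeyId": kms_key_id,
    }


# Hypothetical values, for illustration only.
args = sse_kms_put_args(
    bucket="example-tdp-bucket",
    key="exampleco/raw/run-001.csv",
    body=b"sample,data\n",
    kms_key_id="EXAMPLE-KEY-ID",
)
print(args["ServerSideEncryption"])

# With boto3 installed and credentials configured, the upload would be:
#   import boto3
#   boto3.client("s3").put_object(**args)
```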

Sensitive configuration parameters, such as usernames and passwords for Data Connectors, are stored in AWS Systems Manager Parameter Store and are encrypted with your organization-specific KMS key. Parameter Store is a service that provides secure, hierarchical storage for configuration data management and secrets management.

Access to encrypt and decrypt with your organization’s KMS key is only granted through AWS Identity and Access Management (IAM) roles and policies.

Identity and Access Management

Security and access controls for data and AWS resources are enforced through AWS Identity and Access Management (IAM) policies and roles. All IAM users, policies, and roles for your organization are generated automatically through AWS CloudFormation. All infrastructure changes, including changes to IAM resources, can only be made through version control, code review, and TetraScience’s automated build system.

IAM Policy Examples

The following are example IAM policies for IAM users, Tetra Hubs, and Data Hubs.

IAM User Policy Example

This example IAM user policy provides permissions for the following actions:

  1. Access data only for your organization within a predefined TetraScience Amazon S3 bucket
  2. Decrypt those Amazon S3 objects with your organization-specific AWS KMS key
  3. Access the AWS Glue database and tables for your organization
  4. Read results written by Amazon Athena to a predefined folder within the Athena results bucket. The folder name must match your orgSlug.

To learn how to attach the policy to the existing IAM user, see Adding and removing IAM identity permissions in the AWS documentation.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "S3Access",
      "Effect": "Allow",
      "Action": [
        "s3:Get*"
      ],
      "Resource": [
        "arn:aws:s3:::TS_ATHENA_BUCKET/ORG_SLUG/**"
      ]
    },
    {
      "Sid": "S3ResultsAccess",
      "Effect": "Allow",
      "Action": [
        "s3:Get*"
      ],
      "Resource": [
        "arn:aws:s3:::TS_ATHENA_RESULTS_BUCKET/ORG_SLUG",
        "arn:aws:s3:::TS_ATHENA_RESULTS_BUCKET/ORG_SLUG/**"
      ]
    },
    {
      "Sid": "AccessKMSKey",
      "Effect": "Allow",
      "Action": [
        "kms:Decrypt",
        "kms:GenerateDataKey",
        "kms:DescribeKey"
      ],
      "Resource": [
        "arn:aws:kms:::key/YOUR_ORGANIZATION_KEY_ID"
      ]
    },
    {
      "Effect": "Allow",
      "Action": [
        "athena:GetQueryResults",
        "athena:GetTable",
        "athena:GetTables",
        "athena:RunQuery",
        "athena:StartQueryExecution",
        "athena:StopQueryExecution"
      ],
      "Resource": [
        "*"
      ]
    },
    {
      "Sid": "GluePermissions",
      "Effect": "Allow",
      "Action": [
        "glue:Get*"
      ],
      "Resource": [
        "arn:aws:glue:*:*:catalog",
        "arn:aws:glue:*:*:database/ORG_SLUG",
        "arn:aws:glue:*:*:table/ORG_SLUG/*"
      ]
    }
  ]
}

📘

Note

The Tetra Data Platform can create the IAM user for you. This behavior is configurable at the environment level with the deployment parameter AthenaCreateIamUser.

The IAM policy name is constructed as: ts-athena-<aws-region>-<environment>-<organization-slug>-policy (for example: ts-athena-us-east-2-production-tetrascience-policy)
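The naming pattern can be expressed as a one-line helper. The function name is hypothetical; the pattern and the example values come from the note above.

```python
def athena_policy_name(aws_region: str, environment: str, org_slug: str) -> str:
    """Compose the policy name: ts-athena-<aws-region>-<environment>-<organization-slug>-policy."""
    return f"ts-athena-{aws_region}-{environment}-{org_slug}-policy"


print(athena_policy_name("us-east-2", "production", "tetrascience"))
# -> ts-athena-us-east-2-production-tetrascience-policy
```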

Tetra Hub and Data Hub IAM Policy Example

This example IAM policy, which applies to both Tetra Hubs and Tetra Data Hubs, provides permissions for the following actions:

  1. Upload to a TetraScience-managed Amazon Simple Storage Service (Amazon S3) bucket under your organization’s top-level folder
  2. Encrypt data with your organization’s AWS KMS key
  3. Download Tetra Integration images to collect new data from your environment

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "UploadToS3",
      "Effect": "Allow",
      "Action": [
        "s3:PutObject"
      ],
      "Resource": [
        "arn:aws:s3:::${bucket}/YOUR_ORGANIZATION_ORG_SLUG/*"
      ]
    },
    {
      "Sid": "AccessKMSKey",
      "Effect": "Allow",
      "Action": [
        "kms:Encrypt",
        "kms:GenerateDataKey",
        "kms:DescribeKey"
      ],
      "Resource": [
        "arn:aws:kms:::key/YOUR_ORGANIZATION_KEY_ID"
      ]
    },
    {
      "Sid": "DownloadDataConnectorImages",
      "Effect": "Allow",
      "Action": [
        "ecr:GetDownloadUrlForLayer",
        "ecr:ListImages",
        "ecr:BatchGetImage",
        "ecr:DescribeImages",
        "ecr:BatchCheckLayerAvailability",
        "ecr:GetRepositoryPolicy"
      ],
      "Resource": [
        "arn:aws:ecr:us-east-1:xxxxxxxxxxxx:repository/data-connector-*"
      ]
    }
  ]
}
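As a rough sketch of how the placeholders in a policy like the one above could be filled in for one organization, the following Python renders the upload and KMS statements as JSON. The function name and all example values (bucket, orgSlug, account ID, key ID) are hypothetical. The ECR statement is omitted for brevity, and in the TDP these policies are generated through AWS CloudFormation rather than by hand.

```python
import json


def hub_policy(bucket: str, org_slug: str, kms_key_arn: str) -> dict:
    """Build an upload-only policy scoped to one organization's top-level folder."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "UploadToS3",
                "Effect": "Allow",
                "Action": ["s3:PutObject"],
                # Only keys under the organization's folder are writable.
                "Resource": [f"arn:aws:s3:::{bucket}/{org_slug}/*"],
            },
            {
                "Sid": "AccessKMSKey",
                "Effect": "Allow",
                "Action": ["kms:Encrypt", "kms:GenerateDataKey", "kms:DescribeKey"],
                "Resource": [kms_key_arn],
            },
        ],
    }


# Hypothetical values, for illustration only.
policy = hub_policy(
    bucket="example-tdp-bucket",
    org_slug="exampleco",
    kms_key_arn="arn:aws:kms:us-east-1:111122223333:key/EXAMPLE-KEY-ID",
)
print(json.dumps(policy, indent=2))
```

Note that the policy grants `s3:PutObject` but no read actions, so a Hub holding these permissions can upload data without being able to read it back.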

Organization and Infrastructure Provisioning

All infrastructure changes, including organization-specific IAM policies, roles, and users, can only be modified through version control, code review, and TetraScience’s automated build system. Direct changes to infrastructure, resources, or application code are prohibited.

The following diagram shows an example flow of infrastructure changes made through AWS CloudFormation:

[Diagram: Infrastructure Changes through CloudFormation]

📘

NOTE

When you create a new organization on the TDP, an orgSlug is created and assigned to your organization. An orgSlug is a unique identifier used to create logical separation for data and data access. For example, if your company is called Example Company, your orgSlug might be exampleco. The orgSlug is typically hidden from you and your organization. However, if your organization needs direct access to data in Amazon S3, or access through Amazon Athena, you will see references to your orgSlug.

Tetra Hubs and Data Hubs

Tetra Hub and Data Hub are the on-premises software components of the Tetra Data Platform (TDP). They facilitate secure data transfer between the TDP and Connectors and Agents, which can each pull or receive data from individual data sources. A single Hub can integrate with many Connectors and Agents, allowing it to interact with many data sources.

To transfer data to the TDP securely, Tetra Hub uses Amazon Elastic Container Service (Amazon ECS) as well as AWS Systems Manager. Tetra Data Hub uses AWS Systems Manager and AWS IoT. For more information, see Security Considerations on the Tetra Hub and Data Hub page.

AWS Systems Manager (SSM)

AWS Systems Manager Agent (SSM Agent) lets you remotely and securely manage on-premises servers and virtual machines (VMs) in your hybrid environment. It's Amazon software that runs on your Amazon Elastic Compute Cloud (Amazon EC2) instances and your hybrid instances that are configured for AWS Systems Manager (hybrid instances). The SSM Agent processes requests from the AWS Systems Manager service in the cloud and configures your machine as specified in the request. The SSM Agent then sends status and execution information back to the AWS Systems Manager service.

AWS IoT

📘

NOTE

AWS IoT is used for Tetra Data Hubs only. Tetra Hubs don't use AWS IoT. Instead, Tetra Hubs use Amazon ECS and AWS Systems Manager.

AWS IoT enables internet-connected devices to connect to the AWS Cloud and lets applications in the cloud interact with internet-connected devices.

The Tetra Data Hub uses an X.509 certificate to connect to AWS IoT using Transport Layer Security (TLS) mutual authentication protocols. Other AWS services don't support certificate-based authentication, but they can be called by using AWS credentials in AWS Signature Version 4 format. The Signature Version 4 algorithm normally requires the caller to have an access key ID and a secret access key. AWS IoT has a credentials provider that allows you to use the built-in X.509 certificate as the unique device identity to authenticate AWS requests. This eliminates the need to store an access key ID and a secret access key on your device.

The credentials provider authenticates a caller using an X.509 certificate and issues a temporary, limited-privilege security token. The token can be used to sign and authenticate any AWS request. This way of authenticating your AWS requests requires you to create and configure an AWS Identity and Access Management (IAM) role and attach appropriate IAM policies to the role so that the credentials provider can assume the role on your behalf.
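The request and response shapes can be sketched as follows. The URL path (`/role-aliases/<role-alias>/credentials`), the optional `x-amzn-iot-thingname` header, and the response field names follow the AWS IoT credentials provider documentation. The endpoint, role alias, thing name, and credential values are placeholders, and the names that a Tetra Data Hub actually uses are not stated on this page.

```python
import json
from datetime import datetime, timezone


def credentials_request(endpoint: str, role_alias: str, thing_name: str):
    """Return the URL and headers for a credentials provider request."""
    url = f"https://{endpoint}/role-aliases/{role_alias}/credentials"
    headers = {"x-amzn-iot-thingname": thing_name}
    # The X.509 certificate is not sent as a header. It is presented during the
    # TLS handshake, for example through ssl.SSLContext.load_cert_chain(...).
    return url, headers


def parse_credentials(body: str) -> dict:
    """Extract the temporary credentials from the provider's JSON response."""
    creds = json.loads(body)["credentials"]
    expiration = datetime.strptime(
        creds["expiration"], "%Y-%m-%dT%H:%M:%SZ"
    ).replace(tzinfo=timezone.utc)
    return {
        "access_key_id": creds["accessKeyId"],
        "secret_access_key": creds["secretAccessKey"],
        "session_token": creds["sessionToken"],
        "expiration": expiration,
    }


# Placeholder values, for illustration only.
url, headers = credentials_request(
    "example.credentials.iot.us-east-1.amazonaws.com",
    "example-role-alias",
    "example-data-hub",
)
print(url)

sample_response = json.dumps({
    "credentials": {
        "accessKeyId": "EXAMPLEACCESSKEYID",
        "secretAccessKey": "example-secret",
        "sessionToken": "example-token",
        "expiration": "2018-01-18T09:18:06Z",
    }
})
print(parse_credentials(sample_response)["expiration"].isoformat())
```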

The following diagram illustrates the credentials provider workflow.

[Diagram: Credentials provider workflow]

When the Tetra Data Hub is activated after installation, an IoT X.509 certificate with an organization-specific policy is downloaded to the Data Hub machine. Temporary credentials are requested every 30 minutes and are valid for 1 hour. The IoT certificate created for each Data Hub can be revoked from the Tetra Data Platform if necessary.
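One way to picture the refresh schedule is a small cache that re-fetches credentials once they are 30 minutes old, which is half of their 1-hour validity. This is a sketch of the timing only, not the Data Hub's actual implementation. The class name, the `fetch` callback, and the injectable clock are all hypothetical.

```python
import time
from typing import Callable, Optional

REFRESH_INTERVAL_SECONDS = 30 * 60   # request new credentials every 30 minutes
VALIDITY_SECONDS = 60 * 60           # each set of credentials is valid for 1 hour


class CredentialCache:
    """Serve cached credentials and refresh them once they are 30 minutes old."""

    def __init__(self, fetch: Callable[[], dict],
                 clock: Callable[[], float] = time.monotonic):
        self._fetch = fetch            # hypothetical callback that obtains credentials
        self._clock = clock            # injectable so the timing can be tested
        self._creds: Optional[dict] = None
        self._fetched_at = 0.0

    def get(self) -> dict:
        now = self._clock()
        stale = self._creds is None or now - self._fetched_at >= REFRESH_INTERVAL_SECONDS
        if stale:
            self._creds = self._fetch()
            self._fetched_at = now
        return self._creds


# Demonstration with a fake clock instead of real waiting.
fake_now = [0.0]
fetch_count = [0]


def fake_fetch() -> dict:
    fetch_count[0] += 1
    return {"generation": fetch_count[0]}


cache = CredentialCache(fake_fetch, clock=lambda: fake_now[0])
print(cache.get())        # first call fetches
fake_now[0] = 29 * 60
print(cache.get())        # still within 30 minutes, served from the cache
fake_now[0] = 30 * 60
print(cache.get())        # 30 minutes old, fetched again
```

Refreshing at half of the validity period leaves a margin: if one refresh attempt fails, the credentials already held remain valid for roughly another 30 minutes.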

The following procedure summarizes the steps used to securely retrieve temporary credentials so that the Tetra Data Hub can communicate directly with a specific set of AWS resources and services for your organization:

  1. The AWS IoT device makes an HTTPS request to the credentials provider for a security token. The request includes the device X.509 certificate for authentication.
  2. The credentials provider forwards the request to the AWS IoT authentication and authorization module to validate the certificate and verify that it has permission to request the security token.
  3. If the certificate is valid and has permission to request a security token, the AWS IoT authentication and authorization module returns success. Otherwise, it sends an exception to the device.
  4. After successfully validating the certificate, the credentials provider invokes the AWS Security Token Service (AWS STS) to assume the IAM role that you created for it.
  5. AWS STS returns a temporary, limited-privilege security token to the credentials provider.
  6. The credentials provider returns the security token to the device.
  7. The device uses the security token to sign an AWS request with AWS Signature Version 4.
  8. The requested service invokes IAM to validate the signature and authorize the request against access policies attached to the IAM role that you created for the credentials provider.
  9. If IAM validates the signature successfully and authorizes the request, the request succeeds. Otherwise, IAM sends an exception.
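Step 7 in the procedure above relies on AWS Signature Version 4. The core of that algorithm is a chain of HMAC-SHA256 operations that derives a signing key from the secret access key, which the sketch below reproduces with the Python standard library. It covers key derivation and the final signature only. Building the canonical request and the string to sign is omitted, and all input values are placeholders. In practice an AWS SDK performs these steps.

```python
import hashlib
import hmac


def _hmac_sha256(key: bytes, message: str) -> bytes:
    return hmac.new(key, message.encode("utf-8"), hashlib.sha256).digest()


def derive_signing_key(secret_access_key: str, date_stamp: str,
                       region: str, service: str) -> bytes:
    """Derive the SigV4 signing key: date -> region -> service -> 'aws4_request'."""
    k_date = _hmac_sha256(("AWS4" + secret_access_key).encode("utf-8"), date_stamp)
    k_region = _hmac_sha256(k_date, region)
    k_service = _hmac_sha256(k_region, service)
    return _hmac_sha256(k_service, "aws4_request")


def sign(signing_key: bytes, string_to_sign: str) -> str:
    """Return the hex-encoded signature for an already-built string to sign."""
    return hmac.new(signing_key, string_to_sign.encode("utf-8"),
                    hashlib.sha256).hexdigest()


# Placeholder inputs, for illustration only.
signing_key = derive_signing_key("example-secret", "20240101", "us-east-1", "s3")
signature = sign(signing_key, "placeholder string to sign")
print(len(signing_key), len(signature))

# With temporary credentials, the session token returned in step 6 is also sent
# with the request in the X-Amz-Security-Token header.
```

Because the date, region, and service are mixed into the key, a signing key derived for one day, region, or service cannot be reused to sign requests for another.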