URL Whitelisting

The Data Hub uses URLs that fall into the following three categories:

  • Category 1: AWS endpoints used by the Data Hub and the AWS SSM agent
  • Category 2: Endpoints used by connectors running in the Data Hub
  • Category 3: Endpoints used by the Data Hub installer

Category 1

AWS endpoints used by the Data Hub and the AWS SSM agent. These endpoints are used by the AWS SSM agent daemon installed on the Data Hub machine. All of them must be reachable on the standard HTTPS port (443):

ssm.[region].amazonaws.com
logs.[region].amazonaws.com
monitoring.[region].amazonaws.com
ssmmessages.[region].amazonaws.com
ec2messages.[region].amazonaws.com
s3.[region].amazonaws.com
sns.[region].amazonaws.com
sqs.[region].amazonaws.com
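
One way to confirm that a firewall rule set covers the endpoints above is to expand the [region] placeholder and attempt a TCP connection on port 443 to each host. The following is a minimal sketch and not an official TetraScience tool; the function names (category_1_hosts, is_reachable, check_hosts) are hypothetical and chosen only for illustration.

```python
import socket

# Service prefixes taken from the Category 1 list above.
CATEGORY_1_SERVICES = [
    "ssm", "logs", "monitoring", "ssmmessages",
    "ec2messages", "s3", "sns", "sqs",
]


def category_1_hosts(region):
    """Expand the [region] placeholder for every Category 1 endpoint."""
    return [f"{service}.{region}.amazonaws.com" for service in CATEGORY_1_SERVICES]


def is_reachable(host, port=443, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False


def check_hosts(hosts, probe=is_reachable):
    """Map each host to the probe result (True means the host was reachable)."""
    return {host: probe(host) for host in hosts}
```

Running check_hosts(category_1_hosts("us-east-1")) on the Data Hub machine returns a dictionary of host names and reachability flags; us-east-1 is only an example region. A successful TCP connection shows that port 443 is open to that host, but it does not prove that a proxy or TLS inspection device will pass the traffic unmodified.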

The Data Hub needs to refresh its AWS credentials periodically because they expire after one hour. Therefore, the AWS IoT credentials endpoints must also be reachable from the Data Hub machine on the standard HTTPS port (443):

  • *.credentials.iot.[region].amazonaws.com

  • [region] is the AWS region where the TetraScience stack is deployed. TetraScience uses us-east-1 for multi-tenant deployments.

📘

NOTE:

The * at the beginning of the URL is a wildcard. The actual host name of the endpoint depends on the AWS account and the region used for the platform deployment. If the actual name is required, refer to the following document to learn how to obtain the FQDN of the credentials endpoint using the AWS CLI: https://docs.aws.amazon.com/cli/latest/reference/iot/describe-endpoint.html
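
The lookup described in the note can also be scripted. The sketch below assumes that the boto3 package is installed and that AWS credentials for the deployment account are configured on the machine where it runs. It calls the AWS IoT describe_endpoint operation with the iot:CredentialProvider endpoint type and checks that the returned name matches the wildcard pattern above. The helper names are hypothetical.

```python
from fnmatch import fnmatchcase


def credentials_pattern(region):
    """Wildcard form of the IoT credentials endpoint for a region."""
    return f"*.credentials.iot.{region}.amazonaws.com"


def matches_credentials_pattern(fqdn, region):
    """Return True if fqdn fits the wildcard pattern for the given region."""
    # Host names are case-insensitive, so compare in lower case.
    return fnmatchcase(fqdn.lower(), credentials_pattern(region))


def lookup_credentials_endpoint(region):
    """Ask AWS IoT for the account-specific credentials endpoint FQDN."""
    import boto3  # imported lazily; assumes boto3 is installed

    client = boto3.client("iot", region_name=region)
    response = client.describe_endpoint(endpointType="iot:CredentialProvider")
    return response["endpointAddress"]
```

The value returned by lookup_credentials_endpoint can be given to the network team when a wildcard rule is not acceptable, and matches_credentials_pattern offers a quick sanity check that the name belongs to the expected region.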

Finally, the Data Hub also needs access to the following ECR endpoints to pull Docker images:

ecr.us-east-1.amazonaws.com
api.ecr.us-east-1.amazonaws.com
753968983172.dkr.ecr.us-east-1.amazonaws.com

Category 2

Endpoints used by connectors. Depending on which connectors are planned for the Data Hub, an additional set of URLs must be opened for the Data Hub machine.

  • The GDC connector relies on AWS infrastructure, so no additional URLs are needed.
  • The Cellario connector needs access to the Cellario URLs that it connects to and polls for new data. If the Cellario software is part of the internal network, no additional configuration is needed. If it is deployed externally, those URLs must be whitelisted.
  • The SDC connector needs access to the SDC URLs that it connects to and polls for new data. Like Cellario, SDC is usually deployed in an internal network but can be deployed in an external network. For an external network, the target SDC URLs must be whitelisted.

Category 3

Endpoints used by the Data Hub installation script (software prerequisites). These URLs are needed only if the related software is not pre-installed on the machine, and they are required at the time of Data Hub installation and activation:

URL Whitelisting for Agents

Endpoints used by agents that are set up in the S3 Direct Upload model without a Data Hub. These Tetra Agents use the following AWS endpoints:

| AWS Endpoint | Description | When Required |
| --- | --- | --- |
| [infrastructure name]-[environment]-datalake.s3.[region].amazonaws.com and [infrastructure name]-[environment]-backup.s3.[region].amazonaws.com (Self-hosting customers can find these bucket names in their S3 console. Tetra-hosted customers will receive these urls from TetraScience.) | Uploads files | When the Enable S3 Direct Upload option is selected |
| sqs.[region].amazonaws.com | Fetches the command message and then returns the command processing status | When the Enable Queue option is selected |
| logs.[region].amazonaws.com | Posts Agent Heart Beats and Agent logs | When the Enable S3 Direct Upload option is selected |
| monitoring.[region].amazonaws.com | Sends Metrics Data (such as CPU, Memory, and Disk Usage) | When the Enable S3 Direct Upload option is selected |
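
The agent endpoints above depend on which options are enabled, so it can help to generate the whitelist from the agent configuration. The following is a rough sketch under stated assumptions: the function name agent_endpoints and its parameters are hypothetical, and the bucket names follow the placeholder pattern shown above, which should be replaced with the real names from the S3 console or from TetraScience.

```python
def agent_endpoints(infrastructure, environment, region,
                    s3_direct_upload=True, queue=False):
    """List the AWS endpoints an agent needs for the selected options."""
    endpoints = []
    if s3_direct_upload:
        # File uploads, heartbeats and logs, and metrics all depend on
        # the Enable S3 Direct Upload option.
        endpoints += [
            f"{infrastructure}-{environment}-datalake.s3.{region}.amazonaws.com",
            f"{infrastructure}-{environment}-backup.s3.{region}.amazonaws.com",
            f"logs.{region}.amazonaws.com",
            f"monitoring.{region}.amazonaws.com",
        ]
    if queue:
        # Command messages are only exchanged when Enable Queue is selected.
        endpoints.append(f"sqs.{region}.amazonaws.com")
    return endpoints
```

For example, agent_endpoints("acme", "prod", "us-east-1", queue=True) returns the two bucket endpoints, the logs and monitoring endpoints, and the SQS endpoint for that region; "acme" and "prod" are made-up values.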