Deployment (Archived)
DEPRECATED PAGE
This page is deprecated. The new deployment pages are here:
- Overview (starting point for the deployment doc): https://developers.tetrascience.com/docs/single-tenant-overview
- Requirements: https://developers.tetrascience.com/docs/requirements-for-deploying-the-tetra-data-platform
- Parameters: https://developers.tetrascience.com/docs/deployment-parameters-single-tenant
- Deployment: https://developers.tetrascience.com/docs/deployment-single-tenant
- Post Deployment: https://developers.tetrascience.com/docs/post-deployment-single-tenant
- Deployment Common Issues: https://dash.readme.com/project/data-integration-platform/v1.0/docs/installation-troubleshooting-single-tenant
- Security: https://dash.readme.com/project/data-integration-platform/v1.0/docs/security-and-aws-iam
- Required AWS Services: https://dash.readme.com/project/data-integration-platform/v1.0/docs/required-aws-services
- VPC Endpoints: https://dash.readme.com/project/data-integration-platform/v1.0/docs/vpc-endpoints
Deployment
Deployment and related activities should be performed by an engineer with solid AWS knowledge and full administrator access to the destination AWS account and, if applicable, the Disaster Recovery AWS account. Permissions must include the ability to create and delete IAM roles and policies.
Planning for Deployment
Before performing the actual installation, single-tenant customers should consult with TetraScience and decide which application features and components will be in scope:
- Disaster Recovery (Y/N) - If enabled, data and backups are automatically replicated to a different AWS account and region.
- Existing VPC / Create VPC - Our stack can automatically create a new VPC and related networking items. If deploying into an existing VPC is desired, networking tasks like creating subnets and routing will be the customer's responsibility.
- Public / Private Endpoint - Will the application be exposed to the Internet or not?
- DNS entries - Should DNS entries be automatically created in AWS Route53 during deployment, or will the customer manage DNS separately?
- Webserver Certificate - Should an HTTPS certificate be created automatically during deployment, or will the customer supply one? Only 1024- and 2048-bit RSA certificates are supported.
- Enable Anylink service (Y/N)
- Enable Egnyte Integration (Y/N)
- Enable Box Integration (Y/N)
- Sizing - Based on the customer's usage estimates, TetraScience will advise on the values of the sizing parameters used at deployment.
- EKS worker nodes AMI - The default option, which we strongly recommend, is to use the AWS-provided EKS-optimized images. It is also possible to use an AMI supplied by the client, which must be fully compatible with the AWS-provided one. The client assumes all risks of running a custom image, which can cause instability in operation as well as unusual errors and delays during deployment.
CloudFormation Parameters
Below is the full list of parameters that must be entered at deployment time:
- Data Layer:
Parameter | Default Value | Details |
---|---|---|
CFTemplateBucket | ts-platform-artifacts | Prefix of the S3 bucket where artifacts are stored. Do not change default. |
CFTemplateVersion | | Must match the version of the ServiceCatalog product being installed |
InfrastructureName | | Customer specific. All encompassing name for the created infrastructure. Used as a root for naming. Validate with TetraScience. |
Environment | production | Used internally by TetraScience. Do not change default. |
IAMRolePrefix | | Optional string for prefixing all created IAM roles. Leave empty if not used. |
IAMBoundaryPolicy | | ARN for a boundary policy that will be attached to all created roles. Leave empty if not used. |
EnableDR | false | Set to true if Disaster Recovery should be implemented |
DRAWSAccountId | | ID of the AWS account used for Disaster Recovery. Leave empty if EnableDR is false. |
DRDatalakeKMSKey | | ARN of the KMS key used to encrypt data in DR. Leave empty if EnableDR is false. See the Disaster Recovery sections below if EnableDR is true. |
DRDatalakeBucket | | Name of the Datalake bucket for Disaster Recovery. Leave empty if EnableDR is false. See the Disaster Recovery sections below if EnableDR is true. |
DRStreamBucket | | Name of the Stream bucket for Disaster Recovery. Leave empty if EnableDR is false. See the Disaster Recovery sections below if EnableDR is true. |
DRBackupBucket | | Name of the Backup bucket for Disaster Recovery. Leave empty if EnableDR is false. See the Disaster Recovery sections below if EnableDR is true. |
DRLocalArtifactsBucket | | Name of the artifacts bucket used for Disaster Recovery. Leave empty if EnableDR is false. See the Disaster Recovery sections below if EnableDR is true. |
EnableElasticsearch | true | Do not change default. |
EnableLogging | false | Set to false. The parameter is deprecated and will be removed in the next release. |
EsMasterInstanceType | t3.medium.elasticsearch | EC2 instance type for Master ElasticSearch. Validate value with TetraScience. |
EsDatanodeInstanceType | m4.large.elasticsearch | EC2 instance type for DataStore ElasticSearch. Validate value with TetraScience. |
EsDatanodeInstanceCount | 2 | Number of EC2 instances in the cluster. Validate value with TetraScience. |
EsDatanodeVolumeSize | 100 | EBS Volume size in GB for Elasticsearch. Validate value with TetraScience. |
EsBackupInterval | 6 | How frequently (hours) to backup ElasticSearch to S3. |
InstanceTypeRDS | db.t2.medium | EC2 instance type for the Postgres database. Default value should be enough in most cases. |
RDSBackupInterval | 24 | How often to backup the database (in hours). |
RDSBackupSchedule | 0 1 * ? | Backup schedule in CloudWatch Events cron format. Default is 1 AM UTC every day. |
RDSBackupRetentionDays | 30 | Number of days to keep DB snapshots before deleting them. There is a limit of 100 snapshots per database. |
RDSSnapShot | | Leave empty for a standard install. To be used only when recovering from an actual disaster. |
CreateVPC | true | If true, it will create a new VPC for the application, together with subnets, security groups, NAT gateways. |
VpcCIDR | | Network block to use for the VPC. If CreateVPC is false, it should match the existing VPC to be used. For example 10.200.0.0/16. |
VPCID | | ID of the existing VPC. Leave empty if CreateVPC is true. |
PublicSubnetIds | | Comma-delimited list of subnet IDs. Leave empty if CreateVPC is true. |
PrivateSubnetIds | | Comma-delimited list of subnet IDs. Leave empty if CreateVPC is true. |
IsolatedSubnetIds | | Comma-delimited list of subnet IDs that will be used for Windows workers. Leave unchanged if CreateVPC is true. |
LogsEndpoint | | FQDN of endpoint used for Windows workers. Use a VPC endpoint if using isolated subnets for them. |
MonitoringEndpoint | | FQDN of endpoint used for Windows workers. Use a VPC endpoint if using isolated subnets for them. |
SqsEndpoint | | FQDN of endpoint used for Windows workers. Use a VPC endpoint if using isolated subnets for them. |
CloudformationEndpoint | | FQDN of endpoint used for Windows workers. Use a VPC endpoint if using isolated subnets for them. |
NotificationEmail | | Email address that will be subscribed to alerts via SNS. Should be a group email, so participants can easily be added or removed. |
SourceNotificationEmail | | Used in the "From" field of pipeline notification emails. Needs to be verified with SES. |
LogRetentionDays | 90 | Days for log retention in Cloudwatch |
LambdaPrefix | | Leave empty. Used internally by TetraScience. |
STBucket | | Leave empty in a normal installation. Used only for DR recovery. |
DLBucket | | Leave empty in a normal installation. Used only for DR recovery. |
- Service Layer:
Parameter | Default Value | Details |
---|---|---|
CFTemplateVersion | v1.0.0 | Must match the version of the ServiceCatalog product being installed |
Branch | master | ECR repo suffix. Do not change default. |
DataStack | | Name of the Data Layer main stack. Can be obtained from the CloudFormation interface. |
EnableLogging | false | Set to true if the ES Logging cluster in DataLayer was created. |
ClusterType | Fargate | Do not change default. |
InstanceTypeECS | t2.large | Legacy. No longer used. |
 | | Domain name used by the web UI. |
MinCapacity | | Minimum number of ECS containers for . Set to 0 if is not used. |
MaxCapacity | | Maximum number of ECS containers that can scale to, in case of load. Set to 0 if is not used. |
ConnectorMaxMemory | 2048 | Memory limit for docker containers running on the datahub machines. |
TaskThroughput | 20 | Number of files that can be processed in parallel. |
EnableWinTaskScriptService | true | Enable Windows EC2 based workers |
WindowsInstanceType | t3.medium | Instance type for Windows workers. |
PublicDomain | | Domain name used by the web UI. It does not have to be exposed on the Internet; it can be company internal. |
ExposedOnInternet | false | Set to true if the application should be accessible from Internet |
NoDNSWeb | false | Set to true if public DNS records are NOT to be created. |
PublicDomainZoneId | | Public domain Route53 Zone ID. If left empty, a public DNS hosted zone will be created, unless NoDNSWeb is set to true. |
Certificate | | ARN of a TLS/SSL certificate registered with ACM. See details in the Pre Deployment Tasks section. If empty, the deployment will try to automatically create a certificate via ACM and wait for DNS certificate validation, unless NoDNSWeb is set to true, in which case it will disable HTTPS and deploy using unencrypted HTTP. Certificate validation requires a value for PublicDomainZoneId with the zone containing NS entries for the domain. |
PrivateDomain | ts-dip.internal | Used for ECS inter-service communication. It can be changed to any name, but the default should work just fine. |
MinCapacity | 2 | Minimum number of ECS containers for . Set to 0 if is not used. |
MaxCapacity | 4 | Max number of ECS containers to scale out to, in case of heavy load. |
LambdaPrefix | | Leave empty. Used internally by TetraScience. |
AthenaCreateIamUser | false | Enables IAM user creation for Athena access at org creation. Leaving false will restrict service permissions so that IAM users cannot be created from the platform at runtime. |
UserAuditLogGroupSuffix | user-action-audit-log | Legacy. Do not change the default value. |
Service Parameters and Secrets in SSM
Containers running in ECS need runtime parameters. Some of these parameters contain sensitive data, such as OAuth tokens, so they are stored encrypted in AWS Systems Manager (SSM) Parameter Store. The parameters are not shared with TetraScience, so single-tenant customers have to create them following this procedure.
Parameter | Details | Needed only if |
---|---|---|
/tetrascience/production/ECS/ts-service-link-file/BOX_CLIENT_ID | Box OAuth 2.0 custom app Client ID. See below for details. | Box Integration is enabled |
/tetrascience/production/ECS/ts-service-web/INT_BOX_CLIENT_ID | Same value as above | Box Integration is enabled |
/tetrascience/production/ECS/ts-service-link-file/BOX_CLIENT_SECRET | Box OAuth 2.0 custom app Client Secret. | Box Integration is enabled |
/tetrascience/production/ECS/ts-service-web/INT_EGNYTE_CLIENT_ID | Egnyte Client ID | Egnyte Integration is enabled |
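For reference, a minimal sketch of creating such parameters with the AWS CLI, assuming the default production environment path from the table above (the values shown are placeholders):
# repeat for each parameter in the table, adjusting the name and value
aws ssm put-parameter --name "/tetrascience/production/ECS/ts-service-link-file/BOX_CLIENT_ID" --type SecureString --value "<box_client_id>"
aws ssm put-parameter --name "/tetrascience/production/ECS/ts-service-web/INT_BOX_CLIENT_ID" --type SecureString --value "<box_client_id>"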
Pre Deployment Tasks
- Deployer Privileges
Confirm that the user performing the deployment has full administrator rights, via the AWS managed IAM policy AdministratorAccess or equivalent. Anything less will likely cause the deployment to fail, requiring manual cleanup and causing lengthy delays. The application components run with minimal privileges; administrator access is required only for deployment and upgrade sessions.
- AWS CloudTrail
Confirm AWS CloudTrail is configured to save events in an S3 bucket.
- DHCP Options
Make sure the VPC's DHCP option set contains an entry domain-name-servers = AmazonProvidedDNS (only if the CreateVPC parameter in the Data Layer is set to false).
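The current option set can be inspected with the AWS CLI, for example (the VPC and option set IDs are placeholders):
aws ec2 describe-vpcs --vpc-ids <vpc-id> --query 'Vpcs[0].DhcpOptionsId' --output text
aws ec2 describe-dhcp-options --dhcp-options-ids <dopt-id> --query 'DhcpOptions[0].DhcpConfigurations'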
In case site policies dictate that client internal, non-AWS DNS servers must be used, a manual workaround can be applied:
a) Get the RDS endpoint from the Data Layer outputs and inject it into all ECS containers via the SSM Parameter Store and, ultimately, the POSTGRES_HOST environment variable.
b) Deploy the Service Layer following the normal procedure.
c) Create a zone in the client's DNS for ts-dip.internal and delegate authority for the zone to the AWS DNS servers; check Route 53 to get the server names.
- VPC and networking infrastructure (only if the CreateVPC parameter in the Data Layer is set to false):
The deployment VPC needs to provide for the platform's exclusive use:
at least 2 (preferably 3) private /24 or larger subnets in different AZs
at least 2 (preferably 3) public /28 or larger subnets in different AZs (used only for NAT Gateways and not required if Internet traffic will flow via the corporate network)
All AWS service endpoints must be reachable from all the VPC subnets; VPC endpoints may be required. The AWS account should have at least 3 available Elastic IP addresses if the platform will be accessed from the Internet.
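As a quick sanity check, the number of Elastic IP addresses already allocated in the Region can be counted with the AWS CLI; assuming the default quota of 5 addresses per Region (verify your account's actual quota), two or fewer allocated addresses leave at least three available:
aws ec2 describe-addresses --region <Region> --query 'length(Addresses)'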
- Log Policy
The Elasticsearch application logs need to be sent to CloudWatch. To allow this, the following AWS CLI command has to be run as an administrator, against the deployment AWS account and Region:
aws logs put-resource-policy --policy-name es2cloudwatch --policy-document '{ "Version": "2012-10-17", "Statement": [{ "Sid": "eslogs", "Effect": "Allow", "Principal": { "Service": "es.amazonaws.com"}, "Action":[ "logs:PutLogEvents","logs:PutLogEventsBatch","logs:CreateLogStream"],"Resource": "arn:aws:logs:*:*:*:*"}]}' --region <Region>
- AWS Service-Linked Roles
Service-linked IAM roles must be created, if not already present in the destination AWS account.
Below are the CLI commands required:
- ECS Service:
aws iam create-service-linked-role --aws-service-name ecs.amazonaws.com
- ElasticSearch Service:
aws iam create-service-linked-role --aws-service-name es.amazonaws.com
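To check whether the roles already exist, a query such as the following can be used; the role names shown are the standard AWS service-linked role names and should be verified in your account. A NoSuchEntity error means the role still needs to be created:
aws iam get-role --role-name AWSServiceRoleForECS
aws iam get-role --role-name AWSServiceRoleForAmazonElasticsearchService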
- EC2 KeyPair
The keypair will be used to allow admin access to EKS worker nodes. Follow the AWS documentation.
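If creating the keypair from the CLI, a minimal sketch (the keypair name is a placeholder):
aws ec2 create-key-pair --key-name <keypair-name> --query 'KeyMaterial' --output text > <keypair-name>.pem
chmod 400 <keypair-name>.pem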
- TLS Web Certificate
The application can generate its own certificate using AWS ACM, but that requires the DNS domain it will use to be hosted in Route53 in the destination AWS account. If that is not the case, or if automatic generation is not wanted, the customer must obtain or self-generate a TLS RSA certificate with a 1024- or 2048-bit key and import it into AWS ACM using this procedure. The certificate should cover both the future domain and its api. subdomain. The certificate ARN will be used as input for the Service Layer.
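An import via the CLI would look roughly like this (the file names are placeholders); the CertificateArn returned by the command is the value to supply as the Certificate parameter:
aws acm import-certificate --certificate fileb://certificate.pem --private-key fileb://private-key.pem --certificate-chain fileb://certificate-chain.pem --region <Region>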
- Configure AWS SES (Simple Email Service)
The platform uses AWS SES to send out notification emails, such as pipeline result status. The sender email address needs to be a valid email address that is verified with SES using this procedure. Also, a support ticket needs to be raised with AWS to take SES out of Sandbox mode, as documented here.
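Verification can also be started from the CLI (the address shown is a placeholder for the SourceNotificationEmail value); AWS then sends a confirmation link to that mailbox:
aws ses verify-email-identity --email-address <[email protected]> --region <Region>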
- Disaster Recovery - PreInstall Component (optional)
The optional Disaster Recovery component requires another AWS account (the DR account) besides the main account where the product will be installed. A small CloudFormation stack (dr.yml) provided by TetraScience has to be deployed in the DR account, in the AWS DR Region, which should be different from the main Region. The stack requires these parameters:
Parameter | Default Value | Details |
---|---|---|
InfrastructureName | | Customer specific. All encompassing name for the created infrastructure. Used as a root for naming. Validate with TetraScience. Same value has to be used in the main product. |
Environment | dr | Do not change |
ProdAWSAccountId | | AWS account number where the main product will be installed |
After deployment, the stack will generate three output values that will be used as parameters for the Data Layer.
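The outputs can be read back later from the DR account with, for example (the stack name is a placeholder):
aws cloudformation describe-stacks --stack-name <dr-stack-name> --region <DR-Region> --query 'Stacks[0].Outputs'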
- Box.com integration (optional)
- Login to your Box account
- From the left side menu choose "Dev Console"
- Click "Create New App", choose "Custom App" and click "Next"
- Select "Standard Oauth 2.0 (User Authentication)"
- Choose an appropriate name for the new app and click "Create App"
- Click on "View Your App"
- From "OAuth 2.0 Credentials" copy "Client ID" and "Client Secret" values
- Send Details to TetraScience
TetraScience needs to receive the following data before sharing the ServiceCatalog product:
- AWS Account ID where the product will be installed
- AWS Region where the product will be installed
- IAM username or role of the administrator who will perform the installation
The above should be sent for each environment, if the client requires multiple installations (test and prod, for instance). From a technical point of view, TetraScience will treat each of these installations as a separate production client.
- Import AWS Service Catalog Portfolio
Log into the AWS account and region where the deployment will be performed, as the administrator who will perform the installation. Navigate to Administration and, under Portfolios, select the Imported tab. From Actions, select Import portfolio and enter the code received from TetraScience. From the portfolio list, select the recently imported portfolio and then the Users, Groups, and Roles tab. Add to the list the IAM account of the admin user previously shared with TetraScience.
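The console flow above is the documented path; if a CLI equivalent is preferred, a sketch along these lines should achieve the same result (the portfolio ID, account ID, and user name are placeholders):
# accept the portfolio shared by TetraScience
aws servicecatalog accept-portfolio-share --portfolio-id <portfolio-id>
# grant the deploying admin access to the portfolio
aws servicecatalog associate-principal-with-portfolio --portfolio-id <portfolio-id> --principal-type IAM --principal-arn arn:aws:iam::<account-id>:user/<admin-user>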
Performing the Deployment
- Data Layer
From the AWS Service Catalog web interface, select the Data Layer product from the Products list. Select Launch product and choose the latest available version. Enter a suitable name and click Next. Fill in the parameters, consulting the table above. Keep clicking Next until you reach the Review stage. Double check the parameter values and, if satisfied, click Launch. The deployment has started; it takes around two and up to three hours, depending on the parameters and AWS backend load.
- Service Layer
Service Layer can be installed only after a successful Data Layer installation. The procedure is similar to the one for Data Layer.
Post Deployment Tasks
- Alert Email Subscription Confirmation
Alert emails will be sent via AWS SNS to the address configured during the deployment of the Data Layer. SNS requires the subscription to be confirmed and sends an email with the subject "AWS Notification - Subscription Confirmation". The link in that email must be clicked in order for notifications to work.
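Subscriptions still waiting for confirmation can be listed with the AWS CLI; pending entries show PendingConfirmation instead of a subscription ARN:
aws sns list-subscriptions --query "Subscriptions[?SubscriptionArn=='PendingConfirmation']"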
- Disaster Recovery for Database
If Disaster Recovery is in scope, another small CloudFormation stack named snapshots_tool_rds_dest.json must be installed in the DR AWS account, in the same AWS Region as the main deployment. The stack takes the following parameters:
Parameter | Default | Details |
---|---|---|
CodeBucket | DEFAULT_BUCKET | Do not change. Where to get lambda code from. |
CrossAccountCopy | TRUE | Do not change. |
DeleteOldSnapshots | TRUE | No reason to keep snapshots in this region, since they are stored and managed in the DR Region. |
DestinationRegion | | Disaster Recovery AWS Region. For instance us-east-2. |
KmsKeyDestination | | ARN of the KMS key in the destination DR Region. Enter the value of the DRRDSKMSKey output of the DR stack installed during pre deployment. |
KMSKeySource | | ARN of the KMS key in the main AWS account and Region used to encrypt RDS snapshots. Can be obtained from the AWS KMS interface; the key alias is ts-rds-production. |
LambdaCWLogRetention | 7 | Number of days to retain lambda function logs in CloudWatch |
LogLevel | INFO | Log verbosity for functions |
RetentionDays | 7 | How many days to keep a snapshot |
SnapshotPattern | ts-platform.* | What snapshots to include. Do not change. |
SourceRegionOverride | NO | Do not change. |
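As a sketch, the stack could be launched from the CLI using credentials for the DR account, targeting the main deployment's Region (the stack name is a placeholder, and the CAPABILITY_IAM flag is an assumption that applies only if the template creates IAM resources):
aws cloudformation create-stack --stack-name <rds-snapshots-dest> --template-body file://snapshots_tool_rds_dest.json --capabilities CAPABILITY_IAM --region <main-Region> --parameters ParameterKey=DestinationRegion,ParameterValue=<DR-Region> ParameterKey=KmsKeyDestination,ParameterValue=<kms-key-arn> ParameterKey=KMSKeySource,ParameterValue=<kms-key-arn>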
- EKS Endpoint Access Control
The AWS EKS endpoint is exposed to the Internet by default, posing a security risk. To mitigate this, the EKS cluster endpoint can be configured to work in Private mode, using this procedure. It is currently not possible to make the endpoint private from within CloudFormation templates. Once the option is made available by AWS, TetraScience will include it in the product and this manual step will no longer be required.
- Elasticsearch HTTPS Enforcement
Elasticsearch, by default, also allows plain HTTP connections. To allow only HTTPS, run the following command from a terminal:
aws es update-elasticsearch-domain-config --domain-name <domain_name> --domain-endpoint-options EnforceHTTPS=true
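The setting can then be verified with a command such as the following (the query path assumes the standard DescribeElasticsearchDomainConfig response shape):
aws es describe-elasticsearch-domain-config --domain-name <domain_name> --query 'DomainConfig.DomainEndpointOptions.Options.EnforceHTTPS'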
- Disabling [email protected] user
The [email protected] user is created by default and has access to all organizations in the setup.
You may want to disable this user due to security concerns. To do this:
- Login with [email protected]
- Switch to the TetraScience org
- Find [email protected] in the list of users
- Disable the user
- You will be logged out automatically and won't be able to log back in with that user
The operation is irreversible from the portal. The only way to re-enable [email protected] is to update the user's status directly in the database.
- Generate secure credentials for organization
Each organization uses unique IAM roles, policies, and KMS keys. On a fresh installation, or when a new organization is created, these components need to be generated.
- Go to Accounts > Manage Organizations.
- On the TetraScience (current) org, click the AWS button.
- Use S3 gateway VPC endpoint
We highly recommend enabling an S3 Gateway VPC endpoint, which reduces the data transfer cost between the VPC and S3. Instructions.
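A gateway endpoint can be created from the CLI roughly as follows (the VPC and route table IDs are placeholders; the endpoint must be associated with the route tables of the private subnets):
aws ec2 create-vpc-endpoint --vpc-id <vpc-id> --vpc-endpoint-type Gateway --service-name com.amazonaws.<Region>.s3 --route-table-ids <rtb-id-1> <rtb-id-2>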
Single Sign-on (Optional)
To enable SSO for a deployment, set up an AWS Cognito user pool and connect it to your identity provider.
Setting up Cognito
When setting up your Cognito user pool ensure the following:
- Go to General Settings > Attributes. Ensure email, given name, and family name are checked. Add a custom attribute named groups with the max length set to 2048 and mutable checked.
- Go to General Settings > App clients. Click Show details, then Set attribute read and write permissions, and under readable attributes check email, family name, given name, and custom:groups.
- Go to App integration > App client settings. Set the callback and logout URLs to the deployment domain: set {domain}/login/sso as the callback URL and {domain}/logout as the logout URL. Under Allowed OAuth Scopes, check email, openid, and profile.
- Go to Federation > Attribute mapping. Map your identity provider's group membership attribute to your custom:groups attribute. Also map the given name and family name attributes.
Setting up the platform
Gather the following variables and set them in AWS Systems Manager Parameter Store.
Attribute name | Where to find it | Param store location |
---|---|---|
SSO_DOMAIN | Cognito > App Integration > Domain name | /tetrascience/{environment}/ECS/ts-service-user-org/SSO_DOMAIN /tetrascience/{environment}/ECS/ts-service-web/SSO_DOMAIN |
SSO_CLIENT_ID | Cognito > General Settings > App Clients > App client Id | /tetrascience/{environment}/ECS/ts-service-user-org/SSO_CLIENT_ID /tetrascience/{environment}/ECS/ts-service-web/SSO_CLIENT_ID |
SSO_REDIRECT_URI | Cognito > App Integration > App Client Settings > Callback URL | /tetrascience/{environment}/ECS/ts-service-user-org/SSO_REDIRECT_URI |
SSO_CLIENT_SECRET | Cognito > General Settings > App Clients > App client Secret | /tetrascience/{environment}/ECS/ts-service-user-org/SSO_CLIENT_SECRET Set as SecureString |
SSO_GROUPS_ATTRIBUTE | Cognito > General Settings > Attributes > Custom Attributes | /tetrascience/{environment}/ECS/ts-service-user-org/SSO_GROUPS_ATTRIBUTE |
After setting the variables in Parameter Store, restart ts-service-user-org and ts-service-web.
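For reference, setting one of the parameters and forcing the two services to restart could look like this from the CLI; the cluster name is a placeholder, and the assumption that the ECS service names match ts-service-user-org and ts-service-web should be verified against your deployment:
aws ssm put-parameter --name "/tetrascience/production/ECS/ts-service-user-org/SSO_CLIENT_SECRET" --type SecureString --value "<client-secret>" --overwrite
aws ecs update-service --cluster <cluster-name> --service ts-service-user-org --force-new-deployment
aws ecs update-service --cluster <cluster-name> --service ts-service-web --force-new-deployment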
SSO login will become available at {domain}/login/sso.
Setting up your Organization
Login as an Organization or System Admin account. Navigate to Account > Organization.
Click the Single sign-on button. In the modal, fill in the identity provider group membership for each org role (admin, member, readonly). For example, if all users who belong to a group named "admin group" should have the org admin role, enter the value "admin group" in the input box under "admin". Save.
Repeat for each role you wish to map to an SSO identity provider group.