Deployment (Archived)

DEPRECATED PAGE

This page is deprecated. The new deployment pages are here:

Deployment

Deployment and related activities should be performed by an engineer with good AWS knowledge and full administrator access to the destination AWS account and, if applicable, the Disaster Recovery AWS account. Permissions should include creating and deleting IAM roles and policies.

Planning for Deployment

Before performing the actual installation, single-tenant customers should consult with TetraScience and decide which application features and components will be in scope:

  • Disaster Recovery (Y/N) - If enabled, data and backups are automatically replicated to a different AWS account and region.
  • Existing VPC / Create VPC - Our stack can automatically create a new VPC and related networking items. If deploying into an existing VPC is desired, networking tasks like creating subnets and routing will be the customer's responsibility.
  • Public / Private Endpoint - Will the application be exposed to the Internet or not?
  • DNS entries - Should DNS entries be automatically created in AWS Route53 during deployment, or will the customer manage DNS separately?
  • Webserver Certificate - Should an HTTPS certificate be created automatically during deployment, or will the customer supply one? Only 1024-bit and 2048-bit RSA certificates are supported.
  • Enable Anylink service (Y/N)
  • Enable Egnyte Integration (Y/N)
  • Enable Box Integration (Y/N)
  • Sizing - Based on the customer's usage estimates, TetraScience will advise on the values of the sizing parameters used at deployment.
  • EKS worker nodes AMI - The default option, which we strongly recommend, is to use the AWS-provided EKS-optimized images. It is also possible to use an AMI provided by the client, but it must be 100% compatible with the AWS-provided one. The client assumes all risks of running a custom image, which can cause unstable operation as well as unusual errors and delays in deployment.

CloudFormation Parameters

Below is the full list of parameters that have to be entered at deployment time:

  • Data Layer:

| Parameter | Default Value | Details |
| --- | --- | --- |
| CFTemplateBucket | ts-platform-artifacts | Prefix of the S3 bucket where artifacts are stored. Do not change default. |
| CFTemplateVersion | | Must match the version of the ServiceCatalog product being installed. |
| InfrastructureName | | Customer specific. All-encompassing name for the created infrastructure. Used as a root for naming. Validate with TetraScience. |
| Environment | production | Used internally by TetraScience. Do not change default. |
| IAMRolePrefix | | Optional string for prefixing all created IAM roles. Leave empty if not used. |
| IAMBoundaryPolicy | | ARN for a boundary policy that will be attached to all created roles. Leave empty if not used. |
| EnableDR | false | Set to true if Disaster Recovery should be implemented. |
| DRAWSAccountId | | ID of the AWS account used for Disaster Recovery. Leave empty if EnableDR is false. |
| DRDatalakeKMSKey | | ARN of the KMS key used to encrypt data in DR. Leave empty if EnableDR is false. See the Disaster Recovery section below if EnableDR is true. |
| DRDatalakeBucket | | Name of the Datalake bucket for Disaster Recovery. Leave empty if EnableDR is false. See the Disaster Recovery section below if EnableDR is true. |
| DRStreamBucket | | Name of the Stream bucket for Disaster Recovery. Leave empty if EnableDR is false. See the Disaster Recovery section below if EnableDR is true. |
| DRBackupBucket | | Name of the Backup bucket for Disaster Recovery. Leave empty if EnableDR is false. See the Disaster Recovery section below if EnableDR is true. |
| DRLocalArtifactsBucket | | Name of the artifacts bucket used for Disaster Recovery. Leave empty if EnableDR is false. See the Disaster Recovery section below if EnableDR is true. |
| EnableElasticsearch | true | Do not change default. |
| EnableLogging | false | Set to false. The parameter is deprecated and will be removed in the next release. |
| EsMasterInstanceType | t3.medium.elasticsearch | EC2 instance type for Master Elasticsearch. Validate value with TetraScience. |
| EsDatanodeInstanceType | m4.large.elasticsearch | EC2 instance type for DataStore Elasticsearch. Validate value with TetraScience. |
| EsDatanodeInstanceCount | 2 | Number of EC2 instances in the cluster. Validate value with TetraScience. |
| EsDatanodeVolumeSize | 100 | EBS volume size in GB for Elasticsearch. Validate value with TetraScience. |
| EsBackupInterval | 6 | How frequently (in hours) to back up Elasticsearch to S3. |
| InstanceTypeRDS | db.t2.medium | EC2 instance type for the Postgres database. Default value should be enough in most cases. |
| RDSBackupInterval | 24 | How often to back up the database (in hours). |
| RDSBackupSchedule | 0 1 * * ? * | Backup schedule in CloudWatch Events cron format. Default is 1 AM UTC every day. |
| RDSBackupRetentionDays | 30 | Number of days to keep DB snapshots before deleting them. There is a limit of 100 snapshots per database. |
| RDSSnapShot | | Leave empty for a standard install. To be used only when recovering from an actual disaster. |
| CreateVPC | true | If true, a new VPC is created for the application, together with subnets, security groups, and NAT gateways. |
| VpcCIDR | | Network block to use for the VPC. If CreateVPC is false, it should match the existing VPC to be used. For example, 10.200.0.0/16. |
| VPCID | | ID of the existing VPC. Leave empty if CreateVPC is true. |
| PublicSubnetIds | | Comma-delimited list of subnet IDs. Leave empty if CreateVPC is true. |
| PrivateSubnetIds | | Comma-delimited list of subnet IDs. Leave empty if CreateVPC is true. |
| IsolatedSubnetIds | | Comma-delimited list of subnet IDs that will be used for Windows workers. Leave unchanged if CreateVPC is true. |
| LogsEndpoint | | FQDN of the endpoint used for Windows workers. Use a VPC endpoint if using isolated subnets for them. |
| MonitoringEndpoint | | FQDN of the endpoint used for Windows workers. Use a VPC endpoint if using isolated subnets for them. |
| SqsEndpoint | | FQDN of the endpoint used for Windows workers. Use a VPC endpoint if using isolated subnets for them. |
| CloudformationEndpoint | | FQDN of the endpoint used for Windows workers. Use a VPC endpoint if using isolated subnets for them. |
| NotificationEmail | | Email address that will be subscribed to alerts via SNS. Should be a group email, so that participants can easily be added or removed. |
| SourceNotificationEmail | | Used in the "From" field of pipeline notification emails. Needs to be verified with SES. |
| LogRetentionDays | 90 | Days for log retention in CloudWatch. |
| LambdaPrefix | | Leave empty. Used internally by TetraScience. |
| STBucket | | Leave empty in a normal installation. Used only for DR recovery. |
| DLBucket | | Leave empty in a normal installation. Used only for DR recovery. |

  • Service Layer:

| Parameter | Default Value | Details |
| --- | --- | --- |
| CFTemplateVersion | v1.0.0 | Must match the version of the ServiceCatalog product being installed. |
| Branch | master | ECR repo suffix. Do not change default. |
| DataStack | | Name of the Data Layer main stack. Can be obtained from the CloudFormation interface. |
| EnableLogging | false | Set to true if the ES Logging cluster in DataLayer was created. |
| ClusterType | Fargate | Do not change default. |
| InstanceTypeECS | t2.large | Legacy. No longer used. |
| MinCapacity | | Minimum number of ECS containers for . Set to 0 if is not used. |
| MaxCapacity | | Maximum number of ECS containers that can scale to, in case of load. Set to 0 if is not used. |
| ConnectorMaxMemory | 2048 | Memory limit for Docker containers running on the datahub machines. |
| TaskThroughput | 20 | Number of files that can be processed in parallel. |
| EnableWinTaskScriptService | true | Enable Windows EC2-based workers. |
| WindowsInstanceType | t3.medium | Instance type for Windows workers. |
| PublicDomain | | Domain name used by the web UI. It does not have to be exposed on the Internet; it can be company internal. |
| ExposedOnInternet | false | Set to true if the application should be accessible from the Internet. |
| NoDNSWeb | false | Set to true if public DNS records are NOT to be created. |
| PublicDomainZoneId | | Public Domain Route53 Zone Id. If left empty, a public DNS hosted zone will be created, unless NoDNSWeb is set to true. |
| Certificate | | ARN of the TLS/SSL certificate registered with ACM. See details in the Pre Deployment Tasks section. If empty, the deployment tries to create a certificate automatically via ACM and waits for DNS certificate validation, unless NoDNSWeb is set to true, in which case HTTPS is disabled and the deployment uses unencrypted HTTP. Certificate validation requires a value for PublicDomainZoneId, with the zone containing NS entries for the domain. |
| PrivateDomain | ts-dip.internal | Used for ECS inter-service communication. It can be changed to any name, but the default should work in most cases. |
| MinCapacity | 2 | Minimum number of ECS containers for . Set to 0 if is not used. |
| MaxCapacity | 4 | Maximum number of ECS containers to scale out to, in case of heavy load. |
| LambdaPrefix | | Leave empty. Used internally by TetraScience. |
| AthenaCreateIamUser | false | Enables IAM user creation for Athena access at org creation. Leaving this false restricts service permissions so that IAM users cannot be created from the platform at runtime. |
| UserAuditLogGroupSuffix | user-action-audit-log | Legacy. Do not change the default value. |

Service Parameters and Secrets in SSM

Containers running in ECS need runtime parameters. These parameters may contain sensitive data, such as OAuth tokens, so they are stored encrypted, using a specialized AWS service for secrets management, SSM Parameter Store. The parameters are not shared with TetraScience, so single-tenant customers will have to create them following this procedure.

| Parameter | Details | Needed only if |
| --- | --- | --- |
| /tetrascience/production/ECS/ts-service-link-file/BOX_CLIENT_ID | Box OAuth 2.0 custom app Client ID. See below for details. | Box Integration is enabled |
| /tetrascience/production/ECS/ts-service-web/INT_BOX_CLIENT_ID | Same value as above. | Box Integration is enabled |
| /tetrascience/uat/ECS/ts-service-link-file/BOX_CLIENT_SECRET | Box OAuth 2.0 custom app secret. | Box Integration is enabled |
| /tetrascience/uat/ECS/ts-service-web/INT_EGNYTE_CLIENT_ID | Egnyte Client ID. | Egnyte Integration is enabled |
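As a rough illustration of creating one of these parameters from the command line, the sketch below wraps `aws ssm put-parameter` in a small helper. The function name `put_service_secret` and the example value are hypothetical, and the path layout is only inferred from the parameter names listed above; follow the procedure referenced in this section for the authoritative steps.

```shell
# Sketch: store one service parameter in SSM Parameter Store as an
# encrypted SecureString. The helper name is hypothetical.
# Arguments: $1 environment, $2 service name, $3 parameter name, $4 value.
put_service_secret() {
  aws ssm put-parameter \
    --name "/tetrascience/$1/ECS/$2/$3" \
    --value "$4" \
    --type SecureString \
    --overwrite
}

# Example call (placeholder value, not a real credential):
# put_service_secret production ts-service-link-file BOX_CLIENT_ID "my-box-client-id"
```

Keeping the path construction in one place makes it harder to mistype the environment or service segment when several parameters have to be created.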

Pre Deployment Tasks

  1. Deployer Privileges
    Confirm the user performing the deployment has Full Administrator Rights, via the AWS managed IAM policy AdministratorAccess or equivalent. Anything less will likely cause the deployment to fail, requiring manual cleanup and causing lengthy delays. The application components will run with minimal privileges and administrator access is required only for deployment and upgrade sessions.
  2. AWS CloudTrail
    Confirm AWS CloudTrail is configured to save events in an S3 bucket.
  3. DHCP Options
    Make sure the VPC's DHCP options set contains the entry domain-name-servers = AmazonProvidedDNS (only if the CreateVPC parameter in the Data Layer is set to false).
    If site policies dictate that the client's internal, non-AWS DNS servers must be used, a manual workaround can be applied:
    a) Get the RDS endpoint from the Data Layer outputs and inject it into all ECS containers via SSM Parameter Store and ultimately the POSTGRES_HOST environment variable.
    b) Deploy the Service Layer following the normal procedure.
    c) Create a zone in the client's DNS for ts-dip.internal and delegate authority for the zone to the AWS DNS servers; check Route 53 for the server names.
  4. VPC and networking infrastructure (only if the CreateVPC parameter in the Data Layer is set to false)
    The deployment VPC needs to provide, for the platform's exclusive use:
    • at least 2 (preferably 3) private subnets of size /24 or larger, in different AZs
    • at least 2 (preferably 3) public subnets of size /28 or larger, in different AZs (used only for NAT Gateways and not required if Internet traffic will flow via the corporate network)
    All AWS Service Endpoints must be reachable from all the VPC subnets; VPC Endpoints may be required. The AWS account should have at least 3 available Elastic IP addresses if the platform will be accessed from the Internet.
  5. Log Policy
    The Elasticsearch application logs need to be sent to CloudWatch. To allow this, run the following AWS CLI command as an administrator against the deployment AWS account and Region:
aws logs put-resource-policy --policy-name es2cloudwatch --policy-document '{ "Version": "2012-10-17", "Statement": [{ "Sid": "eslogs", "Effect": "Allow", "Principal": { "Service": "es.amazonaws.com" }, "Action": [ "logs:PutLogEvents", "logs:PutLogEventsBatch", "logs:CreateLogStream" ], "Resource": "arn:aws:logs:*:*:*:*" }] }' --region <Region>
  6. AWS Service-Linked Roles
    Service-linked IAM roles must be created if they are not already present in the destination AWS account.
    The required CLI commands are:
  • ECS Service:
aws iam create-service-linked-role --aws-service-name ecs.amazonaws.com
  • Elasticsearch Service:
aws iam create-service-linked-role --aws-service-name es.amazonaws.com
  7. EC2 KeyPair
    The keypair will be used to allow admin access to EKS worker nodes. Follow the AWS Documentation.
  8. TLS Web Certificate
    The application can generate its own certificate using AWS ACM, but that requires the DNS domain it will use to be hosted in Route53 in the destination AWS account. If that is not the case, or if automatic generation is not wanted, the customer must obtain or self-generate a TLS RSA certificate with a 1024-bit or 2048-bit key and import it into AWS ACM using this procedure. The certificate should cover both the future domain and its api. subdomain. The certificate ARN will be used as input for the Service Layer.
  9. Configure AWS SES (Simple Email Service)
    The platform uses AWS SES to send notification emails, such as pipeline result status. The sender email address must be a valid address that is verified with SES using this procedure. Also, a support ticket needs to be raised with AWS to take SES out of Sandbox mode, as documented here.
  10. Disaster Recovery - PreInstall Component (optional)
    The optional Disaster Recovery component requires another AWS account (the DR account) besides the main account where the product will be installed. A small CloudFormation stack (dr.yml) provided by TetraScience has to be deployed under the DR account in the AWS DR Region, which should be different from the main region. The stack requires these parameters:

| Parameter | Default Value | Details |
| --- | --- | --- |
| InfrastructureName | | Customer specific. All-encompassing name for the created infrastructure. Used as a root for naming. Validate with TetraScience. The same value has to be used in the main product. |
| Environment | dr | Do not change. |
| ProdAWSAccountId | | AWS account number where the main product will be installed. |

After deployment the stack will generate 3 output values which will be used as parameters for the Data Layer.

  11. Box.com integration (optional)
  • Log in to your Box account
  • From the left side menu choose "Dev Console"
  • Click "Create New App", choose "Custom App" and click "Next"
  • Select "Standard OAuth 2.0 (User Authentication)"
  • Choose an appropriate name for the new app and click "Create App"
  • Click on "View Your App"
  • From "OAuth 2.0 Credentials" copy "Client ID" and "Client Secret" values
  12. Send Details to TetraScience
    TetraScience needs to receive the following data before sharing the ServiceCatalog product:
  • AWS Account ID where the product will be installed
  • AWS Region where the product will be installed
  • IAM username or role of the administrator who will perform the installation
    The above should be sent for each environment, if the client requires multiple installations (test and prod, for instance). From a technical point of view, TetraScience will treat each of these installations as a separate production client.
  13. Import AWS Service Catalog Portfolio
    Log in to the AWS account and region where the deployment will be performed, as the administrator who will perform the installation. Navigate to Administration and, under Portfolios, select the Imported tab. From Actions, select Import portfolio and enter the code received from TetraScience. From the portfolio list, select the recently imported portfolio and then the Users, Groups, and Roles tab. Add to the list the IAM account of the admin user previously shared with TetraScience.
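When deploying into an existing VPC, the subnet size requirements from the VPC and networking task above can be checked before launch. The following is a minimal sketch, not part of the TetraScience tooling: the function names are hypothetical, and it assumes the AWS CLI is configured for the deployment account and Region.

```shell
# Sketch: succeed when a CIDR block is at least as large as required,
# i.e. its prefix length is less than or equal to the maximum in $2.
cidr_is_large_enough() {
  prefix=${1##*/}   # text after the last "/", e.g. 24 for 10.200.1.0/24
  [ "$prefix" -le "$2" ]
}

# Look up each private subnet passed as an argument and compare it
# against the /24 minimum stated for private subnets.
check_private_subnets() {
  for subnet_id in "$@"; do
    cidr=$(aws ec2 describe-subnets --subnet-ids "$subnet_id" \
      --query 'Subnets[0].CidrBlock' --output text)
    if cidr_is_large_enough "$cidr" 24; then
      echo "$subnet_id $cidr OK"
    else
      echo "$subnet_id $cidr TOO SMALL"
    fi
  done
}

# Example call with placeholder subnet IDs:
# check_private_subnets subnet-0aaa subnet-0bbb subnet-0ccc
```

The same helper can be reused for public subnets by passing 28 as the second argument to `cidr_is_large_enough`.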

Performing the Deployment

  1. Data Layer
    From the AWS Service Catalog web interface, select the Data Layer product from the Products list. Select Launch product and choose the latest version from the list of available ones. Enter a suitable name and click Next. Fill in the parameters, consulting the table above. Keep clicking Next until you reach the Review stage. Double-check the parameter values and, if satisfied, click Launch to start the deployment. It takes between two and three hours, depending on the parameters and AWS backend load.
  2. Service Layer
    Service Layer can be installed only after a successful Data Layer installation. The procedure is similar to the one for Data Layer.
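Because the Service Layer depends on a successful Data Layer installation, it can help to confirm the Data Layer stack status before starting. The sketch below is one possible check using `aws cloudformation describe-stacks`. The function names are hypothetical, and the stack name must be the Data Layer main stack name taken from the CloudFormation interface.

```shell
# Sketch: print the current status of a CloudFormation stack.
stack_status() {
  aws cloudformation describe-stacks --stack-name "$1" \
    --query 'Stacks[0].StackStatus' --output text
}

# Succeed only when the stack finished creating or updating successfully.
data_layer_ready() {
  case "$(stack_status "$1")" in
    CREATE_COMPLETE|UPDATE_COMPLETE) return 0 ;;
    *) return 1 ;;
  esac
}

# Example call with a placeholder stack name:
# data_layer_ready my-data-layer-stack && echo "Data Layer is ready"
```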

Post Deployment Tasks

  1. Alert Email Subscription Confirmation
    Alert emails will be sent via AWS SNS to the address configured during the deployment of the Data Layer. SNS requires the subscription to be confirmed and sends an email with the subject "AWS Notification - Subscription Confirmation". The link in that email must be clicked for notifications to work.
  2. Disaster Recovery for Database
    If Disaster Recovery is in scope, another small CloudFormation stack named snapshots_tool_rds_dest.json must be installed in the DR AWS account, in the same AWS Region as the main deployment. The stack takes the following parameters:

| Parameter | Default | Details |
| --- | --- | --- |
| CodeBucket | DEFAULT_BUCKET | Do not change. Where to get Lambda code from. |
| CrossAccountCopy | TRUE | Do not change. |
| DeleteOldSnapshots | TRUE | No reason to keep snapshots in this region, since they are stored and managed in the DR Region. |
| DestinationRegion | | Disaster Recovery AWS region. For instance, us-east-2. |
| KmsKeyDestination | | ARN of the KMS key in the destination DR region. Enter the value of the DRRDSKMSKey output of the DR stack installed during pre-deployment. |
| KMSKeySource | | ARN of the KMS key in the main AWS account and region used to encrypt RDS snapshots. Can be obtained from the AWS KMS interface; the key alias is ts-rds-production. |
| LambdaCWLogRetention | 7 | Number of days to retain Lambda function logs in CloudWatch. |
| LogLevel | INFO | Log verbosity for functions. |
| RetentionDays | 7 | How many days to keep a snapshot. |
| SnapshotPattern | ts-platform.* | Which snapshots to include. Do not change. |
| SourceRegionOverride | NO | Do not change. |

  3. EKS Endpoint Access Control
    The AWS EKS endpoint is by default exposed to the Internet, posing a security risk. To mitigate this, the EKS cluster endpoint can be configured to work in Private mode, using this procedure. It is currently not possible to make the endpoint private from within CloudFormation templates. Once the option is made available by AWS, TetraScience will include it in the product and this manual step will no longer be required.

  4. Elasticsearch HTTPS Enforcement
    By default, Elasticsearch also allows plain HTTP connections. To allow only HTTPS, run the following command from a terminal:

aws es update-elasticsearch-domain-config --domain-name <domain_name> --domain-endpoint-options EnforceHTTPS=true
  5. Disabling the [email protected] user
    The [email protected] user is created by default and has access to all organizations in the setup.
    You may want to disable this user due to security concerns. To do this:
  • Log in as [email protected]
  • Switch to the TetraScience org
  • Find [email protected] in the list of users
  • Disable the user
  • You will be logged out automatically and will no longer be able to log in as that user

    This operation is irreversible from the portal. The only way to re-enable [email protected] is to update the user status directly in the database.

  6. Generate secure credentials for organization
    Each organization uses unique IAM roles, policies, and KMS keys. These components must be generated on a fresh installation and whenever a new organization is created.
  • Go to Accounts > Manage Organizations.
  • On the TetraScience (current) org, click the AWS button.
  7. Use S3 gateway VPC endpoint
    We highly recommend enabling an S3 gateway VPC endpoint, which reduces the data transfer cost between the VPC and S3. Instructions.
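For the S3 gateway endpoint recommendation above, a possible command-line equivalent is sketched here using `aws ec2 create-vpc-endpoint`. The function names and the example IDs are hypothetical, and the service name format assumes a standard commercial AWS Region. An endpoint normally has to be associated with every route table used by the platform subnets, so the call may need to be adapted for several route tables.

```shell
# Sketch: build the regional S3 service name used by VPC endpoints.
s3_service_name() {
  printf 'com.amazonaws.%s.s3\n' "$1"
}

# Create a gateway endpoint for S3 and associate it with one route table.
# Arguments: $1 region, $2 VPC ID, $3 route table ID.
create_s3_gateway_endpoint() {
  aws ec2 create-vpc-endpoint \
    --region "$1" \
    --vpc-id "$2" \
    --vpc-endpoint-type Gateway \
    --service-name "$(s3_service_name "$1")" \
    --route-table-ids "$3"
}

# Example call with placeholder IDs:
# create_s3_gateway_endpoint us-east-2 vpc-0123456789abcdef0 rtb-0123456789abcdef0
```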

Single Sign-on (Optional)

To enable SSO for a deployment, set up an AWS Cognito user pool and connect it with your identity provider.

Setting up Cognito

When setting up your Cognito user pool, ensure the following:

  • Go to General Settings > Attributes. Ensure email, given name and family name are checked. Add a custom attribute named groups, with max length set to 2048 and mutable checked.
  • Go to General Settings > App clients. Click Show details, then Set attribute read write permissions, and under readable attributes check email, family name, given name and custom:groups.
  • Go to App integration > App client settings. Set the callback and logout URLs to the deployment domain: {domain}/login/sso as the callback URL and {domain}/logout as the logout URL. Under Allowed OAuth Scopes check email, openid and profile.
  • Go to Federation > Attribute mapping. Map your identity provider's group membership attribute to your custom:groups attribute. Also map the given and family names.

Setting up the platform

Gather the following variables and set them in AWS Systems Manager Parameter Store.

| Attribute name | Where to find it | Param store location |
| --- | --- | --- |
| SSO_DOMAIN | Cognito > App Integration > Domain name | /tetrascience/{environment}/ECS/ts-service-user-org/SSO_DOMAIN and /tetrascience/{environment}/ECS/ts-service-web/SSO_DOMAIN |
| SSO_CLIENT_ID | Cognito > General Settings > App Clients > App client Id | /tetrascience/{environment}/ECS/ts-service-user-org/SSO_CLIENT_ID and /tetrascience/{environment}/ECS/ts-service-web/SSO_CLIENT_ID |
| SSO_REDIRECT_URI | Cognito > App Integration > App Client Settings > Callback URL | /tetrascience/{environment}/ECS/ts-service-user-org/SSO_REDIRECT_URI |
| SSO_CLIENT_SECRET | Cognito > General Settings > App Clients > App client Secret | /tetrascience/{environment}/ECS/ts-service-user-org/SSO_CLIENT_SECRET (set as SecureString) |
| SSO_GROUPS_ATTRIBUTE | Cognito > General Settings > Attributes > Custom Attributes | /tetrascience/{environment}/ECS/ts-service-user-org/SSO_GROUPS_ATTRIBUTE |

After setting the variables into Parameter Store, restart ts-service-user-org and ts-service-web.

SSO login will become available at {domain}/login/sso.
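The restart of ts-service-user-org and ts-service-web mentioned above could be scripted roughly as follows. This is only a sketch: the helper names and the ECS cluster name are hypothetical, and it assumes that forcing a new ECS deployment is an acceptable way to restart the services so that they pick up the new parameters.

```shell
# Sketch: build the Parameter Store path for an SSO setting.
# Arguments: $1 environment, $2 service name, $3 attribute name.
sso_param_path() {
  printf '/tetrascience/%s/ECS/%s/%s\n' "$1" "$2" "$3"
}

# Ask ECS to replace the running tasks of one service with fresh ones.
# Arguments: $1 ECS cluster name, $2 ECS service name.
restart_service() {
  aws ecs update-service --cluster "$1" --service "$2" --force-new-deployment
}

# Example calls with a placeholder cluster name:
# sso_param_path production ts-service-web SSO_DOMAIN
# restart_service my-ecs-cluster ts-service-user-org
# restart_service my-ecs-cluster ts-service-web
```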

Setting up your Organization

Log in as an Organization or System Admin account. Navigate to Account > Organization.

Click the Single sign-on button. In the modal, fill in the identity provider group for each org role (admin, member, readonly). For example, if all users who belong to a group named "admin group" should be in the org admin role, enter the value "admin group" in the input box under "admin". Save.

Repeat for each role you wish to map to an SSO identity provider group.

