Deployment (Archived)

❗️

DEPRECATED PAGE

This page is deprecated. The new deployment pages are here:

Deployment

Deployment and related activities should be performed by an engineer with good AWS knowledge and full administrator access to the destination AWS account and, if applicable, the Disaster Recovery AWS account. Permissions should include creating and deleting IAM roles and policies.

Planning for Deployment

Before performing the actual installation, single-tenant customers should consult with TetraScience and decide which application features and components will be in scope:

  • DisasterRecovery (Y/N) - If enabled, data and backups are automatically replicated to a different AWS account and region.
  • Existing VPC / Create VPC - Our stack can automatically create a new VPC and related networking items. If deploying into an existing VPC is desired, networking tasks like creating subnets and routing will be the customer's responsibility.
  • Public / Private Endpoint - Will the application be exposed to the Internet or not?
  • DNS entries - Should DNS entries be automatically created in AWS Route53 during deployment, or will the customer manage DNS separately?
  • Webserver Certificate - Should an HTTPS certificate be created automatically during deployment, or will the customer supply one? Only 1024-bit and 2048-bit RSA certificates are supported.
  • Enable Anylink service (Y/N)
  • Enable Egnyte Integration (Y/N)
  • Enable Box Integration (Y/N)
  • Sizing - Based on customer's usage estimations, TetraScience will advise on the value of sizing parameters used at deployment.
  • EKS worker nodes AMI - The default option, which we strongly recommend, is to use the AWS-provided EKS-optimized images. It is also possible to use an AMI provided by the client, which should be 100% compatible with the AWS-provided one. The client assumes all risks resulting from running a custom image, which can cause instability in operation as well as unusual errors and delays in deployment.

CloudFormation Parameters

Below is the full list of parameters that have to be entered at deployment time:

  • Data Layer:
Parameter | Default Value | Details
CFTemplateBucket | ts-platform-artifacts | Prefix of the S3 bucket where artifacts are stored. Do not change default.
CFTemplateVersion | (none) | Must match the version of the ServiceCatalog product being installed.
InfrastructureName | (none) | Customer specific. All-encompassing name for the created infrastructure. Used as a root for naming. Validate with TetraScience.
Environment | production | Used internally by TetraScience. Do not change default.
IAMRolePrefix | (none) | Optional string for prefixing all created IAM roles. Leave empty if not used.
IAMBoundaryPolicy | (none) | ARN for a boundary policy that will be attached to all created roles. Leave empty if not used.
EnableDR | false | Set to true if Disaster Recovery should be implemented.
DRAWSAccountId | (none) | ID of the AWS account used for Disaster Recovery. Leave empty if EnableDR is false.
DRDatalakeKMSKey | (none) | ARN of the KMS key used to encrypt data in DR. Leave empty if EnableDR is false. See the Disaster Recovery section below if EnableDR is true.
DRDatalakeBucket | (none) | Name of the Datalake bucket for Disaster Recovery. Leave empty if EnableDR is false. See the Disaster Recovery section below if EnableDR is true.
DRStreamBucket | (none) | Name of the Stream bucket for Disaster Recovery. Leave empty if EnableDR is false. See the Disaster Recovery section below if EnableDR is true.
DRBackupBucket | (none) | Name of the Backup bucket for Disaster Recovery. Leave empty if EnableDR is false. See the Disaster Recovery section below if EnableDR is true.
DRLocalArtifactsBucket | (none) | Name of the artifacts bucket used for Disaster Recovery. Leave empty if EnableDR is false. See the Disaster Recovery section below if EnableDR is true.
EnableElasticsearch | true | Do not change default.
EnableLogging | false | Set to false. The parameter is deprecated and will be removed in the next release.
EsMasterInstanceType | t3.medium.elasticsearch | EC2 instance type for Master ElasticSearch. Validate value with TetraScience.
EsDatanodeInstanceType | m4.large.elasticsearch | EC2 instance type for DataStore ElasticSearch. Validate value with TetraScience.
EsDatanodeInstanceCount | 2 | Number of EC2 instances in the cluster. Validate value with TetraScience.
EsDatanodeVolumeSize | 100 | EBS volume size in GB for Elasticsearch. Validate value with TetraScience.
EsBackupInterval | 6 | How frequently (in hours) to back up ElasticSearch to S3.
InstanceTypeRDS | db.t2.medium | EC2 instance type for the Postgres database. The default value should be enough in most cases.
RDSBackupInterval | 24 | How often to back up the database (in hours).
RDSBackupSchedule | 0 1 * * ? * | Backup schedule in CloudWatch Events cron format. The default runs at 1 AM UTC every day.
RDSBackupRetentionDays | 30 | Number of days to keep DB snapshots before deleting them. There is a limit of 100 snapshots per database.
RDSSnapShot | (none) | Leave empty for a standard install. To be used only when recovering from an actual disaster.
CreateVPC | true | If true, a new VPC is created for the application, together with subnets, security groups, and NAT gateways.
VpcCIDR | (none) | Network block to use for the VPC. If CreateVPC is false, it should match the existing VPC to be used. For example 10.200.0.0/16.
VPCID | (none) | ID of the existing VPC. Leave empty if CreateVPC is true.
PublicSubnetIds | (none) | Comma-delimited list of subnet IDs. Leave empty if CreateVPC is true.
PrivateSubnetIds | (none) | Comma-delimited list of subnet IDs. Leave empty if CreateVPC is true.
IsolatedSubnetIds | (none) | Comma-delimited list of subnet IDs that will be used for Windows workers. Leave unchanged if CreateVPC is true.
LogsEndpoint | (none) | FQDN of the endpoint used for Windows workers. Use a VPC endpoint if using isolated subnets for them.
MonitoringEndpoint | (none) | FQDN of the endpoint used for Windows workers. Use a VPC endpoint if using isolated subnets for them.
SqsEndpoint | (none) | FQDN of the endpoint used for Windows workers. Use a VPC endpoint if using isolated subnets for them.
CloudformationEndpoint | (none) | FQDN of the endpoint used for Windows workers. Use a VPC endpoint if using isolated subnets for them.
NotificationEmail | (none) | Email address that will be subscribed to alerts via SNS. Should be a group email, so that participants can be added and removed easily.
SourceNotificationEmail | (none) | Used in the "From" field of pipeline notification emails. Needs to be verified with SES.
LogRetentionDays | 90 | Days for log retention in CloudWatch.
LambdaPrefix | (none) | Leave empty. Used internally by TetraScience.
STBucket | (none) | Leave empty in a normal installation. Used only for DR recovery.
DLBucket | (none) | Leave empty in a normal installation. Used only for DR recovery.
  • Service Layer:
Parameter | Default Value | Details
CFTemplateVersion | v1.0.0 | Must match the version of the ServiceCatalog product being installed.
Branch | master | ECR repo suffix. Do not change default.
DataStack | (none) | Name of the Data Layer main stack. Can be obtained from the CloudFormation interface.
EnableLogging | false | Set to true if the ES Logging cluster in DataLayer was created.
ClusterType | Fargate | Do not change default.
InstanceTypeECS | t2.large | Legacy. No longer used.
MinCapacity | (none) | Minimum number of ECS containers for . Set to 0 if is not used.
MaxCapacity | (none) | Maximum number of ECS containers that can scale to, in case of load. Set to 0 if is not used.
ConnectorMaxMemory | 2048 | Memory limit for docker containers running on the datahub machines.
TaskThroughput | 20 | Number of files that can be processed in parallel.
EnableWinTaskScriptService | true | Enable Windows EC2 based workers.
WindowsInstanceType | t3.medium | Instance type for Windows workers.
PublicDomain | (none) | Domain name used by the web UI. It does not have to be exposed on the Internet and can be company internal.
ExposedOnInternet | false | Set to true if the application should be accessible from the Internet.
NoDNSWeb | false | Set to true if public DNS records are NOT to be created.
PublicDomainZoneId | (none) | Public Domain Route53 Zone Id. If left empty, a public DNS hosted zone will be created, unless NoDNSWeb is set to true.
Certificate | (none) | ARN of the TLS/SSL certificate registered with ACM. See details in the Pre Deployment Tasks section. If empty, the deployment tries to create a certificate automatically via ACM and waits for DNS certificate validation, unless NoDNSWeb is set to true, in which case HTTPS is disabled and the application is deployed using unencrypted HTTP. Certificate validation requires a value for PublicDomainZoneId, with the zone containing NS entries for the domain.
PrivateDomain | ts-dip.internal | Used for ECS inter-service communication. It can be changed to any name, but the default should work just fine.
MinCapacity | 2 | Minimum number of ECS containers for . Set to 0 if is not used.
MaxCapacity | 4 | Maximum number of ECS containers to scale out to, in case of heavy load.
LambdaPrefix | (none) | Leave empty. Used internally by TetraScience.
AthenaCreateIamUser | false | Enables IAM user creation for Athena access at org creation. Leaving it false restricts service permissions so that IAM users cannot be created from the platform at runtime.
UserAuditLogGroupSuffix | user-action-audit-log | Legacy. Do not change the default value.

Service Parameters and Secrets in SSM

Containers running in ECS need runtime parameters. These parameters may contain sensitive data, such as OAuth tokens, so they are stored encrypted, using a specialized AWS service for secrets management, SSM Parameter Store. The parameters are not shared with TetraScience, so single-tenant customers will have to create them following this procedure.

Parameter | Details | Needed only if
/tetrascience/production/ECS/ts-service-link-file/BOX_CLIENT_ID | Box OAuth 2.0 custom app Client ID. See below for details. | Box Integration is enabled
/tetrascience/production/ECS/ts-service-web/INT_BOX_CLIENT_ID | Same value as above. | Box Integration is enabled
/tetrascience/uat/ECS/ts-service-link-file/BOX_CLIENT_SECRET | Box OAuth 2.0 custom app secret. | Box Integration is enabled
/tetrascience/uat/ECS/ts-service-web/INT_EGNYTE_CLIENT_ID | Egnyte Client ID. | Egnyte Integration is enabled
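
As a minimal sketch of creating one of these parameters with the AWS CLI, each entry can be stored as an encrypted SecureString. The helper name put_ts_secret is hypothetical, the value in the example call is a placeholder, and the sketch assumes the default KMS key for SSM is acceptable; the linked procedure remains the reference.

```shell
# Hypothetical helper: store one runtime parameter as an encrypted SecureString.
# Assumes AWS CLI credentials for the deployment account and the default SSM KMS key.
put_ts_secret() {
  name="$1"    # full parameter path, taken from the table above
  value="$2"   # secret value, for example the Box client ID
  aws ssm put-parameter --name "$name" --value "$value" \
    --type SecureString --overwrite
}

# Example call (placeholder value):
# put_ts_secret "/tetrascience/production/ECS/ts-service-link-file/BOX_CLIENT_ID" "<client-id>"
```

The --overwrite flag lets the same call be reused when a secret is rotated.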

Pre Deployment Tasks

  1. Deployer Privileges
    Confirm the user performing the deployment has Full Administrator Rights, via the AWS managed IAM policy AdministratorAccess or equivalent. Anything less will likely cause the deployment to fail, requiring manual cleanup and causing lengthy delays. The application components will run with minimal privileges and administrator access is required only for deployment and upgrade sessions.
  2. AWS CloudTrail
    Confirm AWS CloudTrail is configured to save events in an S3 bucket.
  3. DHCP Options
    Make sure the VPC's DHCP options set contains the entry domain-name-servers = AmazonProvidedDNS (only if the CreateVPC parameter in the Data Layer is set to false).
    If site policies dictate that the client's internal, non-AWS DNS servers must be used, a manual workaround can be applied:
    a) Get the RDS endpoint from the Data Layer outputs and inject it into all ECS containers via SSM Parameter Store and ultimately the POSTGRES_HOST environment variable.
    b) Deploy the Service Layer following the normal procedure.
    c) Create a zone in the client's DNS for ts-dip.internal and delegate authority for the zone to the AWS DNS servers; check Route 53 to get the server names.
  4. VPC and networking infrastructure (only if CreateVPC parameter in DataLayer is set to false):
    The deployment VPC needs to provide for the platform's exclusive use:
    at least 2 (preferably 3) private /24 or larger subnets in different AZs
    at least 2 (preferably 3) public /28 or larger subnets in different AZs (used only for NAT Gateways and not required if Internet traffic will flow via the corporate network)
    All AWS Service Endpoints must be reachable from all the VPC subnets; VPC Endpoints may be required. The AWS account should have at least 3 available Elastic IP addresses if the platform will be accessed from the Internet.
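
    One way to review an existing VPC against these requirements is to list its subnets with their Availability Zone and CIDR block. This is only a sketch: the helper name list_vpc_subnets and the VPC ID in the example call are hypothetical, and the size check itself is left to the reader.

```shell
# Hypothetical helper: list the subnets of an existing VPC so that the
# /24 (private) and /28 (public) minimums and the AZ spread can be reviewed.
list_vpc_subnets() {
  vpc_id="$1"
  aws ec2 describe-subnets \
    --filters "Name=vpc-id,Values=$vpc_id" \
    --query 'Subnets[].[SubnetId,AvailabilityZone,CidrBlock,AvailableIpAddressCount]' \
    --output table
}

# Example call (placeholder VPC ID):
# list_vpc_subnets "vpc-0123456789abcdef0"
```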
  5. Log Policy
    The Elasticsearch application logs need to be sent to CloudWatch. To allow this, the following AWS CLI command has to be run as an administrator, against the deployment AWS account and Region:
aws logs put-resource-policy --policy-name es2cloudwatch --policy-document '{ "Version": "2012-10-17", "Statement": [{ "Sid": "eslogs", "Effect": "Allow", "Principal": { "Service": "es.amazonaws.com"}, "Action":[ "logs:PutLogEvents","logs:PutLogEventsBatch","logs:CreateLogStream"],"Resource": "arn:aws:logs:*:*:*:*"}]}'  --region <Region>
  6. AWS Service-Linked Roles
    Service-linked IAM roles must be created, if not already present in the destination AWS account.
    Below are the CLI commands required:
  • ECS Service:
aws iam create-service-linked-role --aws-service-name ecs.amazonaws.com
  • ElasticSearch Service:
aws iam create-service-linked-role --aws-service-name es.amazonaws.com
  7. EC2 KeyPair
    The keypair will be used to allow admin access to EKS worker nodes. Follow the AWS Documentation.
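
    For illustration, a key pair can also be created from the AWS CLI. The helper name create_worker_keypair, the key name, and the output path below are hypothetical; the AWS documentation remains the reference procedure.

```shell
# Hypothetical helper: create an EC2 key pair and save the private key locally.
create_worker_keypair() {
  key_name="$1"   # assumption: any name that is unique in the deployment region
  out_file="$2"   # local path for the private key (PEM)
  aws ec2 create-key-pair --key-name "$key_name" \
    --query 'KeyMaterial' --output text > "$out_file" && chmod 400 "$out_file"
}

# Example call (placeholder names):
# create_worker_keypair "ts-eks-workers" "./ts-eks-workers.pem"
```

    The private key is returned only once, at creation time, so the output file should be stored securely.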
  8. TLS Web Certificate
    The application can generate its own certificate using AWS ACM, but that requires the DNS domain it will use to be hosted in Route53 in the destination AWS account. If that is not the case, or if automatic generation is not wanted, the customer must obtain or self-generate a TLS RSA certificate with a 1024-bit or 2048-bit key and import it into AWS ACM using this procedure. The certificate should cover both the future domain and its api. subdomain. The certificate ARN will be used as input for the Service Layer.
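
    A sketch of the import step with the AWS CLI follows. The helper name import_web_certificate and the file names are hypothetical, and the files are assumed to be PEM encoded; the command prints the ARN to use for the Certificate parameter.

```shell
# Hypothetical helper: import a customer-supplied certificate into ACM
# and print its ARN. All three files are assumed to be PEM encoded.
import_web_certificate() {
  cert="$1"    # server certificate
  key="$2"     # matching private key
  chain="$3"   # intermediate certificate chain
  aws acm import-certificate \
    --certificate "fileb://$cert" \
    --private-key "fileb://$key" \
    --certificate-chain "fileb://$chain" \
    --query 'CertificateArn' --output text
}

# Example call (placeholder file names):
# import_web_certificate cert.pem privkey.pem chain.pem
```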
  9. Configure AWS SES (Simple Email Service)
    The platform uses AWS SES to send out notification emails like pipeline result status. The sender email address needs to be a valid email address that is validated with SES using this procedure. Also, a support ticket needs to be raised with AWS to take SES out of Sandbox mode, as documented here.
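
    As a rough sketch, the sender address can be submitted for verification and its status checked from the AWS CLI. The helper names and the example address are hypothetical, and the commands are assumed to run against the deployment account and region.

```shell
# Hypothetical helpers for the SES sender address.
# verify_sender_email asks SES to send the verification email.
verify_sender_email() {
  aws ses verify-email-identity --email-address "$1"
}

# sender_verification_status prints the verification status of the address.
sender_verification_status() {
  email="$1"
  aws ses get-identity-verification-attributes --identities "$email" \
    --query "VerificationAttributes.\"$email\".VerificationStatus" --output text
}

# Example calls (placeholder address):
# verify_sender_email "notifications@example.com"
# sender_verification_status "notifications@example.com"
```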
  10. DisasterRecovery - PreInstall Component (optional)
    The optional Disaster Recovery component requires another AWS account (DR account) besides the main account where the product will be installed. A small CloudFormation stack (dr.yml) provided by TetraScience has to be deployed under the DR account in the AWS DR Region, which should be different from the main region. The stack requires these parameters:
Parameter | Default Value | Details
InfrastructureName | (none) | Customer specific. All-encompassing name for the created infrastructure. Used as a root for naming. Validate with TetraScience. The same value has to be used in the main product.
Environment | dr | Do not change.
ProdAWSAccountId | (none) | AWS account number where the main product will be installed.

After deployment the stack will generate 3 output values which will be used as parameters for the Data Layer.
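
The output values can be read from the CloudFormation console or, as sketched here, with the AWS CLI. The helper name show_stack_outputs and the stack name and region in the example call are hypothetical; the command is assumed to run with credentials for the DR account.

```shell
# Hypothetical helper: print the outputs of a CloudFormation stack as a table.
show_stack_outputs() {
  stack_name="$1"   # name chosen when the DR stack was deployed
  region="$2"       # the DR region
  aws cloudformation describe-stacks --stack-name "$stack_name" --region "$region" \
    --query 'Stacks[0].Outputs[].[OutputKey,OutputValue]' --output table
}

# Example call (placeholder stack name and region):
# show_stack_outputs "ts-dr" "us-east-2"
```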
  11. Box.com integration (optional)
  • Log in to your Box account
  • From the left side menu choose "Dev Console"
  • Click "Create New App", choose "Custom App" and click "Next"
  • Select "Standard Oauth 2.0 (User Authentication)"
  • Choose an appropriate name for the new app and click "Create App"
  • Click on "View Your App"
  • From "OAuth 2.0 Credentials" copy "Client ID" and "Client Secret" values
  12. Send Details to TetraScience
    TetraScience needs to receive the following data before sharing the ServiceCatalog product:
  • AWS Account ID where the product will be installed
  • AWS Region where the product will be installed
  • IAM username or role of the administrator who will perform the installation
    The above should be sent for each environment, if the client requires multiple installations (test and prod, for instance). From a technical point of view, TetraScience will treat each of these installations as a separate production client.
  13. Import AWS Service Catalog Portfolio
    Log into the AWS account and region where the deployment will be performed, as the administrator who will perform the installation. Navigate to Administration, and under Portfolios select the Imported tab. From Actions select Import portfolio and enter the code received from TetraScience. From the portfolio list, select the recently imported portfolio and then the Users, Groups, and Roles tab. Add to the list the IAM account of the admin user previously shared with TetraScience.

Performing the Deployment

  1. Data Layer
    From the AWS Service Catalog web interface, select the data layer product from the Products list. Select Launch product and choose the latest version from the list of available ones. Select a suitable name and click Next. Fill in the parameters, consulting the table above. Keep clicking Next until you reach the Review stage. Double check the parameter values and, if satisfied, click Launch to start the deployment. It takes between two and three hours, depending on the parameters and AWS backend load.
  2. Service Layer
    Service Layer can be installed only after a successful Data Layer installation. The procedure is similar to the one for Data Layer.

Post Deployment Tasks

  1. Alert Email Subscription Confirmation
    Alert emails will be sent via AWS SNS to the address configured during the deployment of Data Layer. SNS requires the subscription to be confirmed, and sends an email with the subject "AWS Notification - Subscription Confirmation". The link in that email must be clicked in order for notifications to work.
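
    To check whether the confirmation went through, the subscriptions of the alert topic can be listed, as in this sketch. The helper name list_alert_subscriptions is hypothetical, and the topic ARN has to be looked up in the SNS console since its name is not given here. An unconfirmed subscription shows PendingConfirmation instead of a subscription ARN.

```shell
# Hypothetical helper: list the endpoints subscribed to an SNS topic
# together with their subscription ARN (or PendingConfirmation).
list_alert_subscriptions() {
  topic_arn="$1"
  aws sns list-subscriptions-by-topic --topic-arn "$topic_arn" \
    --query 'Subscriptions[].[Endpoint,SubscriptionArn]' --output table
}

# Example call (placeholder ARN):
# list_alert_subscriptions "arn:aws:sns:<region>:<account-id>:<topic-name>"
```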
  2. Disaster Recovery for Database
    If Disaster Recovery is in scope, another small CloudFormation stack named snapshots_tool_rds_dest.json must be installed in the DR AWS account, in the same AWS Region as the main deployment. The stack takes the following parameters:
Parameter | Default | Details
CodeBucket | DEFAULT_BUCKET | Do not change. Where to get Lambda code from.
CrossAccountCopy | TRUE | Do not change.
DeleteOldSnapshots | TRUE | No reason to keep snapshots in this region, since they are stored and managed in the DR Region.
DestinationRegion | (none) | Disaster Recovery AWS region. For instance us-east-2.
KmsKeyDestination | (none) | ARN of the KMS key in the destination DR region. Enter the value of the DRRDSKMSKey output of the DR stack installed during pre deployment.
KMSKeySource | (none) | ARN of the KMS key in the main AWS account and region used to encrypt RDS snapshots. Can be obtained from the AWS KMS interface; the key alias is ts-rds-production.
LambdaCWLogRetention | 7 | Number of days to retain Lambda function logs in CloudWatch.
LogLevel | INFO | Log verbosity for functions.
RetentionDays | 7 | How many days to keep a snapshot.
SnapshotPattern | ts-platform.* | What snapshots to include. Do not change.
SourceRegionOverride | NO | Do not change.
  3. EKS Endpoint Access Control
    The AWS EKS endpoint is by default exposed to the Internet, posing a security risk. To mitigate this, the EKS cluster endpoint can be configured to work in Private mode, using this procedure. It is currently not possible to make the endpoint private from within CloudFormation templates. Once the option is made available by AWS, TetraScience will include it in the product and this manual step will no longer be required.
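
    As a sketch of what the manual step amounts to, the endpoint access settings can be changed with the AWS CLI. The helper name make_eks_endpoint_private and the cluster name in the example call are hypothetical. Once public access is disabled, the Kubernetes API is reachable only from within the VPC or connected networks.

```shell
# Hypothetical helper: switch an EKS cluster endpoint to private-only access.
make_eks_endpoint_private() {
  cluster_name="$1"
  aws eks update-cluster-config --name "$cluster_name" \
    --resources-vpc-config endpointPublicAccess=false,endpointPrivateAccess=true
}

# Example call (placeholder cluster name):
# make_eks_endpoint_private "<eks-cluster-name>"
```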

  4. Elasticsearch HTTPS Enforcement
    Elasticsearch, by default, also allows plain HTTP connections. To allow only HTTPS, run the following command from a terminal:

aws es update-elasticsearch-domain-config --domain-name <domain_name> --domain-endpoint-options EnforceHTTPS=true
  5. Disabling [email protected] user
    The [email protected] user is created by default and has access to all organizations in the setup.
    You may want to disable this user due to security concerns. To do this:
  • Log in as [email protected]
  • Switch to the TetraScience org
  • Find [email protected] in the list of users
  • Disable the user
  • You will be logged out automatically and will not be able to log in as that user again

    The operation is irreversible from the portal. The only way to re-enable [email protected] is to update the user status directly in the database.

  6. Generate secure credentials for organization
    Each organization uses unique IAM roles, policies, and KMS keys. These components need to be generated on a fresh installation and whenever a new organization is created.
  • Go to Accounts > Manage Organizations.
  • On the TetraScience (current) org, click the AWS button.
  7. Use S3 gateway VPC endpoint
    We highly recommend enabling an S3 gateway VPC endpoint, which reduces the data transfer cost between the VPC and S3. Instructions.
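
    A minimal sketch of creating the endpoint with the AWS CLI is shown below. The helper name create_s3_gateway_endpoint and all IDs in the example call are hypothetical; the route tables passed in should be the ones used by the platform's private subnets.

```shell
# Hypothetical helper: create an S3 gateway endpoint in a VPC and attach it
# to the given route tables.
create_s3_gateway_endpoint() {
  vpc_id="$1"
  region="$2"
  route_tables="$3"   # space-separated route table IDs
  # $route_tables is intentionally unquoted so each ID becomes its own argument.
  aws ec2 create-vpc-endpoint --vpc-id "$vpc_id" \
    --vpc-endpoint-type Gateway \
    --service-name "com.amazonaws.$region.s3" \
    --route-table-ids $route_tables
}

# Example call (placeholder IDs):
# create_s3_gateway_endpoint "vpc-0123456789abcdef0" "us-east-1" "rtb-0aaa rtb-0bbb"
```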

Single Sign-on (Optional)

To enable SSO for a deployment, set up an AWS Cognito user pool and connect it with your identity provider.

Setting up Cognito

When setting up your Cognito user pool ensure the following:

  • Go to General Settings > Attributes. Ensure email, given name and family name are checked, add a custom attribute named groups with a max length set to 2048, mutable checked.
  • Go to General Settings > App clients. Click Show details then Set attribute read write permissions and under readable attributes check email, family name, given name and custom:groups.
  • Go to App integration > App client settings. Set the callback and logout URLs to the deployment domain: {domain}/login/sso as the callback URL and {domain}/logout as the logout URL. Under Allowed OAuth Scopes check email, openid and profile.
  • Go to Federation > Attribute mapping. Map your identity provider group membership attribute to your custom:groups attribute. Map given and family names also.

Setting up the platform

Gather the following variables and set them in AWS Systems Manager Parameter Store.

Attribute name | Where to find it | Param store location
SSO_DOMAIN | Cognito > App Integration > Domain name | /tetrascience/{environment}/ECS/ts-service-user-org/SSO_DOMAIN and /tetrascience/{environment}/ECS/ts-service-web/SSO_DOMAIN
SSO_CLIENT_ID | Cognito > General Settings > App Clients > App client Id | /tetrascience/{environment}/ECS/ts-service-user-org/SSO_CLIENT_ID and /tetrascience/{environment}/ECS/ts-service-web/SSO_CLIENT_ID
SSO_REDIRECT_URI | Cognito > App Integration > App Client Settings > Callback URL | /tetrascience/{environment}/ECS/ts-service-user-org/SSO_REDIRECT_URI
SSO_CLIENT_SECRET | Cognito > General Settings > App Clients > App client Secret | /tetrascience/{environment}/ECS/ts-service-user-org/SSO_CLIENT_SECRET (set as SecureString)
SSO_GROUPS_ATTRIBUTE | Cognito > General Settings > Attributes > Custom Attributes | /tetrascience/{environment}/ECS/ts-service-user-org/SSO_GROUPS_ATTRIBUTE

After setting the variables into Parameter Store, restart ts-service-user-org and ts-service-web.
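
One way to restart the two services, sketched here under assumptions, is to force a new ECS deployment so that fresh tasks pick up the new parameters. The helper name restart_sso_services is hypothetical, and the ECS cluster and service names as deployed are not given in this page, so they must be looked up in the ECS console.

```shell
# Hypothetical helper: force a new deployment of one or more ECS services
# in a cluster, so that new tasks start with the updated parameters.
restart_sso_services() {
  cluster="$1"
  shift
  for service in "$@"; do
    aws ecs update-service --cluster "$cluster" --service "$service" \
      --force-new-deployment --query 'service.serviceName' --output text
  done
}

# Example call (placeholder names):
# restart_sso_services "<ecs-cluster>" "<ts-service-user-org service>" "<ts-service-web service>"
```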

SSO login will become available at {domain}/login/sso.

Setting up your Organization

Log in with an Organization or System Admin account. Navigate to Account > Organization.

Click the Single sign-on button. In the modal, fill in the identity provider group for each org role (admin, member, readonly). For example, if all users who belong to a group named "admin group" should have the org admin role, enter the value "admin group" in the input box under "admin". Save.

Repeat for each role you wish to map to an SSO identity provider group.