Logs, Metrics, and Alerts
TetraScience uses Amazon CloudWatch to log data from all Tetra Data Platform (TDP) components, including the following:
- Amazon Elastic Container Service (Amazon ECS) containers
- AWS Lambda functions
- Tetra Hub events
- Audit Trail logs
Infrastructure Metrics and Alarms
The TDP's AWS CloudFormation stacks include custom CloudWatch alarms, some of which trigger automatic actions. For example, if an Amazon ECS service sustains high CPU usage for more than five minutes, an alarm automatically creates another container instance to take on some of the load. Other alarms, such as one for a service restarting unusually fast, send alert emails to the email address that you configure for notifications during deployment. Copies of these emails are also sent to the TetraScience support team.
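To make the scale-out example concrete, here is a minimal sketch of how a "high CPU for five minutes" alarm could be expressed. It is not the TDP's actual CloudFormation definition. The alarm name, the 75 percent threshold, the cluster and service names, and the scaling policy ARN are all hypothetical. The function only builds the keyword arguments accepted by the CloudWatch `PutMetricAlarm` API (exposed in boto3 as `put_metric_alarm`), so it runs without AWS access.

```python
def build_cpu_alarm_params(cluster, service, scale_out_policy_arn,
                           threshold_percent=75.0):
    """Build keyword arguments for CloudWatch's PutMetricAlarm call.

    Every value here is an illustrative assumption, not a real TDP setting.
    """
    return {
        "AlarmName": f"{service}-high-cpu",  # hypothetical naming scheme
        "Namespace": "AWS/ECS",
        "MetricName": "CPUUtilization",
        "Dimensions": [
            {"Name": "ClusterName", "Value": cluster},
            {"Name": "ServiceName", "Value": service},
        ],
        "Statistic": "Average",
        "Period": 60,            # evaluate one-minute datapoints
        "EvaluationPeriods": 5,  # five breaching minutes in a row
        "Threshold": threshold_percent,  # assumed value, not from the TDP
        "ComparisonOperator": "GreaterThanThreshold",
        # The action that runs when the alarm fires, here a scale-out policy.
        "AlarmActions": [scale_out_policy_arn],
    }


params = build_cpu_alarm_params(
    cluster="example-cluster",
    service="example-service",
    scale_out_policy_arn=(
        "arn:aws:autoscaling:us-east-1:111122223333:scalingPolicy:example"
    ),
)
# With boto3 installed and AWS credentials configured, the dict could be
# passed along as: boto3.client("cloudwatch").put_metric_alarm(**params)
```

Multiplying `Period` by `EvaluationPeriods` gives the five-minute window. An email alarm would look the same except that `AlarmActions` would point to a notification target instead of a scaling policy.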
NOTE
When you view CloudWatch alarms in the AWS Management Console, make sure that the Hide Auto Scaling alarms option is selected. For instructions, see Hiding Auto Scaling alarms in the AWS documentation. Some alarms appear in an INSUFFICIENT_DATA state. This is normal, and no action is needed.
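If you review alarms programmatically rather than in the console, a similar filter can be applied to the alarm list. The sketch below works on data shaped like the `MetricAlarms` list returned by the CloudWatch `DescribeAlarms` API. The sample alarms are hypothetical, and identifying Auto Scaling alarms by the ARN prefix of their actions is an assumed heuristic, not necessarily the rule that the console applies.

```python
def alarms_needing_attention(metric_alarms):
    """Return names of alarms in the ALARM state, skipping Auto Scaling alarms.

    Alarms in the OK or INSUFFICIENT_DATA state are ignored.
    """
    names = []
    for alarm in metric_alarms:
        actions = alarm.get("AlarmActions", [])
        # Assumed heuristic: an alarm whose action is a scaling policy
        # is treated as an Auto Scaling alarm and hidden.
        if any(a.startswith("arn:aws:autoscaling:") for a in actions):
            continue
        if alarm.get("StateValue") == "ALARM":
            names.append(alarm["AlarmName"])
    return names


# Hypothetical sample data in the DescribeAlarms response shape.
sample_alarms = [
    {
        "AlarmName": "example-service-high-cpu",
        "StateValue": "ALARM",
        "AlarmActions": [
            "arn:aws:autoscaling:us-east-1:111122223333:scalingPolicy:example"
        ],
    },
    {
        "AlarmName": "example-service-restarts",
        "StateValue": "ALARM",
        "AlarmActions": ["arn:aws:sns:us-east-1:111122223333:example-topic"],
    },
    {
        "AlarmName": "example-queue-depth",
        "StateValue": "INSUFFICIENT_DATA",
        "AlarmActions": [],
    },
]

flagged = alarms_needing_attention(sample_alarms)
```

Here only `example-service-restarts` is flagged. The scaling alarm is hidden, and the `INSUFFICIENT_DATA` alarm is skipped because that state needs no action.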
Diagnostic Pipelines
TetraScience uses internal diagnostic pipelines to continuously test all of the infrastructure that Tetra Data Pipelines require and to publish failure notifications. Diagnostic pipelines process synthetic data in a dedicated, internal-only TDP organization to periodically verify that platform components run as expected and that the infrastructure is available. No customer data is uploaded to or stored in this organization.
NOTE
Internal diagnostic pipeline data is used only when diagnostic pipelines run, and it is automatically deleted after 30 days. Cleaning up this data on a regular cadence reduces cloud storage requirements and cost.
Alerts Sent to TetraScience
The platform definition in CloudFormation includes an Amazon Simple Notification Service (Amazon SNS) topic that sends high-priority, infrastructure-level alerts to the TetraScience team. These alerts contain no sensitive information, just an indication of which component failed and an error code. This information helps the TetraScience team provide timely and effective support.
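As an illustration of how little such an alert needs to carry, the sketch below builds the arguments for the Amazon SNS `Publish` API (`publish` in boto3) with a message that holds only a component name and an error code. The JSON field names, subject line, topic ARN, and sample values are hypothetical. The actual message format that the TDP uses is not documented here.

```python
import json


def build_alert_publish_params(topic_arn, component, error_code):
    """Build keyword arguments for SNS Publish with a minimal alert body.

    The message carries only the failed component and an error code,
    so no customer or sensitive data is included. Field names are assumed.
    """
    message = {"component": component, "error_code": error_code}
    return {
        "TopicArn": topic_arn,
        "Subject": f"Infrastructure alert: {component}",
        "Message": json.dumps(message),
    }


alert = build_alert_publish_params(
    topic_arn="arn:aws:sns:us-east-1:111122223333:example-alerts",
    component="example-service",
    error_code="E1001",
)
# With boto3 installed and AWS credentials configured, the dict could be
# passed along as: boto3.client("sns").publish(**alert)
```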
NOTE
To submit a support request, see Submit a Support Ticket.
Updated 10 months ago