TDP takes full advantage of the fault tolerance built into AWS services. The TDP design closely follows AWS best practices for each platform component:
- Datalake files: stored in Amazon S3, which is designed for 99.999999999% durability and 99.99% availability over a given year, and to sustain the concurrent loss of data in two facilities.
- RDS database: Multi-AZ deployment. The data is continuously and synchronously replicated to a standby instance in another availability zone. The database automatically fails over to the standby in case of an infrastructure problem.
- Elasticsearch: configured by default with 3 master nodes and 2 data nodes across 2 availability zones. An infrastructure failure in one AZ will not impact the cluster.
- ECS services: all important platform services run at least 2 instances, each in its own availability zone, so the failure of a single instance will not impact the overall platform.
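
To put the S3 percentages above in concrete terms, the following is a minimal sketch of the arithmetic, not part of TDP. It assumes the figures can be read as simple annual averages and that a year has 365 days; the function names and the 10,000,000-object count are hypothetical and chosen only for illustration.

```python
from decimal import Decimal

# Assumption: a non-leap year of 365 days.
MINUTES_PER_YEAR = Decimal(365 * 24 * 60)


def downtime_minutes_per_year(availability_percent: str) -> Decimal:
    """Convert an availability percentage into average minutes of downtime per year."""
    unavailable_fraction = (Decimal(100) - Decimal(availability_percent)) / Decimal(100)
    return unavailable_fraction * MINUTES_PER_YEAR


def expected_objects_lost_per_year(durability_percent: str, object_count: int) -> Decimal:
    """Convert a durability percentage into the average number of objects lost per year."""
    loss_fraction = (Decimal(100) - Decimal(durability_percent)) / Decimal(100)
    return loss_fraction * Decimal(object_count)


if __name__ == "__main__":
    # Percentages are passed as strings so Decimal keeps them exact.
    print(downtime_minutes_per_year("99.99"))
    # Hypothetical object count, used only to illustrate the scale.
    print(expected_objects_lost_per_year("99.999999999", 10_000_000))
```

Under these assumptions, 99.99% availability corresponds to roughly 52.6 minutes of unavailability per year, and 99.999999999% durability corresponds to an average loss of 0.0001 objects per year across 10,000,000 stored objects, or about one object every 10,000 years. `Decimal` is used instead of floating point because a value such as `1 - 0.99999999999` is not represented exactly as a float.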