TetraScience also provides features and services to pipeline your data into Data Warehouse and SQL Data Base solutions. Some of the most common ones include AWS Redshift, AWS RDS and AWS Aurora. This is optional and not provided out of the box. If you need such capability, please let your Account Manager and Solution Architect know!
AWS Aurora RDS
AWS MySQL/PostgreSQL RDS
In contrast to an OLTP database, an OLAP database is designed to process large datasets quickly to answer questions about data. The name reflects this purpose: Online Analytic Processing. Common use cases for an OLAP database are:
An OLAP database is optimized for scanning and performing computations across many rows of data for one or multiple columns. To improve performance, OLAP databases are designed to to be columnar. Instead of organizing data as rows, the underlying data in an OLAP database is organized column by column. Columnar based storage allows for better compression and easier sequential reads, features necessary for scanning large amounts of data quickly.
Other popular OLAP databases include
OLTP is what most people thinks of databases. It stands for Online Transactional Processing and is designed to serve as a persistent state store for front-end applications. They excel at quickly looking up specific information as well as transactional procedures like INSERT, UPDATE, or DELETE. Some common tasks asked of OLTP systems include situations like:
These type of problems require a system that can look up and update one or more columns within one or many rows. The strength of OLTPs is that they support fast writes. A typical workload for OLTP is both frequent reads and writes, but the reads tend to be more of looking up a specific value rather than scanning all values to compute an aggregate. Common OLTP systems are:
Similar to AWS Aurora
Our recommendation is that you start with the less opinionated and cost-effective solution such as AWS RDS MySQL and RDS PostgreSQL. This allows you to use pretty much most of the common tools and frameworks available and just hit the ground running.
After you become more familiar and confident of your business need and the types of operations you would like to perform, you can then transition to AWS RedShift or AWS Aurora. At that point, you will be able to better understand the type of queries and aggregations you would like to perform and can optimize your use cases by designing the right table in AWS Redshift.
TetraScience delivery team will provide AWS Glue ETL pipelines to move the latest data from S3 to your data warehouse. We will handle edge cases such as when the data is re-processed (scientists may recalculate the peaks on the chromatogram) and your data warehouse needs to contain the latest changes. Note that this will incur an extra cost based on workload and traffic.
For those of you who would like to use Oracle databases, we recommend you consider AWS Oracle RDS. Here are some answers to your common questions.
Q: Does Oracle RDS provides license?
A: You will need to bring your own Oracle license in order to use the Oracle Enterprise Edition. This is the Bring Your Own License (BYOL) model. However, there is also a License Included model. For more information, please refer to this document.
Q: Can the data be encrypted using Oracle's Transparent Data Encryption?
A: Yes. This is compatible with AWS RDS Oracle.
Q: Does RDS oracle support Spring Data JPA.
A: Yes. This is compatible with AWS RDS Oracle. Reference:
Using Amazon Web Services Relational Database Service for your Spring Boot App
Updated over 2 years ago