Question on overall design

From: veem v <veema0000(at)gmail(dot)com>
To: pgsql-general <pgsql-general(at)lists(dot)postgresql(dot)org>
Subject: Question on overall design
Date: 2023-12-09 11:13:33
Message-ID: CAB+=1TVomkmnGKB-7=5RW5P3EL9CM8JFFpGJJpodB7v+_ZPWjw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Hello All,
Although it's not exactly related to opensource postgre but want to ask
this question here to understand colleagues' view, considering having
decades of experience in the database world, We want some guidance, if the
below design looks okay for our customer use case.

We currently have financial systems transaction data streams to Oracle
exadata(X9) on-premise. This database supports processing of 400million
transactions per day. A single transaction for us is a combination of 7-8
inserts into different transaction tables with Indexes , unique constraints
etc defined on those. The transactions processed/committed in batches(~1000
batch size) in the database. And this system persists data for ~6 months.
We do have all sorts of OLAP(daily/monthly batch reports running)
applications run on the same database along with some user facing UI
applications showing customer transactions. So it's basically currently
serving a hybrid workload and is one stop solution for all use cases.

Many of the applications are moving from on premise to AWS cloud as part of
modernization journey and AWS being chosen cloud partner also the product
is expected to expand across more regions and this system is expected to
serve increase in the transaction volume. And also we have a requirement to
persist transaction data for ~10years to have those available for
analytics/data science use cases.

So the team is thinking of splitting it into two parts
1)OLTP type use case in which we will persist/write the transaction data
faster and show it to the UI related apps , in near real time/quickest
possible time. and this database will store Max 60-90 days of transaction
data. Not sure if we have an option of Oracle exadata equivalent on AWS, so
team planning of using/experimenting with Aurora postgres. Please correct
me, if there are any other options we should use otherwise?

2)Then move the data beyond ~90 days into another database or object
storage S3 which will keep it there for ~10 years and will be queryable
using the necessary API's. That is supposed to cater to Olap/analytics/data
science use cases etc.

Is the above design is okay? and also in regards to the second point above
i.e. persisting the historical data (that to be in queryable state), should
we go for some database like snowflake or should just keep it on S3 as is
and make those queryable through APIs. Please advice?

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Hannu Krosing 2023-12-09 11:56:23 Re: Emitting JSON to file using COPY TO
Previous Message Laurenz Albe 2023-12-09 06:01:38 Re: how can I fix my accent issues?