data consolidation: logical replication design considerations

From: Rory Campbell-Lange <rory(at)campbell-lange(dot)net>
To: pgsql-performance(at)lists(dot)postgresql(dot)org
Subject: data consolidation: logical replication design considerations
Date: 2022-07-16 16:07:09
Message-ID: YtLiLaDdvhQk9ZzB@campbell-lange.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

I'd be grateful for some comments on the advisability of using a large
number of concurrent logical replication publications/subscriptions.
Below I've set out the current environment and a suggested design.
Apologies for the length of this email.

We presently have many hundreds of small databases in a cluster, in
separate databases due to historical issues, security considerations and
the ability to scale horizontally in future by using more than one
cluster. The cluster is presently less than half a terabyte in size and
runs very comfortably on a 96GB RAM/32 core Intel E5-2620 server on NVMe
disks in RAID10 configuration on Linux.

The individual databases cover a handful of discrete services (although
they have some common data structures) and are of different sizes
depending on client needs. The largest client database is currently
about 7.5GB in size.

We presently use streaming replication locally and remotely to replicate
the cluster and it has pg_bouncer in front of it. Some settings:

max_connections: 500
shared_buffers: 20GB
work_mem: 15MB

Due to changing client and operational requirements we need to aggregate
some common data between client databases in the same organisation into
single read-only databases for reporting purposes. This new requirement
is in addition to keeping the client databases in operation as they are
now. The potential for using logical replication comes to mind,
specifically the use case of "Consolidating multiple databases into a
single one" mentioned at
https://www.postgresql.org/docs/current/logical-replication.html

Some tests suggest that we can meet the requirements for publication
table replica identity and safe aggregation of data.

At an overview level this consolidation might require the setup of
logical replication publications from say 500 client databases
aggregating in close to real time to 50 target or aggregation
subscribing databases, averaging roughly 10 client database per
aggregation database, but with some aggregation databases having roughly
50 clients.

I would be grateful for comments on the following design proposals:

* to avoid overloading the existing primary host with many replication
slots, it would be wise to implement aggregation on another host

* the new aggregation host should receive streaming replication data
from the primary on a first postgresql instance which will also have
logical replication publishers on each relevant client database. As
noted above, there may be ~500 publications

* the new aggregation host would have a second postgresql instance
serving the aggregation databases each acting as logical
replication subscribers. As noted above, there would be ~50 target
databases each with an average of ~10 subscriptions to the first
postgresql instance.

* that only one subscription per client database is needed (after
initial synchronisation) to synchronise all tables in a particular
client database schema

* that publications and subscriptions are brought online on a per-client
database basis, to reduce the number of replication slots required due
to initial synchronisation (the docs aren't clear about how many
temporary replication slots may be needed "for the initial data
synchronisation of pre-existing table data"; see
https://www.postgresql.org/docs/current/logical-replication-subscription.html)

* that a similar server to the one noted above can handle two postgresql
instances as described together with ~250 concurrent client
connections to the second instance to serve client reporting needs.

Thoughts gratefully received,
Rory

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Rick Otten 2022-07-17 20:39:15 Re: data consolidation: logical replication design considerations
Previous Message Justin Pryzby 2022-07-12 15:43:10 Re: Occasional performance issue after changing table partitions