Logical Replication Custom Column Expression

From: Stavros Koureas <koureasstavros(at)gmail(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Logical Replication Custom Column Expression
Date: 2022-11-18 13:26:25
Message-ID: CA+O1jk6bpKcgc9HcjJtEgS6Cq=KfZobGF42GPeQ-ZCNP4uXRsQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi all,

Working with PostgreSQL Logical Replication is just great! It helps a lot
doing real time replication for analytical purposes without using any other
3d party service. Although all these years working as product architect of
reporting i have noted a few requirements which are always a challenge and
may help enhance logical replication even better.

To the point:
PostgreSQL14 Logical Replication allows replication of a table to another
table that exists in another database or even in another host. It also
allows multiple upstream tables using the same structure to downstream into
a single table.
*CREATE PUBLICATION pb_test FOR TABLE test*

PostgreSQL15 Logical Replication allows even better replication options,
like selecting subsets of the columns from publisher tables. It also
supports plenty of options like disable_on_error etc.
*CREATE PUBLICATION pb_test FOR TABLE test ("id", "name")*

What does not support is the option for defining custom column expressions,
as keys or values, into the upstream (publication). This will give more
flexibility into making replication from multiple upstreams into less
downstreams adding more logic. For instance, in a project for analytical
purposes there is the need to consolidate data from multiple databases into
one and at the same time keep the origin of each replicated data
identified by a tenanant_id column. In this case we also need the ability
to define the new column as an additional key which will participate into
the destination table.

Tenant 1 table
id serial pk
description varchar

Tenant 2 table
id integer pk
description varchar

Group table
tenant integer pk
id integer pk
description varchar

Possible syntax to archive that
*CREATE PUBLICATION pb_test FOR TABLE test ({value:datatype:iskey:alias}
,"id", "name")*

Example
*CREATE PUBLICATION pb_test FOR TABLE test ({1:integer:true:tenant} ,"id",
"name")*

I suppose the column definition should exist in the publication syntax as
the publication should know from before the datatype and if is a key before
being consumed by a subscriber which may already have the column.

So making an insert or update or delete statement into a tenant 1 database:
INSERT INTO test (id, description) VALUES (5, 'data')
UPDATE test SET description = 'data' WHERE id = 5
DELETE FROM test WHERE id = 5
Will be reflected into subscriber as the following
INSERT INTO test (tenant, id, description) VALUES (1, 5, 'data')
UPDATE test SET description = 'data' WHERE tenant=1 AND id = 5
DELETE FROM test WHERE tenant=1 AND id = 5

For more clarifications please reach me at koureasstavros(at)gmail(dot)com
Thanks!

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andrew Dunstan 2022-11-18 13:50:47 Re: Fix proposal for comparaison bugs in PostgreSQL::Version
Previous Message Masahiko Sawada 2022-11-18 13:20:10 Re: [PoC] Improve dead tuple storage for lazy vacuum