Re: Logical Replication Custom Column Expression

From: Stavros Koureas <koureasstavros(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Logical Replication Custom Column Expression
Date: 2022-11-22 12:52:34
Message-ID: CA+O1jk5aB0mBiJteoaQgR=SsezmBqscFb7FU8EjbY+nJ0jbzSg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Sure, this can be implemented as a subscription option, and it will cover
this use case scenario as each subscriber points only to one database.
I also have some more analytical/reporting use-cases which need additions
in logical-replication, I am not sure if I need to open
different discussions for each one, all ideas are for
publication/subscription.

Στις Τρί 22 Νοε 2022 στις 2:22 μ.μ., ο/η Amit Kapila <
amit(dot)kapila16(at)gmail(dot)com> έγραψε:

> On Mon, Nov 21, 2022 at 5:05 PM Ashutosh Bapat
> <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com> wrote:
> >
> > On Sat, Nov 19, 2022 at 6:47 PM Stavros Koureas
> > <koureasstavros(at)gmail(dot)com> wrote:
> > >
> > > What does not support is the option for defining custom column
> expressions, as keys or values, into the upstream (publication). This will
> give more flexibility into making replication from multiple upstreams into
> less downstreams adding more logic. For instance, in a project for
> analytical purposes there is the need to consolidate data from multiple
> databases into one and at the same time keep the origin of each replicated
> data identified by a tenanant_id column. In this case we also need the
> ability to define the new column as an additional key which will
> participate into the destination table.
> > >
> > > Tenant 1 table
> > > id serial pk
> > > description varchar
> > >
> > > Tenant 2 table
> > > id integer pk
> > > description varchar
> > >
> > > Group table
> > > tenant integer pk
> > > id integer pk
> > > description varchar
> > >
> > > Possible syntax to archive that
> > > CREATE PUBLICATION pb_test FOR TABLE test
> ({value:datatype:iskey:alias} ,"id", "name")
> > >
> > > Example
> > > CREATE PUBLICATION pb_test FOR TABLE test ({1:integer:true:tenant}
> ,"id", "name")
> >
> > I think that's a valid usecase.
> >
> > This looks more like a subscription option to me. In multi-subscriber
> > multi-publisher scenarios, on one subscriber a given upstream may be
> > tenant 1 but on some other it could be 2. But I don't think we allow
> > specifying subscription options for a single table. AFAIU, the origin
> > ids are available as part of the commit record which contained this
> > change; that's how conflict resolution is supposed to know it. So
> > somehow the subscriber will need to fetch those from there and set the
> > tenant.
> >
>
> Yeah, to me also it appears that we can handle it on the subscriber
> side. We have the provision of sending origin information in proto.c.
> But note that by default publishers won't have any origin associated
> with change unless someone has defined it. I think this work needs
> more thought but sounds to be an interesting feature.
>
> --
> With Regards,
> Amit Kapila.
>

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Ashutosh Bapat 2022-11-22 13:10:01 Re: postgres_fdw binary protocol support
Previous Message houzj.fnst@fujitsu.com 2022-11-22 12:42:24 RE: Perform streaming logical transactions by background workers and parallel apply