Re: Column Filtering in Logical Replication

From: Rahila Syed <rahilasyed90(at)gmail(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
Cc: Peter Smith <smithpb2250(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Column Filtering in Logical Replication
Date: 2021-07-13 14:43:44
Message-ID: CAH2L28sY8031_udZBf6nTJwY01KPnSdPvWrPYWsE4c2=2RN8CQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi Tomas,

Thank you for your comments.

>
> >
> > Currently, this capability is not included in the patch. If the table on
> > the subscriber
> > server has lesser attributes than that on the publisher server, it
> > throws an error at the
> > time of CREATE SUBSCRIPTION.
> >
>
> That's a bit surprising, to be honest. I do understand the patch simply
> treats the filtered columns as "unchanged" because that's the simplest
> way to filter the *data* of the columns. But if someone told me we can
> "filter columns" I'd expect this to work without the columns on the
> subscriber.
>
> OK, I will look into adding this.

>
> > However, need to carefully consider situations in which a server
> > subscribes to multiple
> > publications, each publishing a different subset of columns of a table.

Isn't that pretty much the same situation as for multiple subscriptions
> each with a different set of I/U/D operations? IIRC we simply merge
> those, so why not to do the same thing here and merge the attributes?
>
>
Yeah, I agree with the solution to merge the attributes, similar to how
operations are merged. My concern was also from an implementation point
of view, will it be a very drastic change. I now had a look at how remote
relation
attributes are acquired for comparison with local attributes at the
subscriber.
It seems that the publisher will need to send the information about the
filtered columns
for each publication specified during CREATE SUBSCRIPTION.
This will be read at the subscriber side which in turn updates its cache
accordingly.
Currently, the subscriber expects all attributes of a published relation to
be present.
I will add code for this in the next version of the patch.

To nitpick, I find "Bitmapset *att_list" a bit annoying, because it's

not really a list ;-)

I will make this change with the next version

> FWIW "make check" fails for me with this version, due to segfault in
> OpenTableLists. Apparenly there's some confusion - the code expects the
> list to contain PublicationTable nodes, and tries to extract the
> RangeVar from the elements. But the list actually contains RangeVar, so
> this crashes and burns. See the attached backtrace.
>
>
Thank you for the report, This is fixed in the attached version, now all
publication
function calls accept the PublicationTableInfo list.

Thank you,
Rahila Syed

Attachment Content-Type Size
v2-0001-Add-column-filtering-to-logical-replication.patch application/octet-stream 24.6 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Simon Riggs 2021-07-13 14:59:47 Re: Detecting File Damage & Inconsistencies
Previous Message vignesh C 2021-07-13 14:30:19 Re: Enhanced error message to include hint messages for redundant options error