Re: Column Filtering in Logical Replication

From: Ibrar Ahmed <ibrar(dot)ahmad(at)gmail(dot)com>
To: Rahila Syed <rahilasyed90(at)gmail(dot)com>
Cc: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Column Filtering in Logical Replication
Date: 2021-07-19 10:20:32
Message-ID: CALtqXTeKbzT16ouuiKCDxWM8PWGaSpKnU3MM_5CCyXKGkW-bkQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Jul 13, 2021 at 7:44 PM Rahila Syed <rahilasyed90(at)gmail(dot)com> wrote:

>
> Hi Tomas,
>
> Thank you for your comments.
>
>
>>
>> >
>> > Currently, this capability is not included in the patch. If the table on
>> > the subscriber
>> > server has lesser attributes than that on the publisher server, it
>> > throws an error at the
>> > time of CREATE SUBSCRIPTION.
>> >
>>
>> That's a bit surprising, to be honest. I do understand the patch simply
>> treats the filtered columns as "unchanged" because that's the simplest
>> way to filter the *data* of the columns. But if someone told me we can
>> "filter columns" I'd expect this to work without the columns on the
>> subscriber.
>>
>> OK, I will look into adding this.
>
>
>>
>> > However, need to carefully consider situations in which a server
>> > subscribes to multiple
>> > publications, each publishing a different subset of columns of a
>> table.
>
> Isn't that pretty much the same situation as for multiple subscriptions
>> each with a different set of I/U/D operations? IIRC we simply merge
>> those, so why not to do the same thing here and merge the attributes?
>>
>>
> Yeah, I agree with the solution to merge the attributes, similar to how
> operations are merged. My concern was also from an implementation point
> of view, will it be a very drastic change. I now had a look at how remote
> relation
> attributes are acquired for comparison with local attributes at the
> subscriber.
> It seems that the publisher will need to send the information about the
> filtered columns
> for each publication specified during CREATE SUBSCRIPTION.
> This will be read at the subscriber side which in turn updates its cache
> accordingly.
> Currently, the subscriber expects all attributes of a published relation
> to be present.
> I will add code for this in the next version of the patch.
>
> To nitpick, I find "Bitmapset *att_list" a bit annoying, because it's
>
> not really a list ;-)
>
>
> I will make this change with the next version
>
>
>
>> FWIW "make check" fails for me with this version, due to segfault in
>> OpenTableLists. Apparenly there's some confusion - the code expects the
>> list to contain PublicationTable nodes, and tries to extract the
>> RangeVar from the elements. But the list actually contains RangeVar, so
>> this crashes and burns. See the attached backtrace.
>>
>>
> Thank you for the report, This is fixed in the attached version, now all
> publication
> function calls accept the PublicationTableInfo list.
>
> Thank you,
> Rahila Syed
>
>
>

The patch does not apply, and an rebase is required

Hunk #8 succeeded at 1259 (offset 99 lines).
Hunk #9 succeeded at 1360 (offset 99 lines).
1 out of 9 hunks FAILED -- saving rejects to file
src/backend/replication/pgoutput/pgoutput.c.rej
patching file src/include/catalog/pg_publication.h

Changing the status to "Waiting on Author"

--
Ibrar Ahmed

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Ibrar Ahmed 2021-07-19 10:34:56 Re: shared-memory based stats collector
Previous Message Ibrar Ahmed 2021-07-19 10:17:43 Re: Re[3]: On login trigger: take three