Re: Column Filtering in Logical Replication

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
Cc: "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, Rahila Syed <rahilasyed90(at)gmail(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, "shiy(dot)fnst(at)fujitsu(dot)com" <shiy(dot)fnst(at)fujitsu(dot)com>
Subject: Re: Column Filtering in Logical Replication
Date: 2022-03-12 04:30:14
Message-ID: CAA4eK1KUpOnFam+2bBuGGX-_pZbmO_ShJV6G=GT-RyM7gQSvWg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Mar 11, 2022 at 6:20 PM Tomas Vondra
<tomas(dot)vondra(at)enterprisedb(dot)com> wrote:
>
> On 3/11/22 10:52, Amit Kapila wrote:
> > On Fri, Mar 11, 2022 at 7:26 AM Tomas Vondra
> > <tomas(dot)vondra(at)enterprisedb(dot)com> wrote:
> >>
> >> On 3/10/22 20:10, Tomas Vondra wrote:
> >>>
> >>>
> >>> FWIW I think the reason is pretty simple - pgoutput_row_filter_init is
> >>> broken. It assumes you can just do this
> >>>
> >>> rftuple = SearchSysCache2(PUBLICATIONRELMAP,
> >>> ObjectIdGetDatum(entry->publish_as_relid),
> >>> ObjectIdGetDatum(pub->oid));
> >>>
> >>> if (HeapTupleIsValid(rftuple))
> >>> {
> >>> /* Null indicates no filter. */
> >>> rfdatum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple,
> >>> Anum_pg_publication_rel_prqual,
> >>> &pub_no_filter);
> >>> }
> >>> else
> >>> {
> >>> pub_no_filter = true;
> >>> }
> >>>
> >>>
> >>> and pub_no_filter=true means there's no filter at all. Which is
> >>> nonsense, because we're using publish_as_relid here - the publication
> >>> may not include this particular ancestor, in which case we need to just
> >>> ignore this publication.
> >>>
> >>> So yeah, this needs to be reworked.
> >>>
> >>
> >> I spent a bit of time looking at this, and I think a minor change in
> >> get_rel_sync_entry() fixes this - it's enough to ensure rel_publications
> >> only includes publications that actually include publish_as_relid.
> >>
> >
> > Thanks for looking into this. I think in the first patch before
> > calling get_partition_ancestors() we need to ensure it is a partition
> > (the call expects that) and pubviaroot is true.
>
> Does the call really require that?
>

There may not be any harm but I have mentioned it because (a) the
comments atop get_partition_ancestors(...it should only be called when
it is known that the relation is a partition.) indicates the same; (b)
all existing callers seems to use it only for partitions.

> Also, I'm not sure why we'd need to
> look at pubviaroot - that's already considered earlier when calculating
> publish_as_relid, here we just need to know the relationship of the two
> OIDs (if one is ancestor/child of the other).
>

I thought of avoiding calling get_partition_ancestors when pubviaroot
is not set. It will unnecessary check the whole hierarchy for
partitions even when it is not required. I agree that this is not a
common code path but still felt why do it needlessly?

> > I think it would be
> > good if we can avoid an additional call to get_partition_ancestors()
> > as it could be costly.
>
> Maybe. OTOH we only should do this only very rarely anyway.
>
> > I wonder why it is not sufficient to ensure
> > that publish_as_relid exists after ancestor in ancestors list before
> > assigning the ancestor to publish_as_relid? This only needs to be done
> > in case of (if (!publish)). I have not tried this so I could be wrong.
> >
>
> I'm not sure what exactly are you proposing. Maybe try coding it? That's
> probably faster than trying to describe what the code might do ...
>

Okay, please find attached. I have done basic testing of this, if we
agree with this approach then this will require some more testing.

--
With Regards,
Amit Kapila.

Attachment Content-Type Size
v1-0001-fixup-publish_as_relid.patch application/octet-stream 6.1 KB
v1-0002-fixup-row-filter-publications.patch application/octet-stream 1.9 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message vignesh C 2022-03-12 05:16:19 Re: Handle infinite recursion in logical replication setup
Previous Message Alvaro Herrera 2022-03-12 03:53:47 Re: support for MERGE