Re: Column Filtering in Logical Replication

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
Cc: "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, Rahila Syed <rahilasyed90(at)gmail(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, "shiy(dot)fnst(at)fujitsu(dot)com" <shiy(dot)fnst(at)fujitsu(dot)com>
Subject: Re: Column Filtering in Logical Replication
Date: 2022-03-11 09:52:19
Message-ID: CAA4eK1J9UDnM-xN7viu4eO_DDL_PqMG9F6ZsskqEU7snujXi+A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Mar 11, 2022 at 7:26 AM Tomas Vondra
<tomas(dot)vondra(at)enterprisedb(dot)com> wrote:
>
> On 3/10/22 20:10, Tomas Vondra wrote:
> >
> >
> > FWIW I think the reason is pretty simple - pgoutput_row_filter_init is
> > broken. It assumes you can just do this
> >
> > rftuple = SearchSysCache2(PUBLICATIONRELMAP,
> > ObjectIdGetDatum(entry->publish_as_relid),
> > ObjectIdGetDatum(pub->oid));
> >
> > if (HeapTupleIsValid(rftuple))
> > {
> > /* Null indicates no filter. */
> > rfdatum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple,
> > Anum_pg_publication_rel_prqual,
> > &pub_no_filter);
> > }
> > else
> > {
> > pub_no_filter = true;
> > }
> >
> >
> > and pub_no_filter=true means there's no filter at all. Which is
> > nonsense, because we're using publish_as_relid here - the publication
> > may not include this particular ancestor, in which case we need to just
> > ignore this publication.
> >
> > So yeah, this needs to be reworked.
> >
>
> I spent a bit of time looking at this, and I think a minor change in
> get_rel_sync_entry() fixes this - it's enough to ensure rel_publications
> only includes publications that actually include publish_as_relid.
>

Thanks for looking into this. I think in the first patch before
calling get_partition_ancestors() we need to ensure it is a partition
(the call expects that) and pubviaroot is true. I think it would be
good if we can avoid an additional call to get_partition_ancestors()
as it could be costly. I wonder why it is not sufficient to ensure
that publish_as_relid exists after ancestor in ancestors list before
assigning the ancestor to publish_as_relid? This only needs to be done
in case of (if (!publish)). I have not tried this so I could be wrong.

--
With Regards,
Amit Kapila.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Julien Rouhaud 2022-03-11 10:04:28 Re: Report checkpoint progress with pg_stat_progress_checkpoint (was: Report checkpoint progress in server logs)
Previous Message Yura Sokolov 2022-03-11 09:34:32 Re: BufferAlloc: don't take two simultaneous locks