Re: Report replica identity in pg_publication_tables

From: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Report replica identity in pg_publication_tables
Date: 2025-06-30 12:08:44
Message-ID: CAExHW5tbhntRpwM7Zz3CfTippVovJ_ReBt2DGyFPE8pOsDsTjw@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Jun 30, 2025 at 5:17 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Mon, Jun 30, 2025 at 3:44 PM Ashutosh Bapat
> <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com> wrote:
> >
> > Hi All,
> >
> > The commit message in the patch says it all, but let me repeat it here.
> >
>
> You forgot to attach the patch.

Sorry. Here it is
>
> > When debugging issues with logical replication, replica identity
> > property of tables in publication is often useful, for example, to
> > determine the amount of data logged for an UPDATE or DELETE on a given
> > table.
> >
>
> I think it can help to determine what is logged for the DELETE or
> UPDATE operation, but not the exact amount of data. Can you please
> explain with an example how such information can help with debugging?

No. The change itself won't tell the amount of data that will be
logged. But given a publication it will tell what all tables being
published by that publication are using replica identity full - which
causes more columns to be logged compared to replica identity default
or index.

>
> > Given a set of publications that a WAL sender is using,
> > pg_publication_tables can be used to get the list of tables whose
> > changes will be replicated including the columns of those tables and
> > row
> > filters. But the replica identity of those tables needs to be
> > separately found out by querying pg_class or joining pg_class with
> > pg_publication_tables on schemaname and relname. Adding the replica
> > identity column to pg_publication_tables avoids this extra step.
> >
> > The replica identity of a given table is not a property of
> > publication, per say, so it's arguable whether it should be included
> > in pg_publication_tables or not.
> >
>
> Right, discussing the use case a bit more might help us to find if
> this is the right place to add 'replica identity' information.

In our case, we had multiple replication slots, each with a different
publication. One of the slots was lagging because of the amount of
data being sent through the slot. For that we wanted to know which
tables are being published through the corresponding publication and
what's the replica identity of each of the tables.
pg_publication_tables gave us the list of tables but in order to get
its replica identity we needed to join it with pg_class again.
pg_publication_tables already joins pg_class. Exposing replica
identity through pg_publication_tables makes it more convenient to get
all the information related to a tables replication through that
publication in a single line without much code change or run time
cost.

--
Best Wishes,
Ashutosh Bapat

Attachment Content-Type Size
0001-Report-replica-identity-property-of-tables--20250630.patch text/x-patch 17.5 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Hayato Kuroda (Fujitsu) 2025-06-30 12:11:30 RE: pg_logical_slot_get_changes waits continously for a partial WAL record spanning across 2 pages
Previous Message Antonin Houska 2025-06-30 12:03:38 Re: A concurrent VACUUM FULL?