Re: Binary support for pgoutput plugin

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Petr Jelinek <petr(at)2ndquadrant(dot)com>
Cc: Peter Geoghegan <pg(at)bowt(dot)ie>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, Dave Cramer <davecramer(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Daniel Gustafsson <daniel(at)yesql(dot)se>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Petr Jelinek <petr(dot)jelinek(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Binary support for pgoutput plugin
Date: 2020-07-20 19:02:59
Message-ID: 641297.1595271779@sss.pgh.pa.us
Lists: pgsql-hackers

Petr Jelinek <petr(at)2ndquadrant(dot)com> writes:
> On 20/07/2020 17:51, Tom Lane wrote:
>> I'm fixing that, but even after that, there's a semantic problem:
>> LOGICALREP_COLUMN_UNCHANGED is just a weak optimization, cf the code
>> that sends it, in proto.c around line 480. colstatus will often *not*
>> be that for columns that were in fact not updated on the remote side.
>> I wonder whether we need to take steps to improve that.

> LOGICALREP_COLUMN_UNCHANGED is not trying to optimize anything, there is
> certainly no effort made to not send columns that were not updated by
> logical replication itself. It's just something we invented in order to
> handle the fact that values for TOASTed columns that were not updated
> are simply not visible to logical decoding (unless the table has REPLICA
> IDENTITY FULL) as they are not written to WAL nor accessible via
> historic snapshot. So the output plugin simply does not see the real value.

Hm. So the comment I added a couple days ago is wrong; can you propose
a better one?

However, be that as it may, we do have a provision in the protocol that
can handle marking columns unchanged. I'm thinking if we tried a bit
harder to identify unchanged columns on the sending side, we could both
fix this semantic deficiency for triggers and improve efficiency by
reducing transmission of unneeded data.
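
To make the idea concrete, here is a minimal sketch of sender-side detection of unchanged columns. It is not the code in proto.c. The helper name mark_unchanged_columns and its arguments are hypothetical, values are compared as C strings rather than with a real datum-equality check, and it assumes the old tuple is available at all, which in practice requires something like REPLICA IDENTITY FULL. The status bytes 'n', 'u' and 't' are the ones the protocol uses for null, unchanged and text-format columns.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Column status bytes as used by the logical replication protocol. */
#define LOGICALREP_COLUMN_NULL      'n'
#define LOGICALREP_COLUMN_UNCHANGED 'u'
#define LOGICALREP_COLUMN_TEXT      't'

/*
 * Hypothetical helper, for illustration only: fill status[i] for each of
 * ncols columns by comparing the text form of the old and new values.
 * A NULL pointer stands for SQL NULL.  oldvals may itself be NULL when no
 * old tuple is available, in which case nothing is marked unchanged.
 */
static void
mark_unchanged_columns(const char *const *oldvals,
                       const char *const *newvals,
                       size_t ncols,
                       char *status)
{
    assert(status != NULL || ncols == 0);

    for (size_t i = 0; i < ncols; i++)
    {
        const char *oldval = (oldvals != NULL) ? oldvals[i] : NULL;
        const char *newval = newvals[i];

        if (newval == NULL)
            status[i] = LOGICALREP_COLUMN_NULL;       /* NULL is cheap to send anyway */
        else if (oldval != NULL && strcmp(oldval, newval) == 0)
            status[i] = LOGICALREP_COLUMN_UNCHANGED;  /* value need not be sent */
        else
            status[i] = LOGICALREP_COLUMN_TEXT;       /* changed, or no old value to compare */
    }
}
```

With something along these lines, only columns marked 't' would carry data on the wire, and the receiving side could tell which columns an UPDATE really touched.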

regards, tom lane
