Re: Binary support for pgoutput plugin

From: Petr Jelinek <petr(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, Dave Cramer <davecramer(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Daniel Gustafsson <daniel(at)yesql(dot)se>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Petr Jelinek <petr(dot)jelinek(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Binary support for pgoutput plugin
Date: 2020-07-20 18:55:07
Message-ID: 1f3a3b7d-3ea2-630c-1b99-368df3fdecdf@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 20/07/2020 17:51, Tom Lane wrote:
> Peter Geoghegan <pg(at)bowt(dot)ie> writes:
>> Skink's latest run reports a failure that I surmise was caused by this patch:
>
> Yeah, I've just been digging through that. The patch didn't create
> the bug, but it allowed valgrind to detect it, because the column
> status array is now "just big enough" rather than being always
> MaxTupleAttributeNumber entries. To wit, the problem is that the
> code in apply_handle_update that computes target_rte->updatedCols
> is junk.
>
> The immediate issue is that it fails to apply the remote-to-local
> column number mapping, so that it's looking at the wrong colstatus
> entries, possibly including entries past the end of the array.
>
> I'm fixing that, but even after that, there's a semantic problem:
> LOGICALREP_COLUMN_UNCHANGED is just a weak optimization, cf the code
> that sends it, in proto.c around line 480. colstatus will often *not*
> be that for columns that were in fact not updated on the remote side.
> I wonder whether we need to take steps to improve that.
>

LOGICALREP_COLUMN_UNCHANGED is not trying to optimize anything, there is
certainly no effort made to not send columns that were not updated by
logical replication itself. It's just something we invented in order to
handle the fact that values for TOASTed columns that were not updated
are simply not visible to logical decoding (unless table has REPLICA
IDENTITY FULL) as they are not written to WAL nor accessible via
historic snapshot. So the output plugin simply does not see the real value.

--
Petr Jelinek
2ndQuadrant - PostgreSQL Solutions for the Enterprise
https://www.2ndQuadrant.com/

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2020-07-20 19:02:59 Re: Binary support for pgoutput plugin
Previous Message Andrey M. Borodin 2020-07-20 18:46:47 Re: Amcheck: do rightlink verification with lock coupling