Re: bogus: logical replication rows/cols combinations

From: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
To: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Pg Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: bogus: logical replication rows/cols combinations
Date: 2022-05-02 17:36:54
Message-ID: da8a10c1-25b6-29b5-19bd-649a9c9ad0be@enterprisedb.com
Lists: pgsql-hackers

On 5/2/22 13:44, Alvaro Herrera wrote:
> On 2022-May-02, Amit Kapila wrote:
>
>> We don't do that currently but we can as mentioned in my previous
>> email [1]. Let me write the relevant part again. We need to expose all
>> publications for a walsender, and then we can find the exact set of
>> publications where the current publication is used with other
>> publications and we can check only those publications. So, if we have
>> three walsenders (walsnd1: pub1, pub2; walsnd2 pub2; walsnd3: pub2,
>> pub3) in the system and we are currently altering publication pub1
>> then we need to check only pub3 for any conflicting conditions.
>
> Hmm ... so what happens in the current system, if you have a running
> walsender and modify the publication concurrently? Will the subscriber
> start getting the changes with the new publication definition, at some
> arbitrary point in the middle of their stream? If that's what we do,
> maybe we should have a signalling system which disconnects all
> walsenders using that publication, so that they can connect and receive
> the new definition.
>
> I don't see anything in the publication DDL that interacts with
> walsenders -- perhaps I'm overlooking something.
>

pgoutput.c relies on relcache callbacks to get notified of changes.
See the stuff that touches replicate_valid and publications_valid. So
the walsender should notice the changes immediately.
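
To illustrate the pattern, here is a minimal standalone sketch of the
"invalidate a flag, rebuild lazily on next use" approach. It is not the
pgoutput.c code: only the flag names replicate_valid and publications_valid
come from there, while RelEntry, alter_publication(), get_rel_entry() and the
version counters are hypothetical stand-ins for the catalog and the cache. It
also does not model when invalidation messages actually get delivered.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one cached per-relation entry. */
typedef struct RelEntry
{
    bool replicate_valid;     /* false => rebuild before next use */
    int  built_from_version;  /* publication definition it reflects */
} RelEntry;

#define N_ENTRIES 2

static RelEntry entries[N_ENTRIES];      /* zero-initialized: all invalid */
static bool publications_valid = false;  /* is the loaded list current? */
static int  loaded_pub_version = 0;      /* what the "walsender" has loaded */
static int  catalog_pub_version = 1;     /* what the "catalog" says now */

/* Invalidation callback: only flips flags, does no catalog work itself. */
static void publication_invalidation_cb(void)
{
    publications_valid = false;
    for (size_t i = 0; i < N_ENTRIES; i++)
        entries[i].replicate_valid = false;
}

/* Simulated ALTER PUBLICATION: change the catalog, then fire the callback. */
static void alter_publication(void)
{
    catalog_pub_version++;
    publication_invalidation_cb();
}

/* Lookup done for each decoded change: revalidate only if flagged stale. */
static const RelEntry *get_rel_entry(size_t idx)
{
    assert(idx < N_ENTRIES);

    if (!entries[idx].replicate_valid)
    {
        /* Reload the publication list at most once per invalidation. */
        if (!publications_valid)
        {
            loaded_pub_version = catalog_pub_version;
            publications_valid = true;
        }
        entries[idx].built_from_version = loaded_pub_version;
        entries[idx].replicate_valid = true;
    }
    return &entries[idx];
}
```

The point of the two-level flags is that the callback itself stays cheap, and
the (possibly expensive) reload happens only when the next change for that
relation is about to be sent.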

Maybe you have some particular case in mind, though?

>> I think it is possible to expose a list of publications for each
>> walsender as it is stored in each walsender's
>> LogicalDecodingContext->output_plugin_private. AFAIK, each walsender
>> can have one such LogicalDecodingContext and we can probably share it
>> via shared memory?
>
> I guess we need to create a DSM each time a walsender opens a
> connection, at START_REPLICATION time. Then ALTER PUBLICATION needs to
> connect to all DSMs of all running walsenders and see if they are
> reading from it. Is that what you have in mind? Alternatively, we
> could have one DSM per publication with a PID array of all walsenders
> that are sending it (each walsender needs to add its PID as it starts).
> The latter might be better.
>

I don't quite follow what we're trying to build here. The walsender
already knows which publications it works with - how else would
pgoutput.c know that? So the walsender should be able to validate that
the stuff it's supposed to replicate is OK.

Why would we need to know publications replicated by other walsenders?
And what if the subscriber is not connected at the moment? In that case
there'll be no walsender.

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
