Re: long-standing data loss bug in initial sync of logical replication

From: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: long-standing data loss bug in initial sync of logical replication
Date: 2024-06-24 14:36:04
Message-ID: 571f7387-edfa-4733-a335-ccce7ce01574@enterprisedb.com
Lists: pgsql-hackers

On 6/24/24 12:54, Amit Kapila wrote:
> ...
>>
>>>> I'm not sure there are any cases where using SRE instead of AE would cause
>>>> problems for logical decoding, but it seems very hard to prove. I'd be very
>>>> surprised if just using SRE would not lead to corrupted cache contents in some
>>>> situations. The cases where a lower lock level is ok are ones where we just
>>>> don't care that the cache is coherent in that moment.
>>
>>> Are you saying it might break cases that are not corrupted now? How
>>> could obtaining a stronger lock have such effect?
>>
>> No, I mean that I don't know if using SRE instead of AE would have negative
>> consequences for logical decoding. I.e. whether, from a logical decoding POV,
>> it'd suffice to increase the lock level to just SRE instead of AE.
>>
>> Since I don't see how it'd be correct otherwise, it's kind of a moot question.
>>
>
> We lost track of this thread and the bug is still open. IIUC, the
> conclusion is to use SRE in OpenTableList() to fix the reported issue.
> Andres, Tomas, please let me know if my understanding is wrong,
> otherwise, let's proceed and fix this issue.
>

It's in the commitfest [https://commitfest.postgresql.org/48/4766/] so I
don't think we 'lost track' of it, but it's true we haven't made much
progress recently.

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
