Re: Test slots invalidations in 035_standby_logical_decoding.pl only if dead rows are removed

From:	Alexander Lakhin <exclusion(at)gmail(dot)com>
To:	Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>
Cc:	"Yu Shi (Fujitsu)" <shiy(dot)fnst(at)fujitsu(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject:	Re: Test slots invalidations in 035_standby_logical_decoding.pl only if dead rows are removed
Date:	2024-01-12 11:00:01
Message-ID:	cc7925b8-30cc-c76d-b1b6-c9ec6bd36a03@gmail.com
Lists:	pgsql-hackers

Hi,

12.01.2024 10:15, Bertrand Drouvot wrote:
>
> For this one, the "good" news is that it looks like we no longer see a
> "terminating" message that is not followed by an "obsolete" message (so the
> engine behaves correctly).
>
> There is simply nothing related to the row_removal_activeslot at all (the catalog_xmin
> advanced and there is no conflict).

Yes, judging from all the failures that we see now, it looks like the
0001-Fix-race-condition...patch works as expected.

> And I agree that this is due to the Standby/RUNNING_XACTS that is "advancing" the
> catalog_xmin of the active slot.
>
>> Standby/RUNNING_XACTS is exactly why 039_end_of_wal.pl uses wal_level
>> = minimal, because these lead to unpredictable records being inserted,
>> impacting the reliability of the tests. We cannot do that here,
>> obviously. That may be a long shot, but could it be possible to tweak
>> the test with a retry logic, retrying things if such a standby
>> snapshot is found because we know that the invalidation is not going
>> to work anyway?
> I think it all depends on what the xl_running_xacts record contains (that is,
> whether or not it "advances" the catalog_xmin in our case).
>
> In our case it does advance it (should it occur) due to the "select txid_current()"
> that is done in wait_until_vacuum_can_remove() in 035_standby_logical_decoding.pl.
>
> I suggest making use of txid_current_snapshot() instead (which does not produce
> a Transaction/COMMIT WAL record, as opposed to txid_current()).
>
> I think that it could be "enough" for our case here, and it's what v5 attached is
> now doing.
>
> Let's give v5 a try? (please apply v1-0001-Fix-race-condition-in-InvalidatePossiblyObsoleteS.patch
> too).

Unfortunately, I've got the failure again (please see logs attached).
(_primary.log can confirm that I have used exactly v5 — I see no
txid_current() calls there...)

Best regards,
Alexander

Attachment	Content-Type	Size
035-failures-vacuum-pg_authid.tar.gz	application/gzip	150.7 KB

In response to

Re: Test slots invalidations in 035_standby_logical_decoding.pl only if dead rows are removed at 2024-01-12 07:15:44 from Bertrand Drouvot

Responses

Re: Test slots invalidations in 035_standby_logical_decoding.pl only if dead rows are removed at 2024-01-12 13:46:08 from Bertrand Drouvot
