Re: Flaky vacuum truncate test in reloptions.sql

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: Arseny Sher <a(dot)sher(at)postgrespro(dot)ru>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Flaky vacuum truncate test in reloptions.sql
Date: 2021-03-30 07:12:19
Message-ID: YGLOWsO2bV7KLvh7@paquier.xyz
Lists: pgsql-hackers

On Tue, Mar 30, 2021 at 01:58:50AM +0300, Arseny Sher wrote:
> Intimate reading of lazy_scan_heap says that the failure indeed might
> happen; if ConditionalLockBufferForCleanup couldn't lock the buffer and
> either the buffer doesn't need freezing or vacuum is not aggressive, we
> don't insist on close inspection of the page contents and count it as
> nonempty according to lazy_check_needs_freeze. It means the page is
> regarded as such even if it contains only garbage (but occupied) ItemIds,
> which is the case of the test. And of course this allegedly nonempty
> page prevents the truncation. Obvious competitors for the page are
> bgwriter/checkpointer; the chances of a simultaneous attack are small
> but they exist.

Yep, this is the same problem as the one discussed for c2dc1a7, where
a concurrent checkpoint may cause a page to be skipped, breaking the
test.

> I'm a bit puzzled that I've ever seen this only when running regression
> tests under our multimaster. While multimaster contains a fair amount of
> C code, I don't see how any of it can interfere with the vacuuming
> business here. I can't say I did my best to create the reproduction
> though -- the explanation above seems to be enough.

Why not just use DISABLE_PAGE_SKIPPING here instead?
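
As a rough, untested sketch of what that could look like in the test:
the table name reloptions_test is an assumption about what
reloptions.sql uses, and adding FREEZE to make the vacuum aggressive is
likewise an assumption rather than something settled in this thread.
Whether this fully closes the race depends on the lazy_scan_heap path
quoted above.

```sql
-- Hypothetical variant of the truncation check. The table name
-- reloptions_test is assumed, not taken from this message.
-- FREEZE makes the vacuum aggressive; DISABLE_PAGE_SKIPPING stops it
-- from skipping pages based on the visibility map.
VACUUM (FREEZE, DISABLE_PAGE_SKIPPING) reloptions_test;

-- The test expects the relation to have been truncated to zero size.
SELECT pg_relation_size('reloptions_test') = 0;
```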
--
Michael
