Re: BUG #18146: Rows reappearing in Tables after Auto-Vacuum Failure in PostgreSQL on Windows

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Alexander Lakhin <exclusion(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>, rootcause000(at)gmail(dot)com, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #18146: Rows reappearing in Tables after Auto-Vacuum Failure in PostgreSQL on Windows
Date: 2024-10-29 07:48:46
Message-ID: ZyCTXqPG_uGvLJtZ@paquier.xyz
Lists: pgsql-bugs

On Tue, Sep 10, 2024 at 12:43:16PM +1200, Thomas Munro wrote:
> Here is an experiment to try that out. The requirement to call
> smgrnblocks() beforehand is still slightly magical, but written in
> black and white. I guess it could use an assertion cross-check on the
> number of opened segments...

I was looking at what you have here, and the split with
smgrtruncatefrom() is elegant: the allocations in _mdfd_openseg(), for
_mdfd_segpath() and _fdvec_resize(), happen before entering the
critical section for the physical truncation.

+mdtruncate(SMgrRelation reln, ForkNumber forknum,
+ BlockNumber curnblk, BlockNumber nblocks)
+ * all! That step can't be done in a critical section.

Perhaps this should have an assertion on CritSectionCount == 0 to
enforce the rule.
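
To make that concrete, here is a minimal standalone sketch of the
pattern, not the patch itself.  CritSectionCount and the two macros
are local stand-ins that mimic the PostgreSQL ones, and
prepare_segments() and truncate_prepared() are hypothetical names
invented for illustration.  The step that may allocate asserts that
it runs outside a critical section; the step inside the critical
section only touches what was prepared.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Local stand-ins mimicking PostgreSQL's CritSectionCount and macros. */
static volatile unsigned int CritSectionCount = 0;

#define START_CRIT_SECTION()	(CritSectionCount++)
#define END_CRIT_SECTION() \
	do { \
		assert(CritSectionCount > 0); \
		CritSectionCount--; \
	} while (0)

/*
 * Hypothetical preparation step.  It allocates, so it must not run
 * inside a critical section: a failure there would become a PANIC.
 */
static int *
prepare_segments(int nsegs)
{
	int		   *segs;

	assert(CritSectionCount == 0);	/* the rule being enforced */

	segs = malloc(sizeof(int) * (size_t) nsegs);
	if (segs == NULL)
	{
		fprintf(stderr, "out of memory\n");
		exit(1);
	}
	for (int i = 0; i < nsegs; i++)
		segs[i] = i;			/* pretend each entry is an open segment */
	return segs;
}

/*
 * Hypothetical truncation step.  No allocation here: it only marks
 * the already-prepared segments past "keep" as dropped and returns
 * how many were dropped.
 */
static int
truncate_prepared(int *segs, int nsegs, int keep)
{
	int			dropped = 0;

	for (int i = nsegs - 1; i >= keep; i--)
	{
		segs[i] = -1;
		dropped++;
	}
	return dropped;
}
```

A caller would run prepare_segments() first, then wrap
truncate_prepared() between START_CRIT_SECTION() and
END_CRIT_SECTION().  Calling prepare_segments() inside the critical
section trips the assertion in an assert-enabled build.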

Don't you think that we'd better have a regression test on HEAD at
least? It should not be complicated. I can create one if you want,
perhaps for later if we want to catch the next minor release train.

> I don't actually know of such an extension myself. I suppose we could
> add a new member at the end called smgr_truncatefrom, and have
> smgrtruncatefrom() call that if it is non-NULL (md's case), and the
> existing smgr_truncate function pointer if it is NULL (i.e., some
> hypothetical external monkey-patching smgr replacement). Hypothetical
> forks of PostgreSQL might be more likely to have used this
> interception point, but wouldn't have quite the same ABI problem
> (they'd adjust their function when rebasing on a minor release, but
> they might also prefer if the old function prototype still worked, or
> maybe they'd have some version of this bug themselves and want to be
> able to fix it...).
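
The fallback described in the quoted paragraph could look roughly
like the standalone sketch below.  It is an illustration only:
f_smgr_sketch is a simplified, hypothetical callback table (the real
one has more members and its callbacks take more arguments), and
truncatefrom_dispatch() is an invented name for the caller.  The new
callback is used when it is set, and the old one otherwise.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned int BlockNumber;	/* simplified stand-in */

/*
 * Simplified, hypothetical callback table.  The new member is
 * appended last so that existing members keep their offsets.
 */
typedef struct f_smgr_sketch
{
	void		(*smgr_truncate) (BlockNumber nblocks);
	void		(*smgr_truncatefrom) (BlockNumber curnblk,
									  BlockNumber nblocks);
} f_smgr_sketch;

/* Records which callback ran last: 1 = old, 2 = new. */
static int	last_called = 0;

static void
old_truncate(BlockNumber nblocks)
{
	(void) nblocks;
	last_called = 1;
}

static void
new_truncatefrom(BlockNumber curnblk, BlockNumber nblocks)
{
	(void) curnblk;
	(void) nblocks;
	last_called = 2;
}

/* Prefer the new callback when present, else fall back to the old one. */
static void
truncatefrom_dispatch(const f_smgr_sketch *impl,
					  BlockNumber curnblk, BlockNumber nblocks)
{
	if (impl->smgr_truncatefrom != NULL)
		impl->smgr_truncatefrom(curnblk, nblocks);
	else
	{
		assert(impl->smgr_truncate != NULL);
		impl->smgr_truncate(nblocks);
	}
}
```

A table that leaves smgr_truncatefrom unset, as a replacement built
against the older struct layout might, still works through the old
pointer.  It does not get the pre-allocation behavior, though, so
such a replacement would remain exposed to the original bug.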

Making folks aware of the problem sounds sensible from here.  In
short, changing the signature of smgr_truncate() in a minor release
to fix what is a severe data corruption issue takes priority IMO.
--
Michael
