Re: Incomplete freezing when truncating a relation during vacuum

From: Noah Misch <noah(at)leadboat(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Incomplete freezing when truncating a relation during vacuum
Date: 2013-11-30 05:40:06
Message-ID: 20131130054006.GA1108717@tornado.leadboat.com
Lists: pgsql-hackers

> > On Wed, Nov 27, 2013 at 02:14:53PM +0100, Andres Freund wrote:
> > > With regard to fixing things up, ISTM the best bet is heap_prune_chain()
> > > so far. That's executed by vacuum and by opportunistic pruning and we
> > > know we have the appropriate locks there. Looks relatively easy to fix
> > > up things there. Not sure if there are any possible routes to WAL log
> > > this but using log_newpage()?
> > > I am really not sure what the best course of action is :(

Based on subsequent thread discussion, the plan you outline sounds reasonable.
Here is a sketch of the specific semantics of that fixup. If a HEAPTUPLE_LIVE
tuple has XIDs older than the current relfrozenxid/relminmxid of its relation
or newer than ReadNewTransactionId()/ReadNextMultiXactId(), freeze those XIDs.
Do likewise for HEAPTUPLE_DELETE_IN_PROGRESS, ensuring a proper xmin if the
in-progress deleter aborts. Using log_newpage_buffer() seems fine; there's no
need to optimize performance there. (All the better if we can, with minimal
hacks, convince heap_freeze_tuple() itself to log the right changes.)
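
To make the range test concrete, here is a rough standalone sketch of the
per-XID check described above. It is not PostgreSQL code: xid_t,
FIRST_NORMAL_XID, xid_precedes() and xid_needs_fixup() are hypothetical
stand-ins, and the circular comparison only mimics the style of
TransactionIdPrecedes() for normal XIDs. It also assumes next_xid is the value
ReadNewTransactionId() would return and reads "newer than" strictly.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for PostgreSQL's TransactionId. */
typedef uint32_t xid_t;

/* Assumption: XIDs below this value are special (invalid, bootstrap, frozen). */
#define FIRST_NORMAL_XID ((xid_t) 3)

/* Modulo-2^32 comparison for normal XIDs: true if a is logically older than b. */
static bool
xid_precedes(xid_t a, xid_t b)
{
    return (int32_t) (a - b) < 0;
}

/*
 * Hypothetical helper: should this XID be frozen by the fixup?
 * It qualifies if it is older than the relation's relfrozenxid or
 * strictly newer than the next XID to be assigned.
 */
static bool
xid_needs_fixup(xid_t xid, xid_t relfrozenxid, xid_t next_xid)
{
    if (xid < FIRST_NORMAL_XID)
        return false;           /* special XID, nothing to fix */
    if (xid_precedes(xid, relfrozenxid))
        return true;            /* should have been frozen already */
    if (xid_precedes(next_xid, xid))
        return true;            /* appears to come from the future */
    return false;               /* within the valid window */
}
```

The same test would apply to multixact IDs against relminmxid and
ReadNextMultiXactId(); the sketch leaves that out, along with the
HEAPTUPLE_DELETE_IN_PROGRESS handling and WAL logging.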

I am wary about the performance loss of doing these checks in every
heap_prune_chain() call, for all time. If it's measurable, can we shed
the overhead once corrections are done? Maybe bump the page layout version
and skip the checks for v5 pages? (Ugh.)

Time is tight to finalize this, but it would be best to get this into next
week's release. That way, the announcement, fix, and mitigating code
pertaining to this data loss bug all land in the same release. If necessary,
I think it would be worth delaying the release, or issuing a new release a
week or two later, to closely align those events. That being said, I'm
prepared to review a patch in this area over the weekend.

--
Noah Misch
EnterpriseDB http://www.enterprisedb.com
