Re: DropRelFileNodeBuffers API change (was Re: [BUGS] BUG #5599: Vacuum fails due to index corruption issues)

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: DropRelFileNodeBuffers API change (was Re: [BUGS] BUG #5599: Vacuum fails due to index corruption issues)
Date: 2010-08-16 01:25:54
Message-ID: 10048.1281921954@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> Oh, sorry. I was thinking we were talking about complete truncation
> rather than partial truncation. I'm still pretty unhappy with the
> proposed fix, though, because it gives up performance in a broad range
> of cases to cater to an extremely narrow failure case.

It doesn't matter: broken is broken, and failure to recover from a
truncate() error is broken. You mustn't think that this is a
Windows-only problem.

> Considering
> the rarity of the proposed problem, are we sure that it isn't better
> to adopt a solution like what Heikki proposed? If truncation fails,
> try to zero the pages; if that also fails, PANIC.

... and PANIC is absolutely, entirely, 100% unacceptable here. I don't
think you understand the context. We've already written the truncate
action to WAL (as we must: WAL before data change). If we PANIC, that
means we'll PANIC again during WAL replay. So that solution means a
down, and perhaps unrecoverably broken, database. There's also the
little problem that "zero the page" is only a safe recovery action for
heap pages not index pages; smgrtruncate is certainly not entitled to
assume its working on a heap.

I think the performance argument is based on fear not facts anyway.
In practice, in modern installations, you're only going to be talking
about marginally more work in autovacuum. ISTM we should fix the bug,
in a simple/reliable/backpatchable way, and then anyone who's worried
about performance can measure the size of the problem, and try to
think of a workable solution if he doesn't like the results.

regards, tom lane

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Robert Haas 2010-08-16 01:58:13 Re: DropRelFileNodeBuffers API change (was Re: [BUGS] BUG #5599: Vacuum fails due to index corruption issues)
Previous Message Tom Lane 2010-08-16 01:12:48 Re: Unable to create serial column even with permissions

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2010-08-16 01:58:13 Re: DropRelFileNodeBuffers API change (was Re: [BUGS] BUG #5599: Vacuum fails due to index corruption issues)
Previous Message Robert Haas 2010-08-16 01:13:46 Re: patch: utf8_to_unicode (trivial)