Re: amcheck (B-Tree integrity checking tool)

From: Andres Freund <andres(at)anarazel(dot)de>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Peter Geoghegan <pg(at)bowt(dot)ie>, Kevin Grittner <kgrittn(at)gmail(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Anastasia Lubennikova <a(dot)lubennikova(at)postgrespro(dot)ru>
Subject: Re: amcheck (B-Tree integrity checking tool)
Date: 2017-02-09 22:32:07
Message-ID: 20170209223207.z5cw3xi6t3aj6xwi@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2017-02-09 17:12:58 -0500, Robert Haas wrote:
> On Thu, Feb 9, 2017 at 4:57 PM, Andres Freund <andres(at)anarazel(dot)de> wrote:
> > Meh. I don't think that's a serious problem, nor is without precedent
> > (say same toplevel DDL with differing lock levels depending on
> > options). Nor do the function names confer that any more efficiently
> > than parameters.
>
> Really? check_btree(blah, true) vs. check_btree(blah, false) is no
> more or less clear than check_btree_for_plausible_errors(blah) vs.
> check_btree_in_unreasonable_detail(blah)?

Except that the proposed names aren't remotely like that... ;).

And I proposed documenting named parameters, and
check_btree(performing_check_requiring_exclusive_locks => true) is just
about as expressive.

> > Again, some parts of the code doing something bad isn't a good argument
> > for doing again. Releasing locks early is a bad pattern, because backend
> > code isn't generally safe against it, we have special code-paths for
> > catalog tables to support it for those.
>
> I don't agree with that. The reason we keep relation locks until the
> end of the transaction is to avoid surprising application code, not
> because the system can't tolerate it. But here it's more likely that
> retaining the lock will surprise the user.

It's both. A bunch of code paths rely on early release ony happening on
catalogs. E.g., and that's just the first thing that comes to mind,
cache invalidations which really only are guaranteed to be delivered and
visibile for other cache accesses, if the lock is only released after
the end of the transaction. Object accesses like relation_open() accept
invalidations *after* acquiring the lock. Combined with lock releases
only happening *after* transaction commits that guarantees an up2date
view of the cache; but if others release the lock earlier and have cache
invalidations, that doesn't work anymore. Now you can argue that it's
possibly harmless here because there's presumably no way this can ever
trigger cache invals - and you're probably right - but this isn't the
only place with such assumption and you'd definitely have to argue *why*
it's right.

There were other issues than this, but right of the top of my head I
only remember:
double
IndexBuildHeapRangeScan(Relation heapRelation,
..
/*
* Since caller should hold ShareLock or better, normally
* the only way to see this is if it was inserted earlier
* in our own transaction. However, it can happen in
* system catalogs, since we tend to release write lock
* before commit there. Give a warning if neither case
* applies.
*/
xwait = HeapTupleHeaderGetXmin(heapTuple->t_data);
if (!TransactionIdIsCurrentTransactionId(xwait))
{
if (!is_system_catalog)
elog(WARNING, "concurrent insert in progress within table \"%s\"",
RelationGetRelationName(heapRelation));

which isn't an issue here, but reinforces my point about the (badly
documented) assumption that we don't release locks on user relations
early.

Andres

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Geoghegan 2017-02-09 22:47:39 Re: amcheck (B-Tree integrity checking tool)
Previous Message Robert Haas 2017-02-09 22:31:29 Re: Parallel tuplesort (for parallel B-Tree index creation)