Re: buildfarm: could not read block 3 in file "base/16384/2662": read only 0 of 8192 bytes

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Peter Geoghegan <pg(at)bowt(dot)ie>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: buildfarm: could not read block 3 in file "base/16384/2662": read only 0 of 8192 bytes
Date: 2018-08-29 21:58:19
Message-ID: 25024.1535579899@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

I wrote:
> We could perhaps fix this with a less invasive change than what you
> suggest here, by attacking the missed-call-due-to-recursion aspect
> rather than monkeying with how relcache rebuild itself works.

Seeing that rearranging the relcache rebuild logic is looking less than
trivial, I thought it'd be a good idea to investigate this alternative
approach.

Essentially, the problem here is that lmgr.c supposes that
AcceptInvalidationMessages is an atomic operation, which it most
certainly isn't. In reality, it's unsafe to suppose that we can skip
reading incoming inval messages until we have *completed*
AcceptInvalidationMessages, not just invoked it.

However, we can fix that very easily, by retaining one additional bit
of state in each LOCALLOCK entry, which records whether we know we have
completed AcceptInvalidationMessages at least once subsequent to obtaining
that lock. In the attached draft patch, I refer to that state as having
"cleared" the lock, but I'm not super pleased with that terminology ---
anybody have a better idea?

A small problem with the lock.c API as it stands is that we'd have to do
an additional hashtable lookup to re-find the relevant LOCALLOCK entry.
In the attached, I fixed that by extending LockAcquireExtended's API
to allow the caller to obtain a pointer to the LOCALLOCK entry, making
it trivially cheap to set the flag after we finish
AcceptInvalidationMessages. LockAcquireExtended has only one external
caller right now, which makes me think it's OK to change its API rather
than introduce YA variant entry point, but that could be argued.

There are two ways we could adjust the return value from
LockAcquire(Extended) to cope with this requirement. What I did here
was to add an additional LockAcquireResult code, so that "already held"
can be distinguished from "already held and cleared". But we don't
really need to make that distinction, at least not in the core code.
So another way would be to redefine LOCKACQUIRE_ALREADY_HELD as meaning
"held and cleared", and then just return LOCKACQUIRE_OK if you haven't
called MarkLockClear for the lock. I'm not entirely sure which way is
less likely to break any third-party code that might be inspecting the
result of LockAcquire. You could probably argue it either way depending
on what you think the third-party code is doing with the result.

Anyway, the attached appears to fix the problem: Andres' test script
fails in fractions of a second with HEAD on my workstation, but it's
run error-free for quite awhile now with this patch. And this is sure
a lot simpler than any relcache rebuild refactoring we're likely to come
up with.

Discuss.

regards, tom lane

PS: this also fixes what seems to me to be a buglet in the fast-path
locks stuff: there are failure paths out of LockAcquire that don't
remove the failed LOCALLOCK entry. Not any more.

Attachment Content-Type Size
fix-missed-inval-msg-accepts-1.patch text/x-diff 10.5 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Geoghegan 2018-08-29 22:15:20 Re: "Write amplification" is made worse by "getting tired" while inserting into nbtree secondary indexes (Was: Why B-Tree suffix truncation matters)
Previous Message Thomas Munro 2018-08-29 21:41:42 Re: Use C99 designated initializers for some structs