Re: Something flaky in the "relfilenode mapping" infrastructure

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Noah Misch <noah(at)leadboat(dot)com>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Something flaky in the "relfilenode mapping" infrastructure
Date: 2014-06-13 02:50:44
Message-ID: 724.1402627844@sss.pgh.pa.us
Lists: pgsql-hackers

Noah Misch <noah(at)leadboat(dot)com> writes:
> On Thu, Jun 12, 2014 at 02:44:10AM -0400, Tom Lane wrote:
>> Andres Freund <andres(at)2ndquadrant(dot)com> writes:
>>> On 2014-06-12 00:38:36 -0400, Noah Misch wrote:
>>>> http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=prairiedog&dt=2014-06-12%2000%3A17%3A07

>>> Hm. My guess is that it's just a 'harmless' concurrency issue. The
>>> test currently runs concurrently with others: I think what happens is
>>> that the table gets dropped in another test after the query has
>>> acquired the MVCC snapshot (used for the pg_class test).
>>> But why is it triggering on such an 'unusual' system and not on others?
>>> That's what worries me a bit.

> I can reproduce a similar disturbance in the test query using gdb and a
> concurrent table drop, and the table reported in the prairiedog failure is a
> table dropped in a concurrent test group. That explanation may not be the
> full story behind these particular failures, but it certainly could cause
> similar failures in the future.

Yeah, that seems like a plausible explanation, since the table shown
in the failure report is one that would be getting dropped concurrently,
and the discrepancy is that we get NULL rather than the expected value
for the pg_filenode_relation result, which is expected if the table is
already dropped when the mapping function is called.

> Let's prevent this by only reporting rows for relations that still exist after
> the query is complete.

I think this is a bad solution though; it risks masking actual problems.

What seems like a better fix to me is to change the test

    mapped_oid IS DISTINCT FROM oid

to

    mapped_oid <> oid

pg_class.oid will certainly never read as NULL, so what this will do is
allow the single case where the function returns NULL. AFAIK there is
no reason to suppose that a NULL result would mean anything except "the
table's been dropped", so changing it this way will allow only that case
and not any others.
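
The difference between the two comparisons can be illustrated with a small model of SQL's three-valued logic. This is only a sketch: `None` stands in for NULL, the OIDs are made up, and the helper names (`sql_ne`, `is_distinct_from`, `where`) are hypothetical and not part of PostgreSQL or its regression tests.

```python
# Minimal model of SQL three-valued logic; None stands in for NULL.
# The OIDs below are invented for illustration.

def sql_ne(a, b):
    """SQL `a <> b`: yields NULL (None) if either operand is NULL."""
    if a is None or b is None:
        return None
    return a != b

def is_distinct_from(a, b):
    """SQL `a IS DISTINCT FROM b`: never NULL; NULL vs non-NULL is distinct."""
    if a is None or b is None:
        return (a is None) != (b is None)
    return a != b

def where(rows, predicate):
    """A WHERE clause keeps a row only when the predicate is exactly true."""
    return [r for r in rows if predicate(r["mapped_oid"], r["oid"]) is True]

rows = [
    {"oid": 16384, "mapped_oid": 16384},  # mapping agrees
    {"oid": 16385, "mapped_oid": None},   # table dropped concurrently
    {"oid": 16386, "mapped_oid": 99999},  # genuine mismatch
]

reported_old = [r["oid"] for r in where(rows, is_distinct_from)]
reported_new = [r["oid"] for r in where(rows, sql_ne)]
print(reported_old)  # dropped table and genuine mismatch are both reported
print(reported_new)  # only the genuine mismatch is reported
```

With `<>`, the NULL row evaluates to NULL rather than true and is filtered out, while a non-NULL mismatch is still reported.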

Alternatively, we could do something like you suggest but adjust the
second join so that it suppresses only rows in which mapped_oid is null
*and* there's no longer a matching OID in pg_class. That would provide
additional confidence that the null result is a valid indicator of a
just-dropped table.
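
The stricter alternative can be sketched the same way. This assumes, purely for illustration, that the set of OIDs still present in pg_class after the query is available as `oids_after_query`; the function name `suspicious` and all OIDs are hypothetical.

```python
# Sketch of the stricter rule: a NULL mapping is excused only when the
# relation is also gone from pg_class afterwards. None stands in for NULL.

def suspicious(rows, oids_after_query):
    """Return OIDs whose mapping disagrees, except confirmed drops."""
    reported = []
    for r in rows:
        oid, mapped = r["oid"], r["mapped_oid"]
        if mapped == oid:
            continue  # mapping agrees, nothing to report
        confirmed_drop = mapped is None and oid not in oids_after_query
        if not confirmed_drop:
            reported.append(oid)
    return reported

rows = [
    {"oid": 16384, "mapped_oid": 16384},  # mapping agrees
    {"oid": 16385, "mapped_oid": None},   # NULL, and table really was dropped
    {"oid": 16386, "mapped_oid": None},   # NULL, but table still exists
    {"oid": 16387, "mapped_oid": 99999},  # genuine mismatch
]
oids_after_query = {16384, 16386, 16387}  # 16385 is gone

flagged = suspicious(rows, oids_after_query)
print(flagged)
```

Unlike the plain `<>` comparison, this variant still reports a NULL result for a relation that continues to exist.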

regards, tom lane
