Re: Something flaky in the "relfilenode mapping" infrastructure

From: Noah Misch <noah(at)leadboat(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Something flaky in the "relfilenode mapping" infrastructure
Date: 2014-06-13 02:12:40
Message-ID: 20140613021240.GA719601@tornado.leadboat.com
Lists: pgsql-hackers

On Thu, Jun 12, 2014 at 02:44:10AM -0400, Tom Lane wrote:
> Andres Freund <andres(at)2ndquadrant(dot)com> writes:
> > On 2014-06-12 00:38:36 -0400, Noah Misch wrote:
> >> http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=prairiedog&dt=2014-06-12%2000%3A17%3A07
>
> > Hm. My guess is that it's just a 'harmless' concurrency issue. The
> > test currently runs concurrently with others: I think what happens is
> > that the table gets dropped in the other relation after the query has
> > acquired the MVCC snapshot (used for the pg_class test).
> > But why is it triggering on such an 'unusual' system and not on others?
> > That's what worries me a bit.

I can reproduce a similar disturbance in the test query using gdb and a
concurrent table drop, and the table reported in the prairiedog failure is a
table dropped in a concurrent test group. That explanation may not be the
full story behind these particular failures, but it certainly could cause
similar failures in the future.

Let's prevent this by only reporting rows for relations that still exist after
the query is complete.
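
To make the race and the proposed filter concrete, here is a toy model in
Python. It is only a sketch under stated assumptions, not the attached patch:
the dict `catalog` stands in for pg_class, and the OIDs, filenode numbers, and
function names are all hypothetical.

```python
# Toy model of the race; `catalog` stands in for pg_class (oid -> relfilenode).
# All names and numbers here are hypothetical, chosen only for illustration.

def mismatches(snapshot_rows, lookup):
    """Return snapshot rows whose filenode no longer maps back to their OID."""
    return [r for r in snapshot_rows if lookup(r["relfilenode"]) != r["oid"]]


def filter_surviving(rows, still_exists):
    """Keep only rows for relations that still exist once the query is done."""
    return [r for r in rows if still_exists(r["oid"])]


def demo():
    catalog = {16384: 20001, 16385: 20002}

    # 1. The test query reads pg_class under its MVCC snapshot.
    snapshot = [{"oid": o, "relfilenode": f} for o, f in catalog.items()]

    # 2. A concurrent test group drops one of the tables.
    del catalog[16385]

    # 3. The reverse lookup (filenode -> oid) sees the current state,
    #    so the dropped table shows up as a spurious mismatch.
    reverse = {f: o for o, f in catalog.items()}
    raw = mismatches(snapshot, reverse.get)

    # 4. Proposed fix: report a row only if its relation still exists.
    filtered = filter_surviving(raw, lambda oid: oid in catalog)
    return raw, filtered


if __name__ == "__main__":
    raw, filtered = demo()
    print("unfiltered mismatches:", raw)
    print("after existence filter:", filtered)
```

In this model the unfiltered check reports the dropped table, while the
filtered check reports nothing. A relation that still exists yet maps back to
the wrong OID would pass through the filter and still be reported.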

> prairiedog is pretty damn slow by modern standards. OTOH, I think it
> is not the slowest machine in the buildfarm; hamster for instance seems
> to be at least a factor of 2 slower. So I'm not sure whether to believe
> it's just a timing issue.

That kernel's process scheduler could be a factor.

--
Noah Misch
EnterpriseDB http://www.enterprisedb.com

Attachment: filenode_relation-test-race-v1.patch (text/plain, 2.5 KB)
