Re: Build farm failure

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Gregory Stark <stark(at)enterprisedb(dot)com>
Cc: PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org>, math(at)sai(dot)msu(dot)ru
Subject: Re: Build farm failure
Date: 2007-10-02 01:51:00
Message-ID: 6378.1191289860@sss.pgh.pa.us
Lists: pgsql-hackers

Gregory Stark <stark(at)enterprisedb(dot)com> writes:
> dugong (icc on ia64) has been failing the contrib installcheck consistently
> since 6 days ago with errors like:
> ERROR: could not fsync segment 0 of relation 1663/40960/41403: No such file or directory

Yeah, I already asked Sergey about this but I guess he's not had time to
poke at it yet:
http://archives.postgresql.org/pgsql-hackers/2007-09/msg01061.php

My theory is that putting an Assert right there is somehow breaking
ForwardFsyncRequest --- maybe it becomes a complete no-op, maybe it
forwards a corrupt request, who knows. Unless you were actually
performing pull-the-power-plug tests, the only visible problem from
that would be that "revoke" requests never get forwarded, which could
lead to the bgwriter attempting to fsync files in already-dropped
databases or tablespaces. Which matches the visible symptoms exactly.
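
To make the failure mode concrete, here is a minimal standalone sketch of
how a lost "revoke" request could surface as an fsync error on a file that
no longer exists. It is not PostgreSQL's actual code: the pending table and
the names remember_fsync, revoke_fsync, process_pending and drop_scenario
are all hypothetical stand-ins, and the /tmp path is an assumption about
the test machine.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define MAX_PENDING 16
#define PATH_LEN 256

/* Hypothetical stand-in for the bgwriter's table of pending fsync requests. */
typedef struct {
    char path[PATH_LEN];
    bool in_use;
} PendingFsync;

static PendingFsync pending[MAX_PENDING];

/* Queue `path` to be fsync'd at the next (simulated) checkpoint. */
static bool remember_fsync(const char *path)
{
    for (int i = 0; i < MAX_PENDING; i++) {
        if (!pending[i].in_use) {
            snprintf(pending[i].path, PATH_LEN, "%s", path);
            pending[i].in_use = true;
            return true;
        }
    }
    return false; /* table full */
}

/* "Revoke" request: forget pending fsyncs for a file about to be dropped. */
static void revoke_fsync(const char *path)
{
    for (int i = 0; i < MAX_PENDING; i++) {
        if (pending[i].in_use && strcmp(pending[i].path, path) == 0)
            pending[i].in_use = false;
    }
}

/*
 * Simulated checkpoint: open and fsync every queued file, then clear the
 * table. Returns the number of failures; *last_errno holds the errno of
 * the last failure, or 0 if there was none.
 */
static int process_pending(int *last_errno)
{
    int failures = 0;

    assert(last_errno != NULL);
    *last_errno = 0;
    for (int i = 0; i < MAX_PENDING; i++) {
        if (!pending[i].in_use)
            continue;
        int fd = open(pending[i].path, O_RDWR);
        if (fd < 0 || fsync(fd) != 0) {
            failures++;
            *last_errno = errno;
        }
        if (fd >= 0)
            close(fd);
        pending[i].in_use = false;
    }
    return failures;
}

/*
 * Create a file, queue an fsync for it, then drop it. `forward_revoke`
 * models whether the revoke request actually reaches the pending table.
 * Returns the checkpoint's failure count, or -1 if setup failed.
 */
static int drop_scenario(bool forward_revoke, int *last_errno)
{
    char path[] = "/tmp/fsync_sketch_XXXXXX";
    int fd = mkstemp(path);

    if (fd < 0)
        return -1;
    close(fd);
    if (!remember_fsync(path)) {
        unlink(path);
        return -1;
    }
    if (forward_revoke)
        revoke_fsync(path);
    unlink(path);           /* the "DROP": the file is gone from disk */
    return process_pending(last_errno);
}
```

Calling drop_scenario(true, &err) should report no failures, since the
stale entry is removed before the file disappears. Calling
drop_scenario(false, &err) models the suspected breakage: the entry
survives, the checkpoint tries to open a missing file, and the error is
ENOENT, i.e. "No such file or directory".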

This looks like nothing so much as a compiler bug, particularly given
that we're seeing it with only one compiler on only one platform.
We should study it more carefully, both to look for workarounds and
to file a suitable bug report, but I'll be pretty surprised if it's
really our bug.

regards, tom lane
