From: | Andres Freund <andres(at)anarazel(dot)de> |
---|---|
To: | Daniel Gustafsson <daniel(at)yesql(dot)se> |
Cc: | Magnus Hagander <magnus(at)hagander(dot)net>, Peter Geoghegan <pg(at)bowt(dot)ie>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Andrey Borodin <x4mmm(at)yandex-team(dot)ru>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Michael Banck <michael(dot)banck(at)credativ(dot)de>, Robert Haas <robertmhaas(at)gmail(dot)com>, Greg Stark <stark(at)mit(dot)edu>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Online enabling of checksums |
Date: | 2018-04-06 23:13:56 |
Message-ID: | 20180406231356.l7s6dmdbi76nc7tf@alap3.anarazel.de |
Lists: | pgsql-hackers |
On 2018-04-07 01:04:50 +0200, Daniel Gustafsson wrote:
> > I'm fairly certain that the bug here is a simple race condition in the
> > test (not the main code!):
>
> I wonder if it may perhaps be a case of both?
See my other message about the atomic fallback bit.
> > It's
> > exceedingly unsurprising that a 'pg_sleep(1)' is not a reliable way to
> > make sure that a process has finished exiting. Then follow-up tests fail
> > because the process is still running.
>
> I can reproduce the error when building with --disable-atomics, and it seems
> that all the failing members either do that, lack atomic.h, lack atomics or a
> combination.
atomics.h isn't important; it's only relevant for Solaris (IIRC). Only
one of the failing members lacks atomics, AFAICT. See:
On 2018-04-06 14:19:09 -0700, Andres Freund wrote:
> Is that an explanation for
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=gharial&dt=2018-04-06%2019%3A18%3A11
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=lousyjack&dt=2018-04-06%2016%3A03%3A01
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=sungazer&dt=2018-04-06%2015%3A46%3A16
> ? Those all don't seem to fall under that? Having proper atomics?
So for those it's the timing. Note that they didn't always fail, either.
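
To illustrate the pg_sleep(1) point: a minimal sketch, not taken from the patch or the test suite, of polling for a condition with a deadline instead of sleeping for a fixed time. It is written in Python for brevity; wait_until and its parameters are hypothetical names, and the predicate stands in for whatever check would tell the test that the process has exited.

```python
import time


def wait_until(predicate, timeout=30.0, interval=0.1):
    """Poll predicate() until it returns True or timeout seconds elapse.

    Returns True if the condition was observed, False on timeout.
    Hypothetical helper for illustration only.
    """
    deadline = time.monotonic() + timeout
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            return False
        # Short sleep between checks; the total wait adapts to how long
        # the condition actually takes, instead of being a fixed guess.
        time.sleep(interval)
```

On a fast machine this returns almost immediately, and on a slow or loaded buildfarm member it keeps waiting up to the timeout, which is the property a fixed one-second sleep lacks.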
> > really? Let's just force the test to take at least 6s purely from
> > sleeping?
>
> The test needs continuous reading in a session to try and trigger any bugs in
> read access on the cluster during checksumming, is there a good way to do that
> in the isolationtester? I have failed to find a good way to repeat a step like
> that, but I might be missing something.
I don't know, but I do know this isn't right.
Greetings,
Andres Freund