From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> |
Cc: | "osumi(dot)takamichi(at)fujitsu(dot)com" <osumi(dot)takamichi(at)fujitsu(dot)com>, "'amitlangote09(at)gmail(dot)com'" <amitlangote09(at)gmail(dot)com>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Test of a partition with an incomplete detach has a timing issue |
Date: | 2021-05-24 18:21:04 |
Message-ID: | 1012540.1621880464@sss.pgh.pa.us |
Lists: | pgsql-hackers |
Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> writes:
> On 2021-May-24, osumi(dot)takamichi(at)fujitsu(dot)com wrote:
>> t
>> -step s2detach: <... completed>
>> -error in steps s1cancel s2detach: ERROR: canceling statement due to user request
>> step s1c: COMMIT;
>> +step s2detach: <... completed>
>> +error in steps s1c s2detach: ERROR: canceling statement due to user request
> Uh, how annoying. If I understand correctly, I agree that this is a
> timing issue: sometimes it is fast enough that the cancel is reported
> together with its own step, but other times it takes longer so it is
> reported with the next command of that session instead, s1c (commit).
Yeah, we see such failures in the buildfarm with various isolation
tests; some recent examples:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=gharial&dt=2021-05-23%2019%3A43%3A04
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=anole&dt=2021-05-08%2006%3A34%3A13
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=anole&dt=2021-04-29%2009%3A43%3A04
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=gharial&dt=2021-04-22%2021%3A24%3A02
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=wrasse&dt=2021-04-21%2010%3A38%3A32
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=fossa&dt=2021-04-08%2019%3A36%3A06
I remember having tried to rewrite the isolation tester to eliminate
the race condition, without success (and I don't seem to have kept
my notes, which now I regret).
However, the existing hazards seem to hit rarely enough to not be
much of a problem. We might need to see if we can rejigger the
timing in this test to make it a little more stable.
regards, tom lane
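Editorial illustration of the race discussed above: a toy Python model, not the isolation tester's actual logic. It assumes a simplified tester that reports a cancelled step's error together with whichever step is being processed when the error becomes visible. The function name report_steps and its parameters are hypothetical; only the step names and the error text come from the diff quoted in the message.

```python
# Toy model only: the function and its parameters are hypothetical and are
# not taken from the real isolation tester source.

CANCEL_ERROR = "ERROR: canceling statement due to user request"


def report_steps(steps, blocked_step, error_ready_at):
    """Return the transcript lines a simplified tester would print.

    steps          -- step names in the order they are issued
    blocked_step   -- the step that blocks and is later cancelled
    error_ready_at -- index of the step during which the cancel error
                      becomes visible to the tester
    """
    lines = []
    for i, step in enumerate(steps):
        if step == blocked_step:
            # The blocked step produces no result yet.
            lines.append(f"step {step}: <waiting ...>")
            continue
        lines.append(f"step {step}")
        if i == error_ready_at:
            # The error is attributed to the step running when it shows up.
            lines.append(f"step {blocked_step}: <... completed>")
            lines.append(f"error in steps {step} {blocked_step}: {CANCEL_ERROR}")
    return lines


steps = ["s2detach", "s1cancel", "s1c"]
fast = report_steps(steps, "s2detach", error_ready_at=1)  # error seen during s1cancel
slow = report_steps(steps, "s2detach", error_ready_at=2)  # error seen during s1c
```

In this model the only difference between the two runs is when the error becomes visible, yet the transcript changes: the fast run attributes the error to s1cancel and the slow run to s1c, which is the shape of the diff quoted above.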