Re: Test of a partition with an incomplete detach has a timing issue

From: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
To: "osumi(dot)takamichi(at)fujitsu(dot)com" <osumi(dot)takamichi(at)fujitsu(dot)com>
Cc: "'amitlangote09(at)gmail(dot)com'" <amitlangote09(at)gmail(dot)com>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Test of a partition with an incomplete detach has a timing issue
Date: 2021-05-24 18:07:12
Message-ID: 20210524180712.GA13311@alvherre.pgsql
Lists: pgsql-hackers

On 2021-May-24, osumi(dot)takamichi(at)fujitsu(dot)com wrote:

> Also, I've gotten some logs left.
> * src/test/isolation/output_iso/regression.out
>
> test detach-partition-concurrently-1 ... ok 682 ms
> test detach-partition-concurrently-2 ... ok 321 ms
> test detach-partition-concurrently-3 ... FAILED 1084 ms
> test detach-partition-concurrently-4 ... ok 1078 ms
> test fk-contention ... ok 77 ms
>
> * src/test/isolation/output_iso/regression.diffs
>
> diff -U3 /(where/I/put/PG)/src/test/isolation/expected/detach-partition-concurrently-3.out /(where/I/put/PG)/src/test/isolation/output_iso/results/detach-partition-concurrently-3.out
> --- /(where/I/put/PG)/src/test/isolation/expected/detach-partition-concurrently-3.out 2021-05-24 03:30:15.735488295 +0000
> +++ /(where/I/put/PG)/src/test/isolation/output_iso/results/detach-partition-concurrently-3.out 2021-05-24 04:46:48.851488295 +0000
> @@ -12,9 +12,9 @@
> pg_cancel_backend
>
> t
> -step s2detach: <... completed>
> -error in steps s1cancel s2detach: ERROR: canceling statement due to user request
> step s1c: COMMIT;
> +step s2detach: <... completed>
> +error in steps s1c s2detach: ERROR: canceling statement due to user request

Uh, how annoying. If I understand correctly, I agree that this is a
timing issue: sometimes it is fast enough that the cancel is reported
together with its own step, but other times it takes longer so it is
reported with the next command of that session instead, s1c (commit).

I suppose a fix would imply that the error report waits until after the
"cancel" step is over, but I'm not sure how to do that.

Maybe we can change the "cancel" query to something like

SELECT pg_cancel_backend(pid), somehow_wait_for_detach_to_terminate() FROM d3_pid;

... where maybe that function can check the "state" column in s2's
pg_stat_activity row? I'll give that a try.
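
To make the polling idea concrete, here is a rough sketch of the wait loop,
modeled in Python outside the database. It is not a proposed patch.
wait_for_backend_idle and fetch_state are hypothetical names; fetch_state
stands in for a query such as "SELECT state FROM pg_stat_activity WHERE
pid = ..." against the detaching session's backend, and the timeout and
poll interval are arbitrary assumptions.

```python
import time
from typing import Callable, Optional


def wait_for_backend_idle(
    fetch_state: Callable[[], Optional[str]],
    timeout_s: float = 5.0,
    poll_interval_s: float = 0.01,
) -> bool:
    """Poll until the backend is no longer 'active', or until the timeout.

    fetch_state is a hypothetical callable standing in for a lookup of the
    "state" column in pg_stat_activity; it returns None when no row exists
    for that pid.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if fetch_state() != "active":
            return True   # the canceled statement has finished (or the backend is gone)
        if time.monotonic() >= deadline:
            return False  # gave up; the caller decides how to report this
        time.sleep(poll_interval_s)
```

The point of the loop is only that the "cancel" step does not return until
the canceled statement has actually stopped running, so that its error is
no longer left to be reported together with a later step.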

--
Álvaro Herrera 39°49'30"S 73°17'W
"That sort of implies that there are Emacs keystrokes which aren't obscure.
I've been using it daily for 2 years now and have yet to discover any key
sequence which makes any sense." (Paul Thomas)
