From: | Michael Paquier <michael(at)paquier(dot)xyz> |
---|---|
To: | Nathan Bossart <nathandbossart(at)gmail(dot)com> |
Cc: | "Andrey M(dot) Borodin" <x4mmm(at)yandex-team(dot)ru>, Kirill Reshke <reshkekirill(at)gmail(dot)com>, "Leung, Anthony" <antholeu(at)amazon(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Allow non-superuser to cancel superuser tasks. |
Date: | 2024-07-10 05:14:55 |
Message-ID: | Zo4YzzUTUz40r00L@paquier.xyz |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Tue, Jul 09, 2024 at 01:12:59PM -0500, Nathan Bossart wrote:
> I've committed 0001. It looks like 0002 failed CI testing [0], but I
> haven't investigated why.
>
> [0] https://cirrus-ci.com/task/5668467599212544
Nice catch by the CI. This looks like a race condition to me. I
think that we should wait for the autovacuum worker to exit, and then
scan the server logs we expect.
For this failure, look at the timestamps of the server logs:
2024-07-08 12:48:23.271 UTC [32697][client backend]
[006_signal_autovacuum.pl][11/3:0] LOG: statement: SELECT
pg_terminate_backend(32672);
2024-07-08 12:48:23.275 UTC [32697][client backend]
[006_signal_autovacuum.pl][:0] LOG: disconnection: session time:
0:00:00.018 user=postgres database=postgres host=[local]
2024-07-08 12:48:23.278 UTC [32672][autovacuum worker] FATAL:
terminating autovacuum process due to administrator command
And then the timestamp of the tests:
[12:48:23.277](0.058s) not ok 2 - autovacuum worker signaled with
pg_signal_autovacuum_worker granted
We check for the contents of the logs 1ms before they are generated,
hence failing the lookup check because the test is faster than the
backend.
Like what we are doing in 010_pg_basebackup.pl, we could do a
poll_query_until() until the PID of the autovacuum worker is gone from
pg_stat_activity before fetching the logs as ProcessInterrupts() stuff
would be logged before the process exits, say:
+# Wait for the autovacuum worker to exit before scanning the logs.
+$node->poll_query_until('postgres',
+ "SELECT count(*) = 0 FROM pg_stat_activity "
+ . "WHERE pid = $av_pid AND backend_type = 'autovacuum worker';");
That gives something like the attached. Does that look correct to
you?
--
Michael
Attachment | Content-Type | Size |
---|---|---|
v7-0002-Add-tap-test-for-pg_signal_autovacuum-role.patch | text/x-diff | 4.8 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Michael Paquier | 2024-07-10 05:24:29 | Re: Allow non-superuser to cancel superuser tasks. |
Previous Message | Kirill Reshke | 2024-07-10 05:03:04 | Re: Allow non-superuser to cancel superuser tasks. |