From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Michael Paquier <michael(at)paquier(dot)xyz> |
Cc: | Robert Haas <robertmhaas(at)gmail(dot)com>, "Drouvot, Bertrand" <bertranddrouvot(dot)pg(at)gmail(dot)com>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Add a new BGWORKER_BYPASS_ROLELOGINCHECK flag |
Date: | 2023-10-16 02:47:22 |
Message-ID: | 2186072.1697424442@sss.pgh.pa.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Michael Paquier <michael(at)paquier(dot)xyz> writes:
> The buildfarm has provided some feedback, and the new tests have been
> unstable on mamba:
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=mamba&dt=2023-10-15%2005%3A04%3A21
Yeah. FWIW, I tried to reproduce this on mamba's host, but did not
see it again in 100 or so tries.
> In more details:
> # poll_query_until timed out executing this query:
> # SELECT datname, usename, wait_event FROM pg_stat_activity
> # WHERE backend_type = 'worker_spi dynamic' AND
> # pid = ;
> # with stderr:
> # ERROR: syntax error at or near ";"
> # LINE 3: pid = ;
> So this looks like a hard failure in starting the worker that should
> bypass the role login check. The logs don't offer much information,
> but I think I know what's going on here: at this stage of the tests,
> the number of workers created is 7, very close to the limit of
> max_worker_processes, at 8 by default. So one parallel worker spawned
> by any of the other bgworkers would be enough to prevent the last one
> to start, and mamba has been slow enough in the startup of the static
> workers to show that this could be possible.
I agree that that probably is the root cause, and we should fix it
by bumping up max_worker_processes in this test.
But this failure is annoying in another way. Evidently,
worker_spi_launch returned NULL, which must have been from here:
if (!RegisterDynamicBackgroundWorker(&worker, &handle))
PG_RETURN_NULL();
Why in the world is that "return NULL", and not "ereport(ERROR)"
like all the other failure conditions in that function?
If there's actually a defensible reason for the C code to act
like that, then all the call sites need to have checks for
a null result.
(cc'ing Robert, as this coding originated at 090d0f205.)
regards, tom lane
From | Date | Subject | |
---|---|---|---|
Next Message | Thomas Munro | 2023-10-16 02:53:56 | Re: Add support for AT LOCAL |
Previous Message | Michael Paquier | 2023-10-16 02:22:38 | Re: Add a new BGWORKER_BYPASS_ROLELOGINCHECK flag |