Re: Review: Extra Daemons / bgworker

From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Markus Wanner <markus(at)bluegap(dot)ch>
Cc: kaigai(at)kaigai(dot)gr(dot)jp, PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Review: Extra Daemons / bgworker
Date: 2012-11-28 14:51:18
Message-ID: 20121128145118.GA4333@alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Markus Wanner wrote:

Hi Markus,

Many thanks for your review.

> first of all, thank you for bringing this up again and providing a
> patch. My first attempt on that was more than two years ago [1]. As the
> author of a former bgworker patch, I'd like to provide an additional
> review - KaiGai was simply faster to sing up as a reviewer on the
> commitfest app...

I remember your patchset. I didn't look at it for this round, for no
particular reason. I did look at KaiGai's submission from two
commitfests ago, and also at a patch from Simon which AFAIK was never
published openly. Simon's patch merged the autovacuum code to make
autovac workers behave like bgworkers as handled by his patch, just like
you suggest. To me it didn't look all that good; I didn't have the guts
for that, hence the separation. If later bgworkers are found to work
well enough, we can "port" autovac workers to use this framework, but
for now it seems to me that the conservative thing is to leave autovac
untouched.

(As an example, we introduced "ilist" some commits back and changed some
uses to it; but there are still many places where we're using SHM_QUEUE,
or List, or open-coded lists, which we could port to the new
infrastructure, but there's no pressing need to do it.)

> I started my review is based on rev 6d7e51.. from your bgworker branch
> on github, only to figure out later that it wasn't quite up to date. I
> upgraded to bgworker-7.patch. I hope that's the most recent version.

Sorry about that -- forgot to push to github. bgworker-7 corresponds to
commit 0a49a540b which I have just pushed to github.

> # Concept
>
> I appreciate the ability to run daemons that are not connected to
> Postgres shared memory. It satisfies a user request that came up several
> times when I talked about the bgworkers in Postgres-R.
>
> Another good point is the flexible interface via extensions, even
> allowing different starting points for such background workers.

Great.

> One thing I'd miss (for use of that framework in Postgres-R) is the
> ability to start a registered bgworker only upon request, way after the
> system started. So that the bgworker process doesn't even exist until it
> is really needed. I realize that RegisterBackgroundWorker() is still
> necessary in advance to reserve the slots in BackgroundWorkerList and
> shared memory.
>
> As a use case independent of Postgres-R, think of something akin to a
> worker_spi, but wanting that to perform a task every 24h on hundreds of
> databases. You don't want to keep hundreds of processes occupying PGPROC
> slots just perform a measly task every 24h.

Yeah, this is something I specifically kept out initially to keep things
simple.

Maybe one thing to do in this area would be to ensure that there is a
certain number of PGPROC elements reserved specifically for bgworkers;
kind of like autovacuum workers have. That would avoid having regular
clients exhausting slots for bgworkers, and vice versa.

How are you envisioning that the requests would occur?

> # Minor technical issues or questions
>
> In assign_maxconnections, et al, GetNumRegisteredBackgroundWorkers() is
> used in relation to MAX_BACKENDS or to calculate MaxBackends. That seems
> to "leak" the bgworkers that registered with neither
> BGWORKER_SHMEM_ACCESS nor BGWORKER_BACKEND_DATABASE_CONNECTION set. Or
> is there any reason to discount such extra daemon processes?

No, I purposefully let those out, because those don't need a PGPROC. In
fact that seems to me to be the whole point of non-shmem-connected
workers: you can have as many as you like and they won't cause a
backend-side impact. You can use such a worker to connect via libpq to
the server, for example.

> The additional contrib modules auth_counter and worker_spi are missing
> from the contrib/Makefile. If that's intentional, they should probably
> be listed under "Missing".
>
> The auth_counter module leaves worker.bgw_restart_time uninitialized.
>
> Being an example, it might make sense for auth_counter to provide a
> signal that just calls SetLatch() to interrupt WaitLatch() and make
> auth_counter emit its log line upon request, i.e. show how to use SIGUSR1.

KaiGai proposed that we remove auth_counter. I don't see why not; I
mean, worker_spi is already doing most of what auth_counter is doing, so
why not? However, as you say, maybe we need more coding examples.

> The bgw_restart_time doesn't always work (certainly not the way I'm
> expecting it to). For example, if you forget to pass the
> BGWORKER_SHMEM_ACCESS flag, trying to LWLockAcquire() leads to the
> worker being restarted immediately and repeatedly - independent of the
> bgw_restart_time setting. The same holds true if the bgworker exits with
> status 0 or in case it segfaults. Not when exiting with code 1, though.
> Why is that? Especially in case of a segfault or equally "hard" errors
> that can be expected to occur repeatedly, I don't want the worker to be
> restarted that frequently.

Ah, that's just a bug, of course.

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Markus Wanner 2012-11-28 15:45:51 Re: Review: Extra Daemons / bgworker
Previous Message Amit kapila 2012-11-28 14:47:44 Re: Proposal for Allow postgresql.conf values to be changed via SQL