Re: bg worker: general purpose requirements

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Markus Wanner <markus(at)bluegap(dot)ch>
Cc: Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Itagaki Takahiro <itagaki(dot)takahiro(at)gmail(dot)com>, PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: bg worker: general purpose requirements
Date: 2010-09-17 15:52:19
Message-ID: AANLkTinaAKp90uOSKNAjoDFWZXRrouGCwK0TUbxvBLA4@mail.gmail.com
Lists: pgsql-hackers

On Fri, Sep 17, 2010 at 11:29 AM, Markus Wanner <markus(at)bluegap(dot)ch> wrote:
> autonomous transactions: max. one per normal backend (correct?), way fewer
> should suffice in most cases, only control data to be passed around

Technically, you could start an autonomous transaction from within an
autonomous transaction, so I don't think there's a hard maximum of one
per normal backend. However, I agree that the expected case is to not
have very many.
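
To make that concrete, here is a toy sketch of the accounting (nothing
here is real PostgreSQL code; the names and the worker bookkeeping are
invented) showing that peak demand tracks nesting depth rather than the
number of backends:

    /*
     * Toy illustration: each autonomous transaction occupies its own
     * worker and can itself start another, so peak worker usage equals
     * the nesting depth. Hypothetical code, not any actual API.
     */
    #include <stdio.h>

    static int workers_in_use = 0;

    static void
    run_autonomous_tx(int depth)
    {
        workers_in_use++;           /* this level claims a worker */
        printf("depth %d: %d workers in use\n", depth, workers_in_use);

        if (depth < 3)              /* an autonomous tx starting another */
            run_autonomous_tx(depth + 1);

        workers_in_use--;           /* commit/abort releases the worker */
    }

    int
    main(void)
    {
        run_autonomous_tx(1);       /* peak = nesting depth, here 3 */
        return 0;
    }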

> All of the potential users of bgworkers benefit from a pre-connected
> bgworker. Meaning having at least one spare bgworker around per database
> could be beneficial, potentially more depending on how often spike loads
> occur. As long as there are only a few databases, it's easily possible to have
> at least one spare process around per database, but with thousands of
> databases, that might get prohibitively expensive (not sure where the
> boundary between win vs. lose is, though. Idle backends vs. connection
> cost).

I guess it depends on what your goals are. If you're optimizing for
ability to respond quickly to a sudden load, keeping idle backends
will probably win even when the number of them you're keeping around
is fairly high. If you're optimizing for minimal overall resource
consumption, though, you'll not be as happy about that. What I'm
struggling to understand is this: if there aren't any preforked
workers around when the load hits, how much does it slow things down?
I would have thought that a few seconds to ramp up to speed after an
extended idle period (5 minutes, say) would be acceptable for most of
the applications you mention. Is the ramp-up time longer than that,
or is even that much delay unacceptable for Postgres-R, or is there
some other aspect to the problem I'm failing to grasp? I can tell you
have some experience tuning this, so I'd like to try to understand
where you're coming from.
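
For what it's worth, the policy I'm imagining is roughly what Apache's
prefork MPM does with its spare-server settings. Here is a sketch under
invented names (WorkerPool, start_bgworker, and so on are not any actual
coordinator API, just stand-ins for the sake of discussion):

    #include <time.h>

    #define MIN_SPARE_PER_DB  1    /* keep one warm worker per active db */
    #define MAX_IDLE_SECONDS  300  /* the "5 minutes, say" from above */

    typedef struct WorkerPool
    {
        unsigned int db_oid;        /* database this pool serves */
        int          idle_workers;  /* idle, pre-connected workers */
        time_t       last_activity; /* last time a job arrived */
    } WorkerPool;

    /* Hypothetical primitives: fork-and-connect, and reap one worker. */
    static void start_bgworker(unsigned int db_oid)     { (void) db_oid; }
    static void stop_idle_bgworker(unsigned int db_oid) { (void) db_oid; }

    /* Called periodically by the coordinator for each database's pool. */
    static void
    maintain_pool(WorkerPool *pool)
    {
        if (time(NULL) - pool->last_activity < MAX_IDLE_SECONDS)
        {
            /* Recently active: pay the startup cost up front so a
             * sudden spike finds a pre-connected worker waiting. */
            while (pool->idle_workers < MIN_SPARE_PER_DB)
            {
                start_bgworker(pool->db_oid);
                pool->idle_workers++;
            }
        }
        else
        {
            /* Long idle: accept ramp-up latency on the next spike in
             * exchange for not pinning thousands of idle backends. */
            while (pool->idle_workers > 0)
            {
                stop_idle_bgworker(pool->db_oid);
                pool->idle_workers--;
            }
        }
    }

With thousands of databases, the interesting knobs are MIN_SPARE_PER_DB
and MAX_IDLE_SECONDS: the first bounds overall resource consumption, the
second bounds how often you pay the ramp-up cost.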

> However, I feel like this gives less control over how the bgworkers are
> used. For example, I'd prefer to be able to prevent the system from
> allocating all bgworkers to a single database at once.

I think this is an interesting example, and worth some further
thought. I guess I don't really understand how Postgres-R uses these
bgworkers. Are you replicating one transaction at a time, or how does
the data get sliced up? I remember you mentioning
sync/async/eager/other replication strategies previously - do you have
a pointer to some good reading on that topic?
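
Coming back to the per-database limit itself, the cap seems cheap to
enforce whatever the allocation policy ends up being. A sketch, with
invented names and numbers:

    #include <stdbool.h>

    #define TOTAL_BGWORKERS  32
    #define MAX_DB_SHARE     50   /* percent of the pool one db may hold */

    static bool
    may_assign_worker(int workers_held_by_db)
    {
        int cap = (TOTAL_BGWORKERS * MAX_DB_SHARE) / 100;

        /* Even under spike load from one database, at least half the
         * pool stays available to the others. */
        return workers_held_by_db < cap;
    }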

> Hope that sheds some more light on how bgworkers could be useful. Maybe I
> just need to describe the job handling features of the coordinator better as
> well? (Simon also requested better documentation...)

That seems like it would be useful, too.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
