Re: bg worker: general purpose requirements

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Markus Wanner <markus(at)bluegap(dot)ch>
Cc: Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Itagaki Takahiro <itagaki(dot)takahiro(at)gmail(dot)com>, PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: bg worker: general purpose requirements
Date: 2010-09-20 16:57:56
Message-ID: AANLkTikHQfBVBZ8LJrmh8TPrChCwd9vXyoN8xV_=VXdB@mail.gmail.com
Lists: pgsql-hackers

On Mon, Sep 20, 2010 at 11:30 AM, Markus Wanner <markus(at)bluegap(dot)ch> wrote:
> Well, Apache pre-forks 5 processes in total (by default, that is; for
> high-volume webservers a higher MinSpareServers setting is certainly not
> out of the question), while bgworkers currently needs to fork
> min_spare_background_workers processes per database.
>
> AIUI, that's the main problem with the current architecture.

Assuming that "the main problem" refers more or less to the words "per
database", I agree.

>>> I haven't measured the actual time it takes, but given the use case of a
>>> connection pool, I have so far assumed it's obvious that this process
>>> takes too long.
>>
>> Maybe that would be a worthwhile exercise...
>
> On my laptop I'm measuring around 18 bgworker starts per second, i.e.
> roughly 50 ms per bgworker start. That's certainly just a ball-park figure.

Gee, that doesn't seem slow enough to worry about to me. If we
suppose that you need 2 * CPUs + spindles processes to fully load the
system, that means you should be able to ramp up from zero to
consuming every available system resource in under a second, except
perhaps on a system with a huge RAID array, which might need 2 or 3
seconds. If you parallelize the worker startup, as you suggest, I'd
think you could knock quite a bit more off of this, but why all the
worry about startup latency? Once the system is chugging along, none
of this should matter very much, I would think. If you need to
repeatedly kill off some workers bound to one database and start some
new ones to bind to a different database, that could be sorta painful,
but if you can actually afford to keep around the workers for all the
databases you care about, it seems fine.
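
To spell out that arithmetic, assuming an example box with 8 CPUs and 4
spindles and taking the ~50 ms per-start figure measured above:

    /*
     * Ramp-up estimate using the 2*CPUs + spindles rule of thumb and
     * the ~50 ms per-bgworker-start ball-park from this thread.  The
     * hardware numbers are assumptions for illustration.
     */
    #include <stdio.h>

    int
    main(void)
    {
        int     cpus = 8;
        int     spindles = 4;
        double  start_ms = 50.0;

        int     workers = 2 * cpus + spindles;          /* 20 */
        double  serial_s = workers * start_ms / 1000.0; /* 1.0 s */

        printf("%d workers, serial startup: %.1f s\n", workers, serial_s);
        printf("4-way parallel startup:     %.1f s\n", serial_s / 4.0);
        return 0;
    }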

>> How do you accumulate the change sets?
>
> Logical changes get collected at the heapam level. They get serialized
> and streamed (via imessages and a group communication system) to all
> nodes. Application of change sets is highly parallelized and should be
> pretty efficient. Commit ordering is decided by the GCS to guarantee
> consistency across all nodes, conflicts get resolved by aborting the
> later transaction.

Neat stuff.
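
For anyone following along, a hypothetical sketch of the change-set flow
just described (field names here are illustrative only; this is not
Postgres-R's actual imessage layout):

    /*
     * Hypothetical sketch of a serialized change set.  Changes
     * collected at the heapam level are packed into one of these,
     * streamed through the GCS (which fixes the total commit order),
     * and applied in parallel on each node; on conflict, the
     * later-ordered transaction is aborted.
     */
    #include <stdint.h>

    typedef struct ChangeSetMsg
    {
        uint32_t    origin_node;    /* node that ran the transaction */
        uint32_t    xid;            /* originating transaction id */
        uint32_t    commit_seqno;   /* total order assigned by the GCS */
        uint32_t    ntuples;        /* tuple changes in the payload */
        uint32_t    payload_len;    /* bytes of serialized changes */
        /* payload_len bytes of insert/update/delete records follow */
    } ChangeSetMsg;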

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
