Re: [GENERAL] Slow PITR restore

From: Markus Schiltknecht <markus(at)bluegap(dot)ch>
To: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Gregory Stark <stark(at)enterprisedb(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>, Jeff Trout <threshar(at)threshar(dot)is-a-geek(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [GENERAL] Slow PITR restore
Date: 2007-12-14 09:39:40
Message-ID: 47624F5C.4020809@bluegap.ch
Lists: pgsql-general pgsql-hackers

Hi,

Alvaro Herrera wrote:
> Simon Riggs wrote:
>
>> ISTM its just autovacuum launcher + Hot Standby mixed.
>
> I don't think you need a launcher at all. Just get the postmaster to
> start a configurable number of wal-replay processes (currently the
> number is hardcoded to 1).

I also see similarity to what I do for Postgres-R: a manager and helper
backends which can be started upon request. Such a scheme is currently
used for autovacuum, I'm using it for replication, it could help for
parallelizing recovery and it certainly helps for parallelizing queries
as discussed in another thread.

Maybe it's worth considering a general framework for such a manager or
auto launcher, as well as for helper backends. It certainly depends on
the complexity of that manager, but it should probably be an external
process.

What all of the helper backends have in common, AFAICT:

- a connection to a database
- no client connection
- superuser privileges

(For parallelized queries, superuser privileges might appear wrong, but
I'm arguing that parallelizing the rights checking isn't worth the
trouble, so the initiating worker backend should do that and only
delegate safe jobs to helper backends. Or is that a serious limitation
in a way?)

Most code for that already exists, as we already have various helpers.
What's missing, IMO, is a communication channel between the worker and
helper backends as well as between the backends and the manager. That's
needed, e.g., for worker backends to be able to request helper backends
and feed them with their wishes.

Unix pipes can only be set up between the parent and the child of a
fork, they eat file descriptors, and they need to copy data to the
kernel and back. IIRC, there were portability issues as well. That's
why I've written the internal message passing (IMessage) stuff, see
-patches [1].
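
To illustrate two of the pipe costs named above, here is a minimal
sketch in Python (not PostgreSQL or IMessage code). The function name
pipe_roundtrip is made up for this example. It shows that a single
one-way channel already takes two file descriptors, and that the payload
is written into the kernel and read back out again. It assumes a small
payload that fits in the pipe buffer.

```python
import os


def pipe_roundtrip(payload: bytes) -> tuple[bytes, int]:
    """Send payload through a pipe; return (data read back, fds used).

    Hypothetical helper for illustration only. The payload must be small
    enough to fit in the kernel's pipe buffer, otherwise the write would
    block because nobody is reading concurrently.
    """
    # One one-way channel costs two descriptors: a read end and a write end.
    read_fd, write_fd = os.pipe()
    try:
        # The data is copied into a kernel buffer here ...
        os.write(write_fd, payload)
        # ... and copied back out to user space here.
        data = os.read(read_fd, len(payload))
        return data, 2
    finally:
        os.close(read_fd)
        os.close(write_fd)
```

Both descriptors have to exist before a fork for a child to inherit
them, which is the limitation for backends that want to talk to each
other after they have already been started.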

I'm all for unifying such a manager process and generalizing the
requesting and launching of helpers as well as management of their state
(handling dead helper processes, keeping a pool of idle helpers which
are already connected to a database, etc.). Most of that already exists
in my Postgres-R code, maybe I can derive a general purpose patch to
start contributing code from Postgres-R?
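
The bookkeeping such a manager would do can be sketched roughly as
follows. This is an in-memory Python model, not Postgres-R code: the
names Helper, HelperManager, request, release and reap are all
hypothetical, no real processes are started, and the eviction policy
(drop an idle helper connected to another database when the pool is
full) is just one assumption among several possible ones.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Helper:
    helper_id: int
    database: str       # database this helper is connected to
    busy: bool = False


class HelperManager:
    """Toy model of a manager tracking a bounded pool of helpers."""

    def __init__(self, max_helpers: int) -> None:
        self.max_helpers = max_helpers
        self._helpers: Dict[int, Helper] = {}
        self._next_id = 1

    def request(self, database: str) -> Optional[Helper]:
        """Hand out a helper for `database`, or None if none is available."""
        # Prefer an idle helper that is already connected to the database.
        for helper in self._helpers.values():
            if not helper.busy and helper.database == database:
                helper.busy = True
                return helper
        if len(self._helpers) >= self.max_helpers:
            # Pool is full: evict an idle helper of another database, if any.
            idle = next((h for h in self._helpers.values() if not h.busy), None)
            if idle is None:
                return None
            del self._helpers[idle.helper_id]
        # "Launch" a new helper (here: just record it).
        helper = Helper(self._next_id, database, busy=True)
        self._next_id += 1
        self._helpers[helper.helper_id] = helper
        return helper

    def release(self, helper_id: int) -> None:
        """Mark a helper idle again; it stays connected to its database."""
        helper = self._helpers.get(helper_id)
        if helper is not None:
            helper.busy = False

    def reap(self, helper_id: int) -> None:
        """Forget a helper whose process has died."""
        self._helpers.pop(helper_id, None)

    def idle_count(self) -> int:
        return sum(1 for h in self._helpers.values() if not h.busy)
```

A worker backend would call request() with its database, feed the job
to the returned helper, and call release() when done. In a real system
these calls would travel over the communication channel discussed above
rather than being direct method calls.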

Comments? Use cases I'm missing?

Regards

Markus

[1]: last time I published IMessage stuff on -patches, WIP:
http://archives.postgresql.org/pgsql-patches/2007-01/msg00578.php
