Re: Parallel postgresql

From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Hans-Jürgen Schönig <hs(at)cybertec(dot)at>
Cc: Martin Rusoff <mrusoff(at)columbus(dot)rr(dot)com>, pgsql-hackers(at)postgresql(dot)org, eg(at)cybertec(dot)at
Subject: Re: Parallel postgresql
Date: 2003-10-14 17:21:44
Message-ID: 200310141721.h9EHLil09516@candle.pha.pa.us
Lists: pgsql-hackers

Hans-Jürgen Schönig wrote:
> Bruce Momjian wrote:
> > Martin Rusoff wrote:
> >
> >>I was just contemplating how to make postgres parallel (for DSS
> >>applications)... Has anyone done work on this? It looks to me like there
> >>are a couple of obvious places to add parallel operation:
> >>
> >>Stage 1) I/O, perhaps through MPIO - would improve tablescanning and
> >>load/unload operations. One (or more) Postgresql servers would use
> >>MPIO/ROMIO to access a parallel file system like PVFS or GPFS(IBM).
> >>
> >>Stage 2) Parallel Postgres Servers, with the postmaster spawning off the
> >>server on a different node (possibly borrowing some code from GNU queue)
> >>and doing any buffer twiddling with RPC for that connection. The client
> >>connection would still be through the proxy on the postmaster node? (kind
> >>of like MOSIX)
> >
> >
> > One idea would be to throw parts of the executor (like a table sort) to
> > different machines or to different processors on the same machine,
> > perhaps via dblink. You could use threads to send several requests and
> > wait for their results.
> >
> > Threading the entire backend would be hard, but we could thread some
> > parts of it by having slave backends doing some of the work in parallel.
>
>
>
> This would be nice - especially for huge queries needed in warehouses.
> Maybe it could even make sense to do things in par. if there is just one
> machine (e.g. computing a function while a sort process is waiting for
> I/O or so).
>
> Which operations can run in par.? What do you think?
> I guess implementing something like that means 20 years more work on the
> planner ...

I would think a very expensive function call could already be done in
this way, though you can't do SQL in the function because the visibility
rules and commit/abort handling aren't passed down to the child. That
would severely limit what could be done in a child; the only logical
thing would be some function that calls an external program to send
email or something. We could implement something to pass the parent pid
down to the child, and the child could use that for visibility rules and
maybe commit/abort if we used the parent xid to stamp any rows modified
by the child. Of course, anything I/O bound wouldn't benefit from this.
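
To make the limited case concrete, here is a rough sketch of the pattern
only, not backend code. It is written in Python for brevity, and the
names start_external_job and run_in_parallel, along with the stand-in
"external program", are made up for illustration. The parent starts a
child that touches no database state, does its own work (a sort) in the
meantime, and then collects the child's result.

```python
import subprocess
import sys


def start_external_job(payload: str) -> subprocess.Popen:
    """Start a stand-in 'expensive external program' in a child process."""
    # Hypothetical stand-in for something like a mail-sending program:
    # it only upper-cases its argument. It touches no database state, so
    # it needs no visibility or commit/abort information from the parent.
    child_code = "import sys; print(sys.argv[1].upper())"
    return subprocess.Popen(
        [sys.executable, "-c", child_code, payload],
        stdout=subprocess.PIPE,
        text=True,
    )


def run_in_parallel(rows, payload):
    """Overlap the parent's own work with the child's work."""
    child = start_external_job(payload)  # child starts running now
    sorted_rows = sorted(rows)           # parent's work, e.g. a sort
    out, _ = child.communicate()         # wait for the child, read its output
    if child.returncode != 0:
        raise RuntimeError("child process failed")
    return sorted_rows, out.strip()
```

Nothing here addresses the hard part. The child sees no snapshot and
no xid from the parent, which is exactly why it can't safely run SQL.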

--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073
