Re: APR 1.0 released

From: Mike Rylander <mrylander(at)gmail(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: APR 1.0 released
Date: 2004-10-08 20:18:12
Message-ID: b918cf3d04100813187dae55d7@mail.gmail.com
Lists: pgsql-hackers

A while back I was looking at the backend code in preparation for
looking into parallelization techniques for PG ;)... My
thought was that instead of trying to parallelize each individual plan
node (multi-process sort, etc.), I would look at creating worker
threads/processes for each plan node as a whole. For example, take a
plan that looks like this:

                                  QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------
 Subquery Scan metarecord_field_entry_view  (cost=5.32..4038.80 rows=21 width=112)
   ->  Append  (cost=5.32..4038.59 rows=21 width=112)
         ->  Subquery Scan "*SELECT* 1"  (cost=5.32..5.33 rows=1 width=74)
               ->  HashAggregate  (cost=5.32..5.32 rows=1 width=74)
                     ->  Index Scan using tmr_fe_field on metarecord_title_field_entry  (cost=0.00..5.31 rows=1 width=74)
                           Index Cond: (field = 'added_entry_author'::text)
                           Filter: ((field_class = 'title'::text) AND (value ~~* '% joe %'::text))
         ->  Subquery Scan "*SELECT* 2"  (cost=4031.02..4031.20 rows=18 width=62)
               ->  HashAggregate  (cost=4031.02..4031.02 rows=18 width=62)
                     ->  Seq Scan on metarecord_author_field_entry  (cost=0.00..4030.79 rows=18 width=62)
                           Filter: ((field_class = 'author'::text) AND (field = 'added_entry_author'::text) AND (value ~~* '% joe %'::text))
         ->  Subquery Scan "*SELECT* 3"  (cost=2.03..2.04 rows=1 width=81)
               ->  HashAggregate  (cost=2.03..2.03 rows=1 width=81)
                     ->  Index Scan using smr_fe_field on metarecord_subject_field_entry  (cost=0.00..2.02 rows=1 width=81)
                           Index Cond: (field = 'added_entry_author'::text)
                           Filter: ((field_class = 'subject'::text) AND (value ~~* '% joe %'::text))
         ->  Subquery Scan "*SELECT* 4"  (cost=0.01..0.02 rows=1 width=112)
               ->  HashAggregate  (cost=0.01..0.01 rows=1 width=112)
                     ->  Seq Scan on metarecord_misc_field_entry  (cost=0.00..0.00 rows=1 width=112)
                           Filter: ((field_class = 'misc'::text) AND (field = 'added_entry_author'::text) AND (value ~~* '% joe %'::text))

The optimizer would look at each node as it walked down the tree and
see that the 'Append' node has multiple peer child nodes. It would look
at the cost estimate of each child node, and if that cost is greater
than the total average cost across all nodes it would spin off a
worker thread/process to handle gathering the sub-resultset.
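
To make the heuristic concrete, here is a minimal sketch in Python rather
than backend C. It is only an illustration under stated assumptions, not
anything that exists in PG: the PlanNode class and the
pick_worker_candidates function are hypothetical names, "cost" means the
upper (total) cost estimate of each node, and "average cost across all
nodes" is read as the average over the peer children of the Append node.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PlanNode:
    """Hypothetical stand-in for a plan node; not a real backend structure."""
    kind: str
    total_cost: float
    label: str = ""
    children: List["PlanNode"] = field(default_factory=list)


def pick_worker_candidates(node: PlanNode) -> List[PlanNode]:
    """Walk the tree and collect the children of every Append node whose
    total cost is greater than the average cost of their peers."""
    picked: List[PlanNode] = []
    if node.kind == "Append" and len(node.children) > 1:
        average = sum(c.total_cost for c in node.children) / len(node.children)
        picked.extend(c for c in node.children if c.total_cost > average)
    for child in node.children:
        picked.extend(pick_worker_candidates(child))
    return picked


# Top-level costs taken from the plan above; lower nodes are omitted.
plan = PlanNode("Subquery Scan", 4038.80, "metarecord_field_entry_view", [
    PlanNode("Append", 4038.59, "", [
        PlanNode("Subquery Scan", 5.33, "*SELECT* 1"),
        PlanNode("Subquery Scan", 4031.20, "*SELECT* 2"),
        PlanNode("Subquery Scan", 2.04, "*SELECT* 3"),
        PlanNode("Subquery Scan", 0.02, "*SELECT* 4"),
    ]),
])

candidates = pick_worker_candidates(plan)
candidate_labels = [c.label for c in candidates]
print(candidate_labels)
```

With these numbers the peer average is about 1009.65, so only the
"*SELECT* 2" branch (the Seq Scan on metarecord_author_field_entry) would
be handed to a worker, and the three cheap branches would stay in the
parent.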

In any case, I've no time to even *start* looking into something like
that. But even if I did, am I all wet?

--miker

On Fri, 8 Oct 2004 11:56:27 -0400 (EDT), Bruce Momjian
<pgman(at)candle(dot)pha(dot)pa(dot)us> wrote:
> Neil Conway wrote:
> > Marc G. Fournier wrote:
> > > Do we have 'make backend thread safe' listed yet? As I recall it, until
> > > that gets done, parallelization of anything was considered to be a
> > > relatively onerous task, no?
> >
> > ISTM there's no reason we couldn't parallelize query execution using the
> > same IPC techniques that we use now. What would be the advantage of
> > using threads instead?
>
> Separate processes. Yes, we could do that too and the item mentions that.
>
> --
> Bruce Momjian | http://candle.pha.pa.us
> pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 359-1001
> + If your life is a hard drive, | 13 Roberts Road
> + Christ can be your backup. | Newtown Square, Pennsylvania 19073
>
> ---------------------------(end of broadcast)---------------------------
> TIP 2: you can get off all lists at once with the unregister command
> (send "unregister YourEmailAddressHere" to majordomo(at)postgresql(dot)org)
>
