Re: plan time of MASSIVE partitioning ...

From: Boszormenyi Zoltan <zb(at)cybertec(dot)at>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, Hans-Jürgen Schönig <postgres(at)cybertec(dot)at>, Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers Hackers <pgsql-hackers(at)postgresql(dot)org>, Josh Berkus <josh(at)agliodbs(dot)com>
Subject: Re: plan time of MASSIVE partitioning ...
Date: 2010-10-26 15:34:56
Message-ID: 4CC6F520.1030408@cybertec.at
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

Tom Lane írta:
> Boszormenyi Zoltan <zb(at)cybertec(dot)at> writes:
>
>> The problem is with the two functions in path/equivclass.c,
>> as process_equivalance() and those functions are all walk
>> the tree, and the current RBTree code can only deal with
>> one walk at a time. We need to push/pop the iterator state
>> to be able to serve more than one walkers.
>>
>
> Good luck with that --- the iteration state is embedded in the rbtree.
>
>
>> Also, we need to split out the tree modifying part from
>> process_equivalence() somehow, as the tree walking
>> also cannot deal with node additions and deletions.
>>
>
> That's not happening either. Maybe you need to think of some other data
> structure to use. Hashtable maybe? dynahash.c at least has reasonably
> well-defined limitations in this area.
>
> regards, tom lane
>

thank you very much for pointing me to dynahash, here is the
next version that finally seems to work.

Two patches are attached, the first is the absolute minimum for
making it work, this still has the Tree type for canon_pathkeys
and eq_classes got the same treatment as join_rel_list/join_rel_hash
has in the current sources: if the list grows larger than 32, a hash table
is created. It seems to be be enough for doing in for
get_eclass_for_sort_expr()
only, the other users of eq_classes aren't bothered by this change.

The total speedup figure is in the 70+ percent range from these
two changes, a little later GIT version than the previous tree
I tested with before shows 1.74 vs. 0.41 second runtime for the
example query. These are with asserts and profiling enabled of
course. Without asserts and profiling enabled, the "time psql"
figures are:

$ time psql -p 54321 -c "explain select * from inh_parent where
timestamp1 between '2010-04-06' and '2010-06-25' order by timestamp2"
>/dev/null

real 0m1.932s
user 0m0.035s
sys 0m0.002s

vs.

real 0m0.630s
user 0m0.033s
sys 0m0.002s

The second patch contains extra infrastructure for the Tree type,
it's currently unused, it was created for experimenting with eq_classes
being a tree. It may be useful for someone, though.

Best regards,
Zoltán Böszörményi

--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
http://www.postgresql.at/

Attachment Content-Type Size
9.1-planner-speedup-v5.patch text/x-patch 28.7 KB
9.1-planner-speedup-v5-extra-ifs.patch text/x-patch 75.5 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Merlin Moncure 2010-10-26 15:41:59 Re: Postgres insert performance and storage requirement compared to Oracle
Previous Message Jeff Janes 2010-10-26 15:22:38 Re: xlog.c: WALInsertLock vs. WALWriteLock