
Re: plan time of MASSIVE partitioning ...

From: Boszormenyi Zoltan <zb(at)cybertec(dot)at>
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, Hans-Jürgen Schönig <postgres(at)cybertec(dot)at>, Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers Hackers <pgsql-hackers(at)postgresql(dot)org>, Josh Berkus <josh(at)agliodbs(dot)com>
Subject: Re: plan time of MASSIVE partitioning ...
Date: 2010-10-28 11:29:30
Message-ID: 4CC95E9A.5000605@cybertec.at
Lists: pgsql-hackers
Boszormenyi Zoltan wrote:
> Boszormenyi Zoltan wrote:
>> Boszormenyi Zoltan wrote:
>>> Heikki Linnakangas wrote:
>>>> On 26.10.2010 18:34, Boszormenyi Zoltan wrote:
>>>>> thank you very much for pointing me to dynahash, here is the
>>>>> next version that finally seems to work.
>>>>>
>>>>> Two patches are attached, the first is the absolute minimum for
>>>>> making it work, this still has the Tree type for canon_pathkeys
>>>>> and eq_classes got the same treatment as join_rel_list/join_rel_hash
>>>>> has in the current sources: if the list grows larger than 32, a hash
>>>>> table is created. It seems to be enough to do this for
>>>>>       get_eclass_for_sort_expr()
>>>>> only; the other users of eq_classes aren't bothered by this change.
>>>> That's better, but can't you use dynahash for canon_pathkeys as well?
>>> Here's a purely dynahash solution. It's somewhat slower than
>>> the tree version, 0.45 vs 0.41 seconds in the cached case for the
>>> previously posted test case.
>> And now in context diff, sorry for my affection towards unified diffs. :-)
>
> A slightly better version: there is no need for the heavy hash_any;
> hash_uint32 on the lower 32 bits of pk_eclass is enough. The profiling
> runtime is now 0.42 seconds vs. 0.41 seconds for the tree version.
>
> Best regards,
> Zoltán Böszörményi
>   
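For readers following the quoted discussion, here is a rough standalone
sketch of the idea described above: keep a plain list while it is short,
build a hash table once it grows past 32 entries, and hash only the low
32 bits of the pointer. This is not the attached patch and it does not
use dynahash; PtrSet, ptrset_find(), ptrset_add(), the fixed capacities
and the integer mixer (a stand-in for hash_uint32) are all hypothetical
names and choices made for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define HASH_THRESHOLD 32     /* list length beyond which the hash is built */
#define MAX_ITEMS      8192   /* fixed capacity, only to keep the sketch short */
#define NBUCKETS       16384  /* power of two, twice MAX_ITEMS */

typedef struct PtrSet
{
    const void *items[MAX_ITEMS];  /* insertion-ordered "list" */
    int         nitems;
    int        *buckets;           /* NULL until built; holds index + 1, 0 = empty */
} PtrSet;

/* Stand-in for a cheap integer hash over the low 32 bits of a pointer. */
static uint32_t
hash_ptr_low32(const void *p)
{
    uint32_t h = (uint32_t) (uintptr_t) p;

    h ^= h >> 16;
    h *= 0x85ebca6bu;
    h ^= h >> 13;
    h *= 0xc2b2ae35u;
    h ^= h >> 16;
    return h;
}

/* Enter items[idx] into the open-addressing table (linear probing). */
static void
hash_insert(PtrSet *set, int idx)
{
    uint32_t b = hash_ptr_low32(set->items[idx]) & (NBUCKETS - 1);

    assert(set->buckets != NULL);
    while (set->buckets[b] != 0)
        b = (b + 1) & (NBUCKETS - 1);
    set->buckets[b] = idx + 1;
}

/* Return the index of p, or -1 if it is not in the set. */
static int
ptrset_find(const PtrSet *set, const void *p)
{
    uint32_t b;

    if (set->buckets == NULL)
    {
        /* Short list: a linear scan is cheap enough. */
        for (int i = 0; i < set->nitems; i++)
            if (set->items[i] == p)
                return i;
        return -1;
    }
    b = hash_ptr_low32(p) & (NBUCKETS - 1);
    while (set->buckets[b] != 0)
    {
        if (set->items[set->buckets[b] - 1] == p)
            return set->buckets[b] - 1;
        b = (b + 1) & (NBUCKETS - 1);
    }
    return -1;
}

/* Add p if absent; return its index, or -1 if the set is full. */
static int
ptrset_add(PtrSet *set, const void *p)
{
    int idx = ptrset_find(set, p);

    if (idx >= 0)
        return idx;
    if (set->nitems >= MAX_ITEMS)
        return -1;
    idx = set->nitems++;
    set->items[idx] = p;
    if (set->buckets != NULL)
        hash_insert(set, idx);
    else if (set->nitems > HASH_THRESHOLD)
    {
        /* Threshold crossed: build the table from the existing list. */
        set->buckets = calloc(NBUCKETS, sizeof(int));
        if (set->buckets != NULL)
            for (int i = 0; i < set->nitems; i++)
                hash_insert(set, i);
    }
    return idx;
}
```

A zero-initialized PtrSet starts out as an empty list; lookups switch
from the linear scan to the hash probe on their own once the 33rd
distinct pointer has been added.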

Btw, the top entries in the current gprof output are:

Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total
 time   seconds   seconds    calls  ms/call  ms/call  name
 19.05      0.08     0.08      482     0.17     0.29  add_child_rel_equivalences
 11.90      0.13     0.05  1133447     0.00     0.00  bms_is_subset
  9.52      0.17     0.04   331162     0.00     0.00  hash_search_with_hash_value
  7.14      0.20     0.03   548971     0.00     0.00  AllocSetAlloc
  4.76      0.22     0.02     2858     0.01     0.01  get_tabstat_entry
  4.76      0.24     0.02     1136     0.02     0.02  tzload

This means add_child_rel_equivalences() still takes
too much time. The previously posted test case calls this
function 482 times, that is, for almost every 10th entry
added to eq_classes. The elog() I put into this function says
that at the last call list_length(eq_classes) == 4754.
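
As a back-of-the-envelope check on those numbers, the sketch below
estimates how many eq_classes entries get visited in total. The 482
calls and the final length of 4754 come from the message; the
assumptions that each call walks the whole list once and that the list
grows roughly linearly between calls are illustrative guesses, and both
function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Upper bound: every call walks a list that already has its final length. */
static int64_t
visits_upper_bound(int64_t ncalls, int64_t final_len)
{
    assert(ncalls > 0 && final_len >= 0);
    return ncalls * final_len;
}

/*
 * Estimate assuming linear growth: the k-th of ncalls calls sees about
 * final_len * k / ncalls entries, and summing over k = 1..ncalls gives
 * final_len * (ncalls + 1) / 2.
 */
static int64_t
visits_linear_growth(int64_t ncalls, int64_t final_len)
{
    assert(ncalls > 0 && final_len >= 0);
    return final_len * (ncalls + 1) / 2;
}
```

For 482 calls and a final length of 4754, the upper bound is 2291428
visits and the linear-growth estimate is 1148091. The latter is close to
the 1133447 calls to bms_is_subset in the profile, which would fit
roughly one such check per visited entry, although the profile alone
does not establish that.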

Best regards,
Zoltán Böszörményi

-- 
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
     http://www.postgresql.at/

