Re: updated hash functions for postgresql v1

From: "CK Tan" <cktan(at)greenplum(dot)com>
To: "Luke Lonergan" <LLonergan(at)greenplum(dot)com>
Cc: "Simon Riggs" <simon(at)2ndquadrant(dot)com>,"Kenneth Marshall" <ktm(at)rice(dot)edu>,pgsql-patches(at)postgresql(dot)org,twraney(at)comcast(dot)net,neilc(at)samurai(dot)com
Subject: Re: updated hash functions for postgresql v1
Date: 2007-10-28 20:19:46
Message-ID: 51B951D9-97C3-47E1-A331-64A38BD06C16@greenplum.com
Lists: pgsql-patches
Hi, this query on TPC-H 1 GB data gets about a 5% improvement.

select count (*) from (select l_orderkey, l_partkey, l_comment,
count(l_tax) from lineitem group by 1, 2, 3) tmpt;

Regards,
-cktan


On Oct 28, 2007, at 1:17 PM, Luke Lonergan wrote:

> We just applied this and saw a 5 percent speedup on a hash
> aggregation query with four columns in a 'group by' clause run
> against a single TPC-H table (lineitem).
>
> CK - can you post the query?
>
> - Luke
>
> Msg is shrt cuz m on ma treo
>
>  -----Original Message-----
> From:   Simon Riggs [mailto:simon(at)2ndquadrant(dot)com]
> Sent:   Sunday, October 28, 2007 04:11 PM Eastern Standard Time
> To:     Kenneth Marshall
> Cc:     pgsql-patches(at)postgresql(dot)org; twraney(at)comcast(dot)net; neilc(at)samurai(dot)com
> Subject:        Re: [PATCHES] updated hash functions for postgresql v1
>
> On Sun, 2007-10-28 at 13:05 -0500, Kenneth Marshall wrote:
> > On Sun, Oct 28, 2007 at 05:27:38PM +0000, Simon Riggs wrote:
> > > On Sat, 2007-10-27 at 15:15 -0500, Kenneth Marshall wrote:
> > > > Its features include a better and faster hash function.
> > >
> > > > Looks very promising. Do you have any performance test results to show
> > > > it really is faster, when compiled into Postgres? Better probably needs
> > > > some definition also; in what way are the hash functions better?
> > >
> > > --
> > >   Simon Riggs
> > >   2ndQuadrant  http://www.2ndQuadrant.com
> > >
> > The new hash function is roughly twice as fast as the old function in
> > terms of straight CPU time. It uses the same design as the current
> > hash but provides code paths for aligned and unaligned access as well
> > as separate mixing functions for different blocks in the hash run
> > instead of having one general purpose block. I think the speed will
> > not be an obvious win with smaller items, but will be very important
> > when hashing larger items (up to 32kb).
> >
> > Better in this case means that the new hash mixes more thoroughly,
> > which results in fewer collisions and more even bucket distribution.
> > There is also a 64-bit variant which is still faster since it can
> > take advantage of the 64-bit processor instruction set.
>
> Ken, I was really looking for some tests that show both of the above
> were true. We've had some trouble proving the claims of other algorithms
> before, so I'm less inclined to take those things at face value.
>
> I'd suggest tests with Integers, BigInts, UUID, CHAR(20) and CHAR(100).
> Others may have different concerns.
>
> -- 
>   Simon Riggs
>   2ndQuadrant  http://www.2ndQuadrant.com
>
>
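
The request quoted above, for tests that show both the speed claim and the distribution claim across several key types, could be approached with a small harness along these lines. This is only a minimal Python sketch under stated assumptions, not a benchmark anyone on this thread ran: `zlib.crc32` and `zlib.adler32` are stand-ins for whichever two hash functions are being compared (the PostgreSQL functions themselves are not reproduced here), and `make_keys`, `measure`, `compare`, the key encodings, and the bucket count of 1024 are hypothetical choices made for illustration.

```python
import time
import uuid
import zlib
from collections import Counter


def make_keys(n):
    """Build n distinct byte-string keys for each suggested key type.

    The encodings are assumptions for illustration; they are not
    PostgreSQL's on-disk representations.
    """
    return {
        "integer": [i.to_bytes(4, "little") for i in range(n)],
        "bigint": [((i << 32) | i).to_bytes(8, "little") for i in range(n)],
        "uuid": [uuid.UUID(int=i).bytes for i in range(n)],
        "char(20)": [str(i).ljust(20).encode() for i in range(n)],
        "char(100)": [str(i).ljust(100).encode() for i in range(n)],
    }


def measure(hash_fn, keys, n_buckets):
    """Time hash_fn over keys and summarize how the hashes spread out."""
    start = time.perf_counter()
    hashes = [hash_fn(k) for k in keys]
    elapsed = time.perf_counter() - start

    # Map each hash to a bucket, as a hash table with n_buckets would.
    buckets = Counter(h % n_buckets for h in hashes)
    return {
        "seconds": elapsed,
        # Distinct keys that produced the same full hash value.
        "full_hash_collisions": len(hashes) - len(set(hashes)),
        # Size of the fullest bucket; lower means a more even spread.
        "max_bucket": max(buckets.values()),
        # Buckets that received no key at all.
        "empty_buckets": n_buckets - len(buckets),
    }


def compare(hash_fns, n=10000, n_buckets=1024):
    """Run measure() for every (key type, hash function) pair."""
    results = {}
    for type_name, keys in make_keys(n).items():
        for fn_name, fn in hash_fns.items():
            results[(type_name, fn_name)] = measure(fn, keys, n_buckets)
    return results


if __name__ == "__main__":
    # Stand-in hash functions only; swap in the functions under test.
    fns = {"crc32": zlib.crc32, "adler32": zlib.adler32}
    for (type_name, fn_name), r in compare(fns).items():
        print(
            f"{type_name:10s} {fn_name:8s} "
            f"{r['seconds'] * 1000:8.2f} ms  "
            f"collisions={r['full_hash_collisions']:5d}  "
            f"max_bucket={r['max_bucket']:5d}  "
            f"empty={r['empty_buckets']:5d}"
        )
```

Timings from a harness like this say nothing about behaviour inside the server, so the "when compiled into Postgres" part of the question would still need a query-level test such as the one posted at the top of this message.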
