Re: updated hash functions for postgresql v1

From: "CK Tan" <cktan(at)greenplum(dot)com>
To: "Luke Lonergan" <LLonergan(at)greenplum(dot)com>
Cc: "Simon Riggs" <simon(at)2ndquadrant(dot)com>, "Kenneth Marshall" <ktm(at)rice(dot)edu>, pgsql-patches(at)postgresql(dot)org, twraney(at)comcast(dot)net, neilc(at)samurai(dot)com
Subject: Re: updated hash functions for postgresql v1
Date: 2007-10-28 20:19:46
Message-ID: 51B951D9-97C3-47E1-A331-64A38BD06C16@greenplum.com
Lists: pgsql-patches

Hi, this query on TPC-H 1 GB data runs about 5% faster with the patch.

select count (*) from (select l_orderkey, l_partkey, l_comment,
count(l_tax) from lineitem group by 1, 2, 3) tmpt;
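
For anyone wanting to reproduce the comparison, here is a minimal sketch of how the query above could be timed from psql. It assumes a TPC-H scale-factor-1 lineitem table is already loaded. The work_mem value is an illustrative guess, not the setting behind the 5% figure. The idea would be to run it once on an unpatched build and once on a patched one.

```sql
-- Assumption: enough memory that the planner can choose HashAggregate;
-- 512MB is an illustrative value, not one taken from this thread.
SET work_mem = '512MB';

-- Hash aggregation is enabled by default; set explicitly for clarity.
SET enable_hashagg = on;

-- EXPLAIN ANALYZE executes the query and reports the plan with timings.
-- Confirm the plan shows a HashAggregate node before comparing builds.
EXPLAIN ANALYZE
SELECT count(*)
FROM (SELECT l_orderkey, l_partkey, l_comment, count(l_tax)
      FROM lineitem
      GROUP BY 1, 2, 3) tmpt;
```

Repeating each run a few times and comparing the reported total runtime should help separate the hash-function effect from cache warm-up noise.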

Regards,
-cktan

On Oct 28, 2007, at 1:17 PM, Luke Lonergan wrote:

> We just applied this and saw a 5 percent speedup on a hash
> aggregation query with four columns in a 'group by' clause run
> against a single TPC-H table (lineitem).
>
> CK - can you post the query?
>
> - Luke
>
> Msg is shrt cuz m on ma treo
>
> -----Original Message-----
> From: Simon Riggs [mailto:simon(at)2ndquadrant(dot)com]
> Sent: Sunday, October 28, 2007 04:11 PM Eastern Standard Time
> To: Kenneth Marshall
> Cc: pgsql-patches(at)postgresql(dot)org; twraney(at)comcast(dot)net;
> neilc(at)samurai(dot)com
> Subject: Re: [PATCHES] updated hash functions for postgresql v1
>
> On Sun, 2007-10-28 at 13:05 -0500, Kenneth Marshall wrote:
> > On Sun, Oct 28, 2007 at 05:27:38PM +0000, Simon Riggs wrote:
> > > On Sat, 2007-10-27 at 15:15 -0500, Kenneth Marshall wrote:
> > > > Its features include a better and faster hash function.
> > >
> > > > Looks very promising. Do you have any performance test results to show
> > > > it really is faster, when compiled into Postgres? Better probably needs
> > > > some definition also; in what way are the hash functions better?
> > >
> > > --
> > > Simon Riggs
> > > 2ndQuadrant http://www.2ndQuadrant.com
> > >
> > > The new hash function is roughly twice as fast as the old function in
> > > terms of straight CPU time. It uses the same design as the current
> > > hash but provides code paths for aligned and unaligned access as well
> > > as separate mixing functions for different blocks in the hash run
> > > instead of having one general purpose block. I think the speed will
> > > not be an obvious win with smaller items, but will be very important
> > > when hashing larger items (up to 32kb).
> >
> > > Better in this case means that the new hash mixes more thoroughly
> > > which results in fewer collisions and more even bucket distribution.
> > > There is also a 64-bit variant which is still faster since it can
> > > take advantage of the 64-bit processor instruction set.
>
> Ken, I was really looking for some tests that show both of the above
> were true. We've had some trouble proving the claims of other algorithms
> before, so I'm less inclined to take those things at face value.
>
> I'd suggest tests with Integers, BigInts, UUID, CHAR(20) and CHAR(100).
> Others may have different concerns.
>
> --
> Simon Riggs
> 2ndQuadrant http://www.2ndQuadrant.com
>
>
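
The bucket-distribution claim quoted above could also be checked from SQL. The following is a rough sketch, assuming the built-in SQL-callable hash functions hashint4 and hashtext are available on the build under test. The 1024 buckets, the 100000 keys, and the 'key' prefix are arbitrary illustrative choices, not figures from this thread.

```sql
-- Integer keys: hash 100000 sequential values, mask down to 1024 buckets,
-- then summarize how evenly the keys are spread across buckets.
SELECT min(n)           AS smallest_bucket,
       max(n)           AS largest_bucket,
       round(avg(n), 1) AS mean_bucket,
       count(*)         AS buckets_used
FROM (SELECT hashint4(i) & 1023 AS bucket, count(*) AS n
      FROM generate_series(1, 100000) AS g(i)
      GROUP BY 1) AS d;

-- Text keys: same summary for short strings built from the same series.
SELECT min(n)           AS smallest_bucket,
       max(n)           AS largest_bucket,
       round(avg(n), 1) AS mean_bucket,
       count(*)         AS buckets_used
FROM (SELECT hashtext('key' || i::text) & 1023 AS bucket, count(*) AS n
      FROM generate_series(1, 100000) AS g(i)
      GROUP BY 1) AS d;
```

Running the same statements on a patched and an unpatched build and comparing the spread between smallest_bucket and largest_bucket would give a first impression of distribution quality. It says nothing about speed, which needs separate timing.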
