Re: Performance degradation in TPC-H Q18

From: Kuntal Ghosh <kuntalghosh(dot)2007(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Performance degradation in TPC-H Q18
Date: 2017-03-03 05:53:00
Message-ID: CAGz5QCJzQdE3SPitB4vGNo-zE63VLVyr-1QwfPsGqX-1qgiopQ@mail.gmail.com
Lists: pgsql-hackers

On Fri, Mar 3, 2017 at 8:41 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Fri, Mar 3, 2017 at 1:22 AM, Andres Freund <andres(at)anarazel(dot)de> wrote:
>> the resulting hash-values aren't actually meaningfully influenced by the
>> IV. Because we just xor with the IV, most hash-values that without the IV
>> would have fallen into a single hash-bucket fall into a single
>> hash-bucket afterwards as well; just somewhere else in the hash-range.
>
> Wow, OK. I had kind of assumed (without looking) that setting the
> hash IV did something a little more useful than that. Maybe we should
> do something like struct blah { int iv; int hv; }; newhv =
> hash_any(&blah, sizeof(blah)).
>
Sounds good. I've seen a post from Thomas Munro suggesting an
alternative approach to combining hash values in execGrouping.c [1].
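
To make the point about the IV concrete, here is a minimal standalone
sketch, not PostgreSQL code. It assumes a power-of-two bucket count and
uses the murmur3 finalizer as a stand-in mixer; mix32, combine_xor,
combine_mixed and distinct_buckets are hypothetical names, and
combine_mixed only approximates the idea of hashing the IV and the hash
value together rather than calling hash_any().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NBUCKETS 1024			/* must be a power of two */
#define NKEYS    1024

/* Stand-in 32-bit mixer (murmur3 finalizer); NOT PostgreSQL's hash_any(). */
uint32_t
mix32(uint32_t h)
{
	h ^= h >> 16;
	h *= 0x85ebca6bU;
	h ^= h >> 13;
	h *= 0xc2b2ae35U;
	h ^= h >> 16;
	return h;
}

/* The behaviour described in the thread: the IV is only xor'ed in. */
uint32_t
combine_xor(uint32_t iv, uint32_t hv)
{
	return hv ^ iv;
}

/* Hypothetical alternative: run the IV and the hash value through a mixer. */
uint32_t
combine_mixed(uint32_t iv, uint32_t hv)
{
	return mix32(hv + 0x9e3779b9U + mix32(iv));
}

/*
 * Feed NKEYS hash values that all land in bucket 0 (their low bits are
 * zero) through "combine" and count how many distinct buckets they occupy.
 */
int
distinct_buckets(uint32_t (*combine) (uint32_t, uint32_t), uint32_t iv)
{
	bool		seen[NBUCKETS] = {false};
	int			count = 0;

	assert((NBUCKETS & (NBUCKETS - 1)) == 0);

	for (uint32_t i = 0; i < NKEYS; i++)
	{
		uint32_t	hv = i << 10;	/* low 10 bits are zero */
		uint32_t	bucket = combine(iv, hv) & (NBUCKETS - 1);

		if (!seen[bucket])
		{
			seen[bucket] = true;
			count++;
		}
	}
	return count;
}
```

With combine_xor every key still ends up in one bucket (just a different
one, selected by the low bits of the IV), while the mixed variant should
spread the same keys over many buckets.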

>> In addition to that it seems quite worthwhile to provide an iterator
>> that's not vulnerable to this. An approach that I am, seemingly
>> successfully, testing is to iterate the hashtable in multiple (in my
>> case 23, because why not) passes, accessing only every nth element. That
>> allows the data to be inserted in a lot less "dense" fashion. But
>> that's more an optimization, so I'll just push something like the patch
>> mentioned in the thread already.
>>
>> Makes some sense?
>
> Yep.
>
Yes, it makes sense. Quadratic probing is another approach, but it
would require an extra shift operation every time we need to find the
next or previous location during a collision.
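
As a rough illustration of the two ideas above, here is a small
standalone sketch, not the simplehash.h code. strided_visit_order shows
one way to scan a table in several passes that each touch every nth
slot. triangular_probe shows one common form of quadratic probing for
power-of-two tables, where the halving of i*(i+1) is the kind of shift
referred to above (that reading is an assumption). Both function names
are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Write into out[] the order in which a table of nslots slots is visited
 * when it is scanned in npasses passes, each pass touching every
 * npasses-th slot.  Returns the number of entries written, which equals
 * nslots.  npasses must be >= 1 and out[] must have room for nslots.
 */
size_t
strided_visit_order(size_t nslots, size_t npasses, size_t *out)
{
	size_t		n = 0;

	for (size_t start = 0; start < npasses; start++)
	{
		for (size_t idx = start; idx < nslots; idx += npasses)
			out[n++] = idx;
	}
	return n;
}

/*
 * Slot examined on the i-th step of a triangular (quadratic) probe
 * sequence: start + i*(i+1)/2, wrapped into a power-of-two table via
 * mask (mask = table size - 1).
 */
uint32_t
triangular_probe(uint32_t start, uint32_t i, uint32_t mask)
{
	return (start + ((i * (i + 1)) >> 1)) & mask;
}
```

For a 10-slot table scanned in 3 passes the visit order is
0, 3, 6, 9, 1, 4, 7, 2, 5, 8, so neighbouring slots of the source table
are no longer read back to back. For an 8-slot table, the triangular
sequence starting at slot 5 reaches every slot once in its first 8 steps.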

[1] https://www.postgresql.org/message-id/CAEepm%3D3rdgjfxW4cKvJ0OEmya2-34B0qHNG1xV0vK7TGPJGMUQ@mail.gmail.com
--
Thanks & Regards,
Kuntal Ghosh
EnterpriseDB: http://www.enterprisedb.com
