Re: Binary search in ScalarArrayOpExpr for OR'd constant arrays

From: James Coleman <jtc331(at)gmail(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Cc: David Rowley <dgrowleyml(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>
Subject: Re: Binary search in ScalarArrayOpExpr for OR'd constant arrays
Date: 2020-04-28 22:22:20
Message-ID: CAAaqYe956a-zbm1qR8pqz=iLbF8LW5vBrQGrzXVHXdLk3at5_Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

I cc'd Andres given his commit introduced simplehash, so I figured
he'd probably have a few pointers on when each one might be useful.

On Tue, Apr 28, 2020 at 8:39 AM James Coleman <jtc331(at)gmail(dot)com> wrote:
...
> > Any particular reasons to pick dynahash over simplehash? ISTM we're
> > using simplehash elsewhere in the executor (grouping, tidbitmap, ...),
> > while there are not many places using dynahash for simple short-lived
> > hash tables. Of course, that alone is a weak reason to insist on using
> > simplehash here, but I suppose there were reasons for not using dynahash
> > and we'll end up facing the same issues here.
>
> No particular reason; it wasn't clear to me that there was a reason to
> prefer one or the other (and I'm not acquainted with the codebase
> enough to know the differences), so I chose dynahash because it was
> easier to find examples to follow for implementation.

Do you have any thoughts on what the trade-offs/use-cases etc. are for
dynahash versus simple hash?

From reading the commit message in b30d3ea824c it seems like simple
hash is faster and optimized for CPU cache benefits. The comments at
the top of simplehash.h also discourage it's use in non
performance/space sensitive uses, but there isn't anything I can see
that explicitly tries to discuss when dynahash is useful, etc.

Given the performance notes in that commit message, I thinking
switching to simple hash is worth it.

But I also wonder if there might be some value in a README or comments
addition that would be a guide to what the various hash
implementations are useful for. If there's interest, I could try to
type something short up so that we have something to make the code
base a bit more discoverable.

James

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tomas Vondra 2020-04-28 22:52:33 Re: Binary search in ScalarArrayOpExpr for OR'd constant arrays
Previous Message Andres Freund 2020-04-28 22:21:42 Re: More efficient RI checks - take 2