Re: Binary search in ScalarArrayOpExpr for OR'd constant arrays

From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Cc: James Coleman <jtc331(at)gmail(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Binary search in ScalarArrayOpExpr for OR'd constant arrays
Date: 2020-04-28 13:43:43
Message-ID: CAFj8pRD5Mg_pmi35SDPx9SdPYED85xGHNLP+rbW8yLC+0AT0bw@mail.gmail.com
Lists: pgsql-hackers

On Tue, Apr 28, 2020 at 3:26 PM Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
wrote:

> On Tue, Apr 28, 2020 at 08:39:18AM -0400, James Coleman wrote:
> >On Tue, Apr 28, 2020 at 8:25 AM Tomas Vondra
> ><tomas(dot)vondra(at)2ndquadrant(dot)com> wrote:
> >>
> >> On Mon, Apr 27, 2020 at 09:04:09PM -0400, James Coleman wrote:
> >> >On Sun, Apr 26, 2020 at 7:41 PM James Coleman <jtc331(at)gmail(dot)com>
> wrote:
> >> >>
> >> >> On Sun, Apr 26, 2020 at 4:49 PM Tomas Vondra
> >> >> <tomas(dot)vondra(at)2ndquadrant(dot)com> wrote:
> >> >> >
> >> >> > On Sun, Apr 26, 2020 at 02:46:19PM -0400, James Coleman wrote:
> >> >> > >On Sat, Apr 25, 2020 at 8:31 PM Tomas Vondra
> >> >> > ><tomas(dot)vondra(at)2ndquadrant(dot)com> wrote:
> >> >> > >>
> >> >> > >> On Sat, Apr 25, 2020 at 06:47:41PM -0400, James Coleman wrote:
> >> >> > >> >On Sat, Apr 25, 2020 at 5:41 PM David Rowley <
> dgrowleyml(at)gmail(dot)com> wrote:
> >> >> > >> >>
> >> >> > >> >> On Sun, 26 Apr 2020 at 00:40, Tomas Vondra <
> tomas(dot)vondra(at)2ndquadrant(dot)com> wrote:
> >> >> > >> >> > This reminds me our attempts to add bloom filters to hash
> joins, which I
> >> >> > >> >> > think ran into mostly the same challenge of deciding when
> the bloom
> >> >> > >> >> > filter can be useful and is worth the extra work.
> >> >> > >> >>
> >> >> > >> >> Speaking of that, it would be interesting to see how a test
> where you
> >> >> > >> >> write the query as IN(VALUES(...)) instead of IN() compares.
> It would
> >> >> > >> >> be interesting to know if the planner is able to make a more
> suitable
> >> >> > >> >> choice and also to see how all the work over the years to
> improve Hash
> >> >> > >> >> Joins compares to the bsearch with and without the bloom
> filter.
> >> >> > >> >
> >> >> > >> >It would be interesting.
> >> >> > >> >
> >> >> > >> >It also makes one wonder about optimizing these into hash
> >> >> > >> >joins...which I'd thought about over at [1]. I think it'd be a
> very
> >> >> > >> >significant effort though.
> >> >> > >> >
> >> >> > >>
> >> >> > >> I modified the script to also do the join version of the query.
> I can
> >> >> > >> only run it on my laptop at the moment, so the results may be a
> bit
> >> >> > >> different from those I shared before, but it's interesting I
> think.
> >> >> > >>
> >> >> > >> In most cases it's comparable to the binsearch/bloom approach,
> and in
> >> >> > >> some cases it actually beats them quite significantly. It seems
> to
> >> >> > >> depend on how expensive the comparison is - for "int" the
> comparison is
> >> >> > >> very cheap and there's almost no difference. For "text" the
> comparison
> >> >> > >> is much more expensive, and there are significant speedups.
> >> >> > >>
> >> >> > >> For example the test with 100k lookups in array of 10k elements
> and 10%
> >> >> > >> match probability, the timings are these
> >> >> > >>
> >> >> > >> master: 62362 ms
> >> >> > >> binsearch: 201 ms
> >> >> > >> bloom: 65 ms
> >> >> > >> hashjoin: 36 ms
> >> >> > >>
> >> >> > >> I do think the explanation is fairly simple - the bloom filter
> >> >> > >> eliminates about 90% of the expensive comparisons, so it's 20ms
> plus
> >> >> > >> some overhead to build and check the bits. The hash join
> probably
> >> >> > >> eliminates a lot of the remaining comparisons, because the hash
> table
> >> >> > >> is sized to have one tuple per bucket.
> >> >> > >>
> >> >> > >> Note: I also don't claim the PoC has the most efficient bloom
> filter
> >> >> > >> implementation possible. I'm sure it could be made faster.
> >> >> > >>
> >> >> > >> Anyway, I'm not sure transforming this to a hash join is worth
> the
> >> >> > >> effort - I agree that seems quite complex. But perhaps this
> suggest we
> >> >> > >> should not be doing binary search and instead just build a
> simple hash
> >> >> > >> table - that seems much simpler, and it'll probably give us
> about the
> >> >> > >> same benefits.
> >> >> > >
> >> >> > >That's actually what I originally thought about doing, but I chose
> >> >> > >binary search since it seemed a lot easier to get off the ground.
> >> >> > >
> >> >> >
> >> >> > OK, that makes perfect sense.
> >> >> >
> >> >> > >If we instead build a hash is there anything else we need to be
> >> >> > >concerned about? For example, work mem? I suppose for the binary
> >> >> > >search we already have to expand the array, so perhaps it's not
> all
> >> >> > >that meaningful relative to that...
> >> >> > >
> >> >> >
> >> >> > I don't think we need to be particularly concerned about work_mem.
> We
> >> >> > don't care about it now, and it's not clear to me what we could do
> about
> >> >> > it - we already have the array in memory anyway, so it's a bit
> futile.
> >> >> > Furthermore, if we need to care about it, it probably applies to
> the
> >> >> > binary search too.
> >> >> >
> >> >> > >I was looking earlier at what our standard hash implementation
> was,
> >> >> > >and it seemed less obvious what was needed to set that up (so
> binary
> >> >> > >search seemed a faster proof of concept). If you happen to have
> any
> >> >> > >pointers to similar usages I should look at, please let me know.
> >> >> > >
> >> >> >
> >> >> > I think the hash join implementation is far too complicated. It
> has to
> >> >> > care about work_mem, so it implements batching, etc. That's a lot
> of
> >> >> > complexity we don't need here. IMO we could use either the usual
> >> >> > dynahash, or maybe even the simpler simplehash.
> >> >> >
> >> >> > FWIW it'd be good to verify the numbers I shared, i.e. checking
> that the
> >> >> > benchmarks make sense and running it independently. I'm not aware
> of
> >> >> > any issues but it was done late at night and only ran on my laptop.
> >> >>
> >> >> Some quick calculations (don't have the scripting in a form I can
> >> >> attach yet; using this as an opportunity to hack on a genericized
> >> >> performance testing framework of sorts) suggest your results are
> >> >> correct. I was also testing on my laptop, but I showed 1.) roughly
> >> >> equivalent results for IN (VALUES ...) and IN (<list>) for integers,
> >> >> but when I switch to (short; average 3 characters long) text values I
> >> >> show the hash join on VALUES is twice as fast as the binary search.
> >> >>
> >> >> Given that, I'm planning to implement this as a hash lookup and share
> >> >> a revised patch.
> >> >
> >> >I've attached a patch series as before, but with an additional patch
> >> >that switches to using dynahash instead of binary search.
> >> >
> >>
> >> OK. I can't take closer look at the moment, I'll check later.
> >>
> >> Any particular reasons to pick dynahash over simplehash? ISTM we're
> >> using simplehash elsewhere in the executor (grouping, tidbitmap, ...),
> >> while there are not many places using dynahash for simple short-lived
> >> hash tables. Of course, that alone is a weak reason to insist on using
> >> simplehash here, but I suppose there were reasons for not using dynahash
> >> and we'll end up facing the same issues here.
> >
> >No particular reason; it wasn't clear to me that there was a reason to
> >prefer one or the other (and I'm not acquainted with the codebase
> >enough to know the differences), so I chose dynahash because it was
> >easier to find examples to follow for implementation.
> >
>
> OK, understood.
>
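
For what it's worth, the dynahash mechanics being discussed could look
roughly like this. This is only a sketch with made-up names (SaopHashEntry,
build_element_table, element_table_lookup, the "saop elements" table name),
assuming a by-value element type such as int4; the real patch has to go
through the operator's hash and equality support functions rather than raw
Datum bytes:

    #include "postgres.h"
    #include "utils/hsearch.h"

    /* hypothetical entry type - the Datum itself is the hash key */
    typedef struct SaopHashEntry
    {
        Datum   key;
    } SaopHashEntry;

    /* build the table once from the array elements */
    static HTAB *
    build_element_table(Datum *elems, int nelems)
    {
        HASHCTL ctl;
        HTAB   *htab;
        int     i;

        memset(&ctl, 0, sizeof(ctl));
        ctl.keysize = sizeof(Datum);
        ctl.entrysize = sizeof(SaopHashEntry);
        ctl.hcxt = CurrentMemoryContext;

        htab = hash_create("saop elements",      /* name is made up */
                           nelems, &ctl,
                           HASH_ELEM | HASH_BLOBS | HASH_CONTEXT);

        for (i = 0; i < nelems; i++)
            (void) hash_search(htab, &elems[i], HASH_ENTER, NULL);

        return htab;
    }

    /* probe: true if the scalar matches some array element (binary equality) */
    static bool
    element_table_lookup(HTAB *htab, Datum value)
    {
        return hash_search(htab, &value, HASH_FIND, NULL) != NULL;
    }

NULLs, pass-by-reference types and the operator's actual equality semantics
are exactly the parts this sketch ignores, which is where the real
implementation gets less trivial.
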
> >> >Whereas before the benchmarking ended up with a trimodal distribution
> >> >(i.e., master with IN <list>, patch with IN <list>, and either with IN
> >> >VALUES), the hash implementation brings us back to an effectively
> >> >bimodal distribution -- though the hash scalar array op expression
> >> >implementation for text is about 5% faster than the hash join.
> >> >
> >>
> >> Nice. I'm not surprised this is a bit faster than hash join, which has
> >> to worry about additional stuff.
> >>
> >> >Current outstanding thoughts (besides comment/naming cleanup):
> >> >
> >> >- The saop costing needs to be updated to match, as Tomas pointed out.
> >> >
> >> >- Should we be concerned about single execution cases? For example, is
> >> >the regression of speed on a simple SELECT x IN something we should
> >> >try to defeat by only kicking in the optimization if we execute in a
> >> >loop at least twice? That might be of particular interest to pl/pgsql.
> >> >
> >>
> >> I don't follow. How is this related to the number of executions and/or
> >> plpgsql? I suspect you might be talking about prepared statements, but
> >> surely the hash table is built for each execution anyway, even if the
> >> plan is reused, right?
> >>
> >> I think the issue we've been talking about is considering the number of
> >> lookups we expect to do in the array/hash table. But that has nothing to
> >> do with plpgsql and/or multiple executions ...
> >
> >Suppose I do "SELECT 1 IN <100 item list>" (whether just as a
> >standalone query or in pl/pgsql). Then it doesn't make sense to use
> >the optimization, because it can't possibly win over a naive linear
> >scan through the array since to build the hash we have to do that
> >linear scan anyway (I suppose in theory the hashing function could be
> >dramatically faster than the equality op, so maybe it could win
> >overall, but it seems unlikely to me).
>
> True. I'm sure we could construct such cases, but this is hardly the
> only place where that would be an issue. We could create some rough
> costing model and make a decision based on that.
>
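
To make the idea of a rough costing decision concrete, something like the
following could work. The threshold, the function name
(saop_should_build_hash) and the shape of the test are invented for
illustration, not taken from any patch:

    /*
     * Hypothetical heuristic: only pay the hash setup cost (O(nelems))
     * when the array is non-trivial and we expect more than one probe;
     * otherwise the naive linear scan wins anyway.
     */
    #define SAOP_MIN_HASH_ELEMS 9       /* invented threshold */

    static bool
    saop_should_build_hash(int nelems, double expected_lookups)
    {
        return nelems >= SAOP_MIN_HASH_ELEMS && expected_lookups > 1.0;
    }
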
> >I'm not so concerned about this in any query where we have a real FROM
> >clause because even if we end up with only one row, the relative
> >penalty is low, and the potential gain is very high. But simple
> >expressions in pl/pgsql, for example, are a case where we can know for
>certain (correct me if I'm wrong on this) that we'll only execute the
> >expression once, which means there's probably always a penalty for
> >choosing the implementation with setup costs over the default linear
> >scan through the array.
> >
>
> What do you mean by "simple expressions"? I'm not plpgsql expert and I
> see it mostly as a way to glue together SQL queries, but yeah - if we
> know a given ScalarArrayOpExpr will only be executed once, then we can
> disable this optimization for now.
>

a := a + 1

is translated to

SELECT $1 + 1

and the result is saved to the variable "a".

Queries like "SELECT $1 + 1" are simple expressions. They are evaluated
directly at the executor level - they skip SPI.

A simple expression has no FROM clause and has to return just one row. I am
not sure whether it is also required to return just one column, but I think
it is.

I am not sure whether the executor knows that an expression is being
executed as a simple expression, but probably it can be deduced from the
context.

Pavel

> >> >- Should we have a test for an operator with a non-strict function?
> >> >I'm not aware of any built-in ops that have that characteristic;
> >> >would you suggest just creating a fake one for the test?
> >> >
> >>
> >> Dunno, I haven't thought about this very much. In general I think the
> >> new code should simply behave the same as the current code, i.e. if
> >> it does not check for strictness, we don't need either.
> >
> >Both the original implementation and this optimized version have the
> >following short circuit condition:
> >
> >if (fcinfo->args[0].isnull && strictfunc)
> >
> >But I'm not sure there are any existing tests to show that the "&&
> >strictfunc" is required.
> >
>
> Ah! You mean a regression test, not a test in the "if condition" sense.
> I don't see a reason not to have such test, although it's probably not
> something we should require from this patch.
>
>
> regards
>
> --
> Tomas Vondra http://www.2ndQuadrant.com
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>
>
>
