Re: Binary search in ScalarArrayOpExpr for OR'd constant arrays

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: James Coleman <jtc331(at)gmail(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Binary search in ScalarArrayOpExpr for OR'd constant arrays
Date: 2020-04-23 12:47:00
Message-ID: 20200423124700.xclyoq7qxtlvjvxs@development
Lists: pgsql-hackers

On Mon, Apr 20, 2020 at 09:27:34PM -0400, James Coleman wrote:
>Over in "execExprInterp() questions / How to improve scalar array op
>expr eval?" [1] I'd mused about how we might be able to optimize
>scalar array ops with OR'd semantics.
>
>This patch implements a binary search for such expressions when the
>array argument is a constant so that we can avoid needing to teach
>expression execution to cache stable values or know when a param has
>changed.
>
>The speed-up for the target case can be pretty impressive: in my
>admittedly contrived and relatively unscientific test, a query of
>the form:
>
>select count(*) from generate_series(1,100000) n(i) where i in (<1000
>random integers in the series>)
>
>shows ~30ms for the patch versus ~640ms on master.
>

Nice improvement, although 1000 items is probably a bit unusual. The
threshold used in the patch (9 elements) seems a bit too low - what
results have you seen with smaller arrays?
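
To make the threshold question concrete, here is a minimal standalone
sketch, not taken from the patch: a membership test that scans linearly
below a cutoff and uses the C library's bsearch() at or above it. The
names BSEARCH_MIN_ELEMS, in_list_prepare and in_list_contains are
hypothetical; only the value 9 comes from the discussion above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical cutoff; 9 is the value mentioned for the patch. */
#define BSEARCH_MIN_ELEMS 9

/* Three-way integer comparison without overflow. */
int
cmp_int(const void *a, const void *b)
{
    int x = *(const int *) a;
    int y = *(const int *) b;

    return (x > y) - (x < y);
}

/* Done once for a constant array: sort only if we will binary search. */
void
in_list_prepare(int *vals, size_t n)
{
    if (n >= BSEARCH_MIN_ELEMS)
        qsort(vals, n, sizeof(int), cmp_int);
}

/* Done per row: linear scan for short arrays, binary search otherwise. */
bool
in_list_contains(const int *vals, size_t n, int key)
{
    assert(vals != NULL || n == 0);

    if (n < BSEARCH_MIN_ELEMS)
    {
        for (size_t i = 0; i < n; i++)
        {
            if (vals[i] == key)
                return true;
        }
        return false;
    }

    return bsearch(&key, vals, n, sizeof(int), cmp_int) != NULL;
}
```

Timing in_list_contains() with different values of BSEARCH_MIN_ELEMS
over arrays of, say, 4 to 64 elements would be one way to see where the
crossover actually is on a given machine.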

Another idea - would a Bloom filter be useful here, as a second
optimization? That is, for large arrays build a small Bloom filter,
allowing us to skip even the binary search.
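
A rough standalone sketch of that idea, under stated assumptions: the
filter size (8192 bits), the two probes per key, and the mixing function
(the murmur3 32-bit finalizer) are arbitrary choices for illustration,
and Bloom, IntSet and the function names are hypothetical. None of this
is PostgreSQL code. A negative answer from the filter is definitive, so
the binary search runs only when the filter says "maybe".

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define BLOOM_BITS 8192         /* assumed size; must be a power of two */

typedef struct Bloom
{
    uint8_t bits[BLOOM_BITS / 8];
} Bloom;

typedef struct IntSet
{
    Bloom    bloom;
    int32_t *vals;              /* sorted ascending by int_set_init */
    size_t   n;
} IntSet;

/* murmur3 32-bit finalizer, seeded to derive two probe positions */
uint32_t
mix32(uint32_t x, uint32_t seed)
{
    x ^= seed;
    x ^= x >> 16;
    x *= 0x85ebca6bU;
    x ^= x >> 13;
    x *= 0xc2b2ae35U;
    x ^= x >> 16;
    return x;
}

void
bloom_add(Bloom *b, int32_t v)
{
    for (uint32_t seed = 1; seed <= 2; seed++)
    {
        uint32_t h = mix32((uint32_t) v, seed) & (BLOOM_BITS - 1);

        b->bits[h >> 3] |= (uint8_t) (1u << (h & 7));
    }
}

/* false means "definitely absent"; true means "possibly present" */
bool
bloom_maybe_contains(const Bloom *b, int32_t v)
{
    for (uint32_t seed = 1; seed <= 2; seed++)
    {
        uint32_t h = mix32((uint32_t) v, seed) & (BLOOM_BITS - 1);

        if ((b->bits[h >> 3] & (1u << (h & 7))) == 0)
            return false;
    }
    return true;
}

int
cmp_int32(const void *a, const void *b)
{
    int32_t x = *(const int32_t *) a;
    int32_t y = *(const int32_t *) b;

    return (x > y) - (x < y);
}

/* Sorts vals in place and builds the filter; done once per constant array. */
void
int_set_init(IntSet *s, int32_t *vals, size_t n)
{
    memset(&s->bloom, 0, sizeof(s->bloom));
    qsort(vals, n, sizeof(int32_t), cmp_int32);
    for (size_t i = 0; i < n; i++)
        bloom_add(&s->bloom, vals[i]);
    s->vals = vals;
    s->n = n;
}

/* Per-row lookup: the filter rejects most misses before the binary search. */
bool
int_set_contains(const IntSet *s, int32_t key)
{
    if (!bloom_maybe_contains(&s->bloom, key))
        return false;
    return bsearch(&key, s->vals, s->n, sizeof(int32_t), cmp_int32) != NULL;
}
```

Whether this wins depends on the hit rate: keys that are present always
pay for both the filter probes and the binary search, so the filter only
helps when most lookups miss.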

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
