Re: Avoid full GIN index scan when possible

From: Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Julien Rouhaud <rjuju123(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Nikita Glukhov <n(dot)gluhov(at)postgrespro(dot)ru>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Marc Cousin <cousinmarc(at)gmail(dot)com>
Subject: Re: Avoid full GIN index scan when possible
Date: 2020-01-14 22:47:30
Message-ID: CAPpHfdt8vCETJy9xMAvv1QQbOnNUfMfY8-pg+XGqJ-VWq4TNmg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi!

On Tue, Jan 14, 2020 at 9:43 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru> writes:
> > Updated patch is attached. It contains more comments as well as commit message.
>
> I reviewed this a little bit. I agree this seems way more straightforward
> than the patches we've been considering so far. I wasn't too happy with
> the state of the comments, so I rewrote them a bit in the attached v13.

Thank you!

> One thing I'm still not happy about is the comment in
> collectMatchesForHeapRow. v12 failed to touch that at all, so I tried to
> fill it in, but I'm not sure if my explanation is good.

I've tried to rephrase this comment making it better from my point of
view. It's hard for me to be sure about this, since I'm not native
English speaker. I'd like you to take a look on it.

> Also, if we know
> that excludeOnly keys are going to be ignored, can we save any work in
> the main loop of that function?

It doesn't look so for me. We still need to collect matches for
consistent function call afterwards. We may skip calling consistent
function for excludeOnly keys by forcing a recheck. But that's not
going to be a plain win.

I thought about different optimization. We now check for at least one
matching entry. Can we check for at least one *required* entry? It
seems we can save some consistent function calls.

> The test cases needed some work too. Notably, some of the places where
> you tried to use EXPLAIN ANALYZE are unportable because they expose "Heap
> Blocks" counts that are not stable. (I checked the patch on a 32-bit
> machine and indeed some of these failed.) While it'd be possible to work
> around that by filtering the EXPLAIN output, that would not be any simpler
> or faster than our traditional style of just doing a plain EXPLAIN and a
> separate execution.

Thanks!

> It troubles me a bit as well that the test cases don't really expose
> any difference between patched and unpatched code --- I checked, and
> they "passed" without applying any of the code changes. Maybe there's
> not much to be done about that, since after all this is an optimization
> that's not supposed to change any query results.

Yep, it seems like we can't do much in this field unless we're going
to expose too much internals at user level.

I also had concerns about how excludeOnly keys work with lossy pages.
I didn't find exact error. But I've added code, which skips
excludeOnly keys checks for lossy pages. They aren't going to exclude
any lossy page anyway. So, we can save some resources by skipping
this.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachment Content-Type Size
0001-Avoid-GIN-full-scan-for-empty-ALL-keys-v14.patch application/octet-stream 32.6 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andrew Dunstan 2020-01-14 22:50:32 Re: Unicode escapes with any backend encoding
Previous Message Andres Freund 2020-01-14 22:45:20 Re: Why is pq_begintypsend so slow?