Re: Avoid full GIN index scan when possible

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>
Cc: Julien Rouhaud <rjuju123(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Nikita Glukhov <n(dot)gluhov(at)postgrespro(dot)ru>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Marc Cousin <cousinmarc(at)gmail(dot)com>
Subject: Re: Avoid full GIN index scan when possible
Date: 2020-01-14 18:43:11
Message-ID: 20952.1579027391@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru> writes:
> Updated patch is attached. It contains more comments as well as commit message.

I reviewed this a little bit. I agree this seems way more straightforward
than the patches we've been considering so far. I wasn't too happy with
the state of the comments, so I rewrote them a bit in the attached v13.

One thing I'm still not happy about is the comment in
collectMatchesForHeapRow. v12 failed to touch that at all, so I tried to
fill it in, but I'm not sure if my explanation is good. Also, if we know
that excludeOnly keys are going to be ignored, can we save any work in
the main loop of that function?

The test cases needed some work too. Notably, some of the places where
you tried to use EXPLAIN ANALYZE are unportable because they expose "Heap
Blocks" counts that are not stable. (I checked the patch on a 32-bit
machine and indeed some of these failed.) While it'd be possible to work
around that by filtering the EXPLAIN output, that would not be any simpler
or faster than our traditional style of just doing a plain EXPLAIN and a
separate execution.

It troubles me a bit as well that the test cases don't really expose
any difference between patched and unpatched code --- I checked, and
they "passed" without applying any of the code changes. Maybe there's
not much to be done about that, since after all this is an optimization
that's not supposed to change any query results.

I didn't repeat any of the performance testing --- it seems fairly
clear that this can't make any cases worse.

Other than the collectMatchesForHeapRow issue, I think this is
committable.

regards, tom lane

Attachment Content-Type Size
0001-Avoid-GIN-full-scan-for-empty-ALL-keys-v13.patch text/x-diff 30.0 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Stephen Frost 2020-01-14 18:46:57 Re: our checks for read-only queries are not great
Previous Message David Fetter 2020-01-14 18:33:12 Re: backup manifests