Re: Avoid full GIN index scan when possible

From: Marc Cousin <cousinmarc(at)gmail(dot)com>
To: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Avoid full GIN index scan when possible
Date: 2019-07-01 15:56:17
Message-ID: eca8612b-e289-e2fc-c5df-84be46f75ec1@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 29/06/2019 00:23, Julien Rouhaud wrote:
> On Fri, Jun 28, 2019 at 10:16 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>
>> Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com> writes:
>>> On Fri, Jun 28, 2019 at 03:03:19PM -0400, Tom Lane wrote:
>>>> I not only don't want that function in indxpath.c, I don't even want
>>>> it to be known/called from there. If we need the ability for the index
>>>> AM to editorialize on the list of indexable quals (which I'm not very
>>>> convinced of yet), let's make an AM interface function to do it.
>>
>>> Wouldn't it be better to have a function that inspects a single qual and
>>> says whether it's "optimizable" or not? That could be part of the AM
>>> implementation, and we'd call it and it'd be us messing with the list.
>>
>> Uh ... we already determined that the qual is indexable (ie is a member
>> of the index's opclass), or allowed the index AM to derive an indexable
>> clause from it, so I'm not sure what you envision would happen
>> additionally there. If I understand what Julien is concerned about
>> --- and I may not --- it's that the set of indexable clauses *as a whole*
>> may have or lack properties of interest. So I'm thinking the answer
>> involves some callback that can do something to the whole list, not
>> qual-at-a-time. We've already got facilities for the latter case.
>
> Yes, the root issue here is that with gin it's entirely possible that
> "WHERE sometable.col op value1" is way more efficient than "WHERE
> sometable.col op value AND sometable.col op value2", where both qual
> are determined indexable by the opclass. The only way to avoid that
> is indeed to inspect the whole list, as done in this poor POC.
>
> This is a problem actually hit in production, and as far as I know
> there's no easy way from the application POV to prevent unexpected
> slowdown. Maybe Marc will have more details about the actual problem
> and how expensive such a case was compared to the normal ones.

Sorry for the delay...

Yes, quite easily, here is what we had (it's just a bit simplified, we have other criterions but I think it shows the problem):

rh2=> explain analyze select * from account_employee where typeahead like '%albert%';
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on account_employee (cost=53.69..136.27 rows=734 width=666) (actual time=15.562..35.044 rows=8957 loops=1)
Recheck Cond: (typeahead ~~ '%albert%'::text)
Rows Removed by Index Recheck: 46
Heap Blocks: exact=8919
-> Bitmap Index Scan on account_employee_site_typeahead_gin_idx (cost=0.00..53.51 rows=734 width=0) (actual time=14.135..14.135 rows=9011 loops=1)
Index Cond: (typeahead ~~ '%albert%'::text)
Planning time: 0.224 ms
Execution time: 35.389 ms
(8 rows)

rh2=> explain analyze select * from account_employee where typeahead like '%albert%' and typeahead like '%lo%';
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on account_employee (cost=28358.38..28366.09 rows=67 width=666) (actual time=18210.109..18227.134 rows=1172 loops=1)
Recheck Cond: ((typeahead ~~ '%albert%'::text) AND (typeahead ~~ '%lo%'::text))
Rows Removed by Index Recheck: 7831
Heap Blocks: exact=8919
-> Bitmap Index Scan on account_employee_site_typeahead_gin_idx (cost=0.00..28358.37 rows=67 width=0) (actual time=18204.756..18204.756 rows=9011 loops=1)
Index Cond: ((typeahead ~~ '%albert%'::text) AND (typeahead ~~ '%lo%'::text))
Planning time: 0.288 ms
Execution time: 18230.182 ms
(8 rows)

We noticed this because the application timed out for users searching someone whose name was 2 characters ( it happens :) ).

We reject such filters when it's the only criterion, as we know it's going to be slow, but ignoring it as a supplementary filter would be a bit weird.

Of course there is the possibility of filtering with two stages with a CTE, but that's not as great as having PostgreSQL doing it itself.

By the way, while preparing this, I noticed that it seems that during this kind of index scan, the interrupt signal is masked
for a very long time. Control-C takes a very long while to cancel the query. But it's an entirely different problem :)

Regards

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alvaro Herrera 2019-07-01 15:57:01 Re: Two pg_rewind patches (auto generate recovery conf and ensure clean shutdown)
Previous Message Alvaro Herrera 2019-07-01 15:40:09 Re: Error message inconsistency