Re: Slow "not in array" operation

From: Marco Colli <collimarco91(at)gmail(dot)com>
To: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
Cc: Michael Lewis <mlewis(at)entrata(dot)com>, pgsql-performance(at)lists(dot)postgresql(dot)org
Subject: Re: Slow "not in array" operation
Date: 2019-11-13 16:15:43
Message-ID: CAFvCgN7khSvn_pZim3PQ9E5cC5hq4AOnFV5j110EiBn7et=24Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Wow! Thank you very much Jeff!! I am really grateful.

Thanks to the btree (instead of gin) the query now takes about 500ms
instead of 70s.

Il Mer 13 Nov 2019, 13:18 Jeff Janes <jeff(dot)janes(at)gmail(dot)com> ha scritto:

> On Wed, Nov 13, 2019 at 6:56 AM Marco Colli <collimarco91(at)gmail(dot)com>
> wrote:
>
>> > the answer is that is because it is a GIN index. Make the same index
>> only as btree, and you should get good performance as it can filter the
>> tags within a given project without visiting the table.
>>
>> Currently I have this GIN index:
>> "index_subscriptions_on_project_id_and_tags" gin (project_id, tags)
>> WHERE trashed_at IS NULL
>>
>>
> Multicolumn GIN indexes are nearly worthless IMO when one column is a
> scalar. You can use this index, but it won't be better than one just on
> "GIN (tags) trashed_at IS NULL". An N-column GIN index is mostly the same
> thing as N single column GIN indexes.
>
>
>> It uses the btree_gin extension and works perfectly for tag search,
>> except for the "NOT" operator. I don't understand why it doesn't use the
>> GIN index also for the "NOT" operator.
>>
>
> Because it can't. Tom already did a good job of describing that. Can you
> describe what steps you think an index should take to jump to the specific
> rows which fail to exist in an inverted index?
>
>
> The problem is that I cannot create the same index with BTree, because PG
>> doesn't support BTree on array :(
>>
>
> Sure it does. It can't jump to specific parts of the index based on the
> array containment operators, but it can use them for in-index filtering
> (but only if you can do an index-only scan). And really, that is probably
> all you need to get > 100x improvement.
>
> Are you getting an error when you try to build it? If so, what is the
> error?
>
> Cheers,
>
> Jeff
>
>>

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Luís Roberto Weck 2019-11-13 20:16:44 Parallel Query
Previous Message Jeff Janes 2019-11-13 12:18:21 Re: Slow "not in array" operation