Quick Links

Re: Slow "not in array" operation

From:	Marco Colli <collimarco91(at)gmail(dot)com>
To:	Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
Cc:	Michael Lewis <mlewis(at)entrata(dot)com>, pgsql-performance(at)lists(dot)postgresql(dot)org
Subject:	Re: Slow "not in array" operation
Date:	2019-11-13 16:15:43
Message-ID:	CAFvCgN7khSvn_pZim3PQ9E5cC5hq4AOnFV5j110EiBn7et=24Q@mail.gmail.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-performance

Wow! Thank you very much Jeff!! I am really grateful.

Thanks to the btree (instead of gin) the query now takes about 500ms
instead of 70s.

Il Mer 13 Nov 2019, 13:18 Jeff Janes <jeff(dot)janes(at)gmail(dot)com> ha scritto:

> On Wed, Nov 13, 2019 at 6:56 AM Marco Colli <collimarco91(at)gmail(dot)com>
> wrote:
>
>> > the answer is that is because it is a GIN index. Make the same index
>> only as btree, and you should get good performance as it can filter the
>> tags within a given project without visiting the table.
>>
>> Currently I have this GIN index:
>> "index_subscriptions_on_project_id_and_tags" gin (project_id, tags)
>> WHERE trashed_at IS NULL
>>
>>
> Multicolumn GIN indexes are nearly worthless IMO when one column is a
> scalar. You can use this index, but it won't be better than one just on
> "GIN (tags) trashed_at IS NULL". An N-column GIN index is mostly the same
> thing as N single column GIN indexes.
>
>
>> It uses the btree_gin extension and works perfectly for tag search,
>> except for the "NOT" operator. I don't understand why it doesn't use the
>> GIN index also for the "NOT" operator.
>>
>
> Because it can't. Tom already did a good job of describing that. Can you
> describe what steps you think an index should take to jump to the specific
> rows which fail to exist in an inverted index?
>
>
> The problem is that I cannot create the same index with BTree, because PG
>> doesn't support BTree on array :(
>>
>
> Sure it does. It can't jump to specific parts of the index based on the
> array containment operators, but it can use them for in-index filtering
> (but only if you can do an index-only scan). And really, that is probably
> all you need to get > 100x improvement.
>
> Are you getting an error when you try to build it? If so, what is the
> error?
>
> Cheers,
>
> Jeff
>
>>

In response to

Re: Slow "not in array" operation at 2019-11-13 12:18:21 from Jeff Janes

Browse pgsql-performance by date

	From	Date	Subject
Next Message	Luís Roberto Weck	2019-11-13 20:16:44	Parallel Query
Previous Message	Jeff Janes	2019-11-13 12:18:21	Re: Slow "not in array" operation