Re: Why is GIN index slowing down my query?

From: Marc Mamin <M(dot)Mamin(at)intershop(dot)de>
To: AlexK987 <alex(dot)cue(dot)987(at)gmail(dot)com>
Cc: "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Why is GIN index slowing down my query?
Date: 2015-02-02 09:17:17
Message-ID: B6F6FD62F2624C4C9916AC0175D56D8828B5A03B@jenmbs01.ad.intershop.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

AlexK987 <alex(dot)cue(dot)987(at)gmail(dot)com> writes:
>> I've created a GIN index on an INT[] column, but it slows down the selects.
>> Here is my table:
>
>> create table talent(person_id INT NOT NULL,
>> skills INT[] NOT NULL);
>
>> insert into talent(person_id, skills)
>> select generate_series, array[0, 1] || generate_series
>> from generate_series(3, 1048575);
>
>> create index talent_skills on talent using gin(skills);
>
>> analyze talent;
>
>> Here is my select:
>
>> explain analyze
>> select * from talent
>> where skills <@ array[1, 15]
>
>Well, that's pretty much going to suck given that data distribution.
>Since "1" is a member of every last entry, the GIN scan will end up
>examining every entry, and then rejecting all of them as not being
>true subsets of [1,15].

This is equivalent and fast:

explain analyze
WITH rare AS (
select * from talent
where skills @> array[15])
select * from rare
where skills @> array[1]
-- (with changed operator)

You might variate your query according to an additional table that keeps the occurrence count of all skills.
Not really pretty though.

regards,

Marc Mamin

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Marc Mamin 2015-02-02 10:31:10 Re: Why is GIN index slowing down my query?
Previous Message Christian Weyer 2015-02-02 06:31:24 Re: Unexpected (bad) performance when querying indexed JSONB column