Re: Re: Planner chooses slow index heap scan despite accurate row estimates

From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Jake Magner <jakemagner90(at)gmail(dot)com>
Cc: "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Re: Planner chooses slow index heap scan despite accurate row estimates
Date: 2016-05-30 19:34:35
Message-ID: CAMkU=1w80Cz9h3er1gdaDi4JTzX-7D6qfrCp6th41B936OAM9A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Sat, May 28, 2016 at 5:38 PM, Jake Magner <jakemagner90(at)gmail(dot)com> wrote:
> Tom Lane-2 wrote
>> Jake Magner &lt;
>
>> jakemagner90@
>
>> &gt; writes:
>>> I tried without doing an INSERT at all, just running the SELECT queries
>>> and
>>> the result is the same. Nested loop is chosen but is much slower.
>>
>> FWIW, I just noticed that the comparisons you're using are plain equality
>> of the arrays. While a GIN array index supports that, it's not exactly
>> its strong suit: the sort of questions that index type supports well are
>> more like "which arrays contain value X?". I wonder if it'd be worth
>> creating btree indexes on the array column.
>
> I added btree indexes and now the nested loop uses those and is a bit faster
> than the hash join. So the planner just misestimates the cost of doing the
> equality comparisons?

I wonder how it would do in 9.4? Either in them actually being
faster, or the planner doing
a better job of realizing they won't be fast.

> I'd prefer not to add more indexes, the hash join
> performance is fast enough if it would just choose that but I'm reluctant to
> turn off nested loops in case the table gets a lot bigger.

A large hash join just needs to divide it up into batches. It should
still be faster than the nested loop (as currently implemented) ,
until you run out of temp space.

But, you already have a solution in hand. I agree you shouldn't add
more indexes without reason, but you do have a reason.

Cheers,

Jeff

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Jeff Janes 2016-05-30 20:05:41 Re: similarity and operator '%'
Previous Message David G. Johnston 2016-05-30 18:20:33 Re: similarity and operator '%'