Re: Optimizer choosing the wrong plan

From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Viswanath <M(dot)Viswanath16(at)gmail(dot)com>
Cc: "pgsql-performa(dot)" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Optimizer choosing the wrong plan
Date: 2018-11-26 15:53:24
Message-ID: CAMkU=1w2MDqZOvwqeB56vPdc1b0JOsBwPx3xcA8MvF4vzQLdZQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Mon, Nov 26, 2018 at 5:11 AM Viswanath <M(dot)Viswanath16(at)gmail(dot)com> wrote:

> *Postgres server version - 9.5.10*
> *RAM - 128 GB*
> *WorkMem 64 MB*
>
> *Problematic query with explain :*
> *Query 1 (original):*
> explain analyse SELECT myTable1.ID FROM myTable1 LEFT JOIN myTable2 ON
> myTable1.ID=myTable2.ID WHERE ((((myTable1.bool_val = true) AND
> (myTable1.small_intval IN (1,2,3))) AND ((*myTable2.bigint_val = 1*) AND
> (myTable1.bool_val = true))) AND (((myTable1.ID >= 1000000000000) AND
> (myTable1.ID <= 1999999999999)) )) ORDER BY 1 DESC , 1 NULLS FIRST LIMIT
> 11;
>

There is no point doing a LEFT JOIN when the NULL-extended rows get
filtered out later.

Also, ordering by the same column twice is peculiar, to say the least.

> The table myTable2 contains *12701952* entries. Out of which only *86227*
> is
> not null and *146* entries are distinct.
>

I assume you mean the column myTable2.ID has that many not null and
distinct?

>
> The above query returns 0 rows since 'myTable2.bigint_val = 1' criteria
> satisfies nothing. It takes 6 seconds for execution as the planner chooses*
> myTable1.ID column's index*.

More importantly, it chooses the index on myTable2.ID. It does also use
the index on myTable1.ID, but that is secondary.

The ideal index for this query would probably be a composite index on
myTable2 (bigint_val, id DESC);
The planner would probably choose to use that index, even if the statistics
are off.

I tried running *vacuum analyse* table many times, tried changing the
> *statistics target of the column to 250 (since there are only 149 distinct
> values)*. But none worked out. The planner thinks that there are *1727*
> rows
> that matches the condition *myTable2.bigint_val = 1* but there are none.
>

It would interesting if you can upgrade a copy of your server to v11 and
try it there. We made changes to ANALYZE in that version which were
intended to improve this situation, and it would be nice to know if it
actually did so for your case.

Also, increasing statistics target even beyond 250 might help. If every
one of the observed value is seen at least twice, it will trigger the
system to assume that it has observed all distinct values that exist. But
if any of the values are seen exactly once, that causes it to take a
different path (the one which got modified in v11).

Cheers,

Jeff

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Jakub Glapa 2018-11-26 16:00:30 Re: dsa_allocate() faliure
Previous Message Justin Pryzby 2018-11-26 15:52:08 Re: dsa_allocate() faliure