Re: analyze causes query planner to choose suboptimal plan for a select query in separate transaction

From: Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>
To: Martin Kováčik <kovacik(at)redbyte(dot)eu>, pgsql-general(at)lists(dot)postgresql(dot)org
Subject: Re: analyze causes query planner to choose suboptimal plan for a select query in separate transaction
Date: 2019-04-25 19:40:40
Message-ID: cd47c11e-5093-3c53-1150-637b964b50ba@aklaver.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On 4/25/19 7:37 AM, Martin Kováčik wrote:
> Hi group,
>

See comments inline below

> To illustrate my situation let's consider my tests look like this:
>
> BEGIN;
>
> -- A: insert data for the test
>
> -- B: i'll refer to this point later
>
> -- C: select(s)
>
> ROLLBACK;
>
> Everything is fine, until autovacuum (analyze) runs when the test is at
> point B. After that the query planner at point C chooses wrong plan and
> the query takes a long time to complete, blocking one CPU core for a
> long time. It seems like the planner statistics inside running
> transaction are affected by analyze task running outside of the
> transaction. In this case after running analyze (outside the
> transaction) when the transaction is at B, causes query planner to think
> there are no rows (because the inserts at point A were not yet committed).
>
> I did prepare a simple test case to reproduce this behavior:
>
> First you need to create a table:
>
> create table a (id bigint primary key);
>
> Then run this transaction:
>
> begin;
> insert into a
> select * from generate_series(1, 1000);
>
> -- during sleep execute analyze on this db in separate connection
> select pg_sleep(10);

analyze a;

On my machine that changes the time from:

29715.763 ms

to

291.765 ms

when running ANALYZE in the concurrent connection during the sleep.

>
> explain analyze select count(*) from (
> select distinct a1, a2, a3, a4 from a a1
> left join a a2 on a1.id <http://a1.id> > 900
> left join a a3 on a2.id <http://a2.id> = a3.id <http://a3.id>
> left join a a4 on a3.id <http://a3.id> = a4.id <http://a4.id>
> left join a a5 on a5.id <http://a5.id> = a4.id <http://a4.id>
> left join a a6 on a6.id <http://a6.id> = a5.id <http://a5.id>
> left join a a7 on a7.id <http://a7.id> = a6.id <http://a6.id>
> left join a a8 on a8.id <http://a8.id> = a7.id <http://a7.id>) temp;
>
> rollback;
>

--
Adrian Klaver
adrian(dot)klaver(at)aklaver(dot)com

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Tom Lane 2019-04-25 19:45:58 Re: analyze causes query planner to choose suboptimal plan for a select query in separate transaction
Previous Message Michael Lewis 2019-04-25 19:28:08 Re: analyze causes query planner to choose suboptimal plan for a select query in separate transaction