Re: analyze causes query planner to choose suboptimal plan for a select query in separate transaction

From: Martin Kováčik <kovacik(at)redbyte(dot)eu>
To: Michael Lewis <mlewis(at)entrata(dot)com>
Cc: PostgreSQL General <pgsql-general(at)lists(dot)postgresql(dot)org>
Subject: Re: analyze causes query planner to choose suboptimal plan for a select query in separate transaction
Date: 2019-04-25 19:47:39
Message-ID: CABp97hNswjPXqoC4nDX2_Jvc3f59BBq+VUjsm95+Yw41KRD87w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

As my example shows you don't have to import a lot of rows - 1000 is enough
to make a difference - it all depends on the query. When a cartesian
product is involved only a few records is enough.
I think that stats should be MVCC versioned otherwise the planner is using
wrong statistics and chooses wrong plans.
*Martin Kováčik*
*CEO*
*redByte*, s.r.o.
+421 904 236 791
kovacik(at)redbyte(dot)eu, www.redbyte.eu <http://redbyte.eu>

On Thu, Apr 25, 2019 at 9:28 PM Michael Lewis <mlewis(at)entrata(dot)com> wrote:

>
>
> On Thu, Apr 25, 2019, 11:34 AM Martin Kováčik <kovacik(at)redbyte(dot)eu> wrote:
>
>> Turning off autovacuum for the tests is a valid option and I will
>> definitely do this as a workaround. Each test pretty much starts with empty
>> schema and data for it is generated during the run and rolled back at the
>> end. I have a lot of tests and at the moment it is not feasible to modify
>> them.
>>
>> The real workload for the application is different, but there are some
>> cases, when we import data from remote web service in a transaction do some
>> work with it and then we do a commit. If there is an autovacuum during this
>> process I assume there will be similar problem regarding planner statistics.
>>
>
> Unless you are importing a huge amount of data relative to what is already
> there, it seems likely to be significantly less impactful than adding data
> to a completely empty table. The stats on a table with 0 rows and then 5000
> rows is going to be night and day, while the difference between stats on
> 100,000 rows and 105,000 is not as impactful. Musing here. I expect others
> will chime in.
>
> Stats are not versioned with MVCC so it would expected that a commit in
> another transaction that is updating stats would influence the query plan
> for another transaction that is active.
>
>>

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Piotr Findeisen 2019-04-25 19:52:51 Is _<typename> a supported way to create a column of array type?
Previous Message Tom Lane 2019-04-25 19:45:58 Re: analyze causes query planner to choose suboptimal plan for a select query in separate transaction