Re: analyze causes query planner to choose suboptimal plan for a select query in separate transaction

From: Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>
To: Martin Kováčik <kovacik(at)redbyte(dot)eu>, Michael Lewis <mlewis(at)entrata(dot)com>
Cc: PostgreSQL General <pgsql-general(at)lists(dot)postgresql(dot)org>
Subject: Re: analyze causes query planner to choose suboptimal plan for a select query in separate transaction
Date: 2019-04-25 19:58:16
Message-ID: 5918b32f-9461-ac2a-77c2-fb03d091e928@aklaver.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On 4/25/19 12:47 PM, Martin Kováčik wrote:
> As my example shows you don't have to import a lot of rows - 1000 is
> enough to make a difference - it all depends on the query. When a
> cartesian product is involved only a few records is enough.
> I think that stats should be MVCC versioned otherwise the planner is
> using wrong statistics and chooses wrong plans.

Then you are looking at moving the choke point to looking up the correct
stats across possibly hundreds/thousands of transactions in flight.

> *Martin Kováčik*
> /CEO/
> *redByte*, s.r.o.
> +421 904 236 791
> kovacik(at)redbyte(dot)eu <mailto:kovacik(at)redbyte(dot)eu>, www.redbyte.eu
> <http://redbyte.eu>
>
>
> On Thu, Apr 25, 2019 at 9:28 PM Michael Lewis <mlewis(at)entrata(dot)com
> <mailto:mlewis(at)entrata(dot)com>> wrote:
>
>
>
> On Thu, Apr 25, 2019, 11:34 AM Martin Kováčik <kovacik(at)redbyte(dot)eu
> <mailto:kovacik(at)redbyte(dot)eu>> wrote:
>
> Turning off autovacuum for the tests is a valid option and I
> will definitely do this as a workaround. Each test pretty much
> starts with empty schema and data for it is generated during the
> run and rolled back at the end. I have a lot of tests and at the
> moment it is not feasible to modify them.
>
> The real workload for the application is different, but there
> are some cases, when we import data from remote web service in a
> transaction do some work with it and then we do a commit. If
> there is an autovacuum during this process I assume there will
> be similar problem regarding planner statistics.
>
>
> Unless you are importing a huge amount of data relative to what is
> already there, it seems likely to be significantly less impactful
> than adding data to a completely empty table. The stats on a table
> with 0 rows and then 5000 rows is going to be night and day, while
> the difference between stats on 100,000 rows and 105,000 is not as
> impactful. Musing here. I expect others will chime in.
>
> Stats are not versioned with MVCC so it would expected that a commit
> in another transaction that is updating stats would influence the
> query plan for another transaction that is active.
>

--
Adrian Klaver
adrian(dot)klaver(at)aklaver(dot)com

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Martin Kováčik 2019-04-25 20:19:33 Re: analyze causes query planner to choose suboptimal plan for a select query in separate transaction
Previous Message Piotr Findeisen 2019-04-25 19:52:51 Is _<typename> a supported way to create a column of array type?