Re: analyze causes query planner to choose suboptimal plan for a select query in separate transaction

From: Martin Kováčik <kovacik(at)redbyte(dot)eu>
To: Michael Lewis <mlewis(at)entrata(dot)com>
Cc: PostgreSQL General <pgsql-general(at)lists(dot)postgresql(dot)org>
Subject: Re: analyze causes query planner to choose suboptimal plan for a select query in separate transaction
Date: 2019-04-25 17:33:48
Message-ID: CABp97hPDdq9FN9RYjD+w1bHsk+iePFrXHQ09BNxjUz6H3Jk+EA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Turning off autovacuum for the tests is a valid option and I will
definitely do this as a workaround. Each test pretty much starts with empty
schema and data for it is generated during the run and rolled back at the
end. I have a lot of tests and at the moment it is not feasible to modify
them.

The real workload for the application is different, but there are some
cases, when we import data from remote web service in a transaction do some
work with it and then we do a commit. If there is an autovacuum during this
process I assume there will be similar problem regarding planner statistics.

The real problem here is that the statistics that are seen from one
(uncommited) transaction are affected by analyze statement from another
(commited) transaction. I've seen similar behavior in production for other
applications using PostgreSQL too (10.x and 11.x) - 100% CPU consumed by an
otherwise fine select query. I was puzzled by this behavior back then but
now it makes sense.
*Martin Kováčik*
*CEO*
*redByte*, s.r.o.
+421 904 236 791
kovacik(at)redbyte(dot)eu, www.redbyte.eu <http://redbyte.eu>

On Thu, Apr 25, 2019 at 5:26 PM Michael Lewis <mlewis(at)entrata(dot)com> wrote:

> I assume it is in the documentation, but I am not aware of how stats are
> handled for uncommitted work. Obviously in the example you provided the
> table would be empty, but in your real tests do they start out empty? Would
> it suffice to use temp tables created like the regular ones and analyze
> after insert to ensure stats are up to date? Or would it make sense to turn
> off auto-vacuum/analyze during these tests? The workload is not expected to
> be similar to real world I would think, and so you don't need the vacuum
> and the stats update perhaps.
>
>>

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Michael Lewis 2019-04-25 19:28:08 Re: analyze causes query planner to choose suboptimal plan for a select query in separate transaction
Previous Message Michael Lewis 2019-04-25 15:25:34 Re: analyze causes query planner to choose suboptimal plan for a select query in separate transaction