Re: Regarding performance regression on specific query

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>
Cc: "Jung, Jinho" <jinho(dot)jung(at)gatech(dot)edu>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Regarding performance regression on specific query
Date: 2018-11-24 20:32:41
Message-ID: 29815.1543091561@sss.pgh.pa.us
Lists: pgsql-hackers

Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp> writes:
> On 2018/11/20 2:49, Jung, Jinho wrote:
>> [ assorted queries ]

> I noticed that these two are fixed by running ANALYZE in the database in
> which these queries are run.

That didn't help much for me. What did help was increasing
join_collapse_limit and from_collapse_limit so that they no longer
limit the join search space. On queries with as many input relations
as these, unless you raise them you're really at the mercy of whether
the given query structure happens to represent a good join order.
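
To give a rough feel for why the join search space gets capped at all, here is a small sketch that counts join orders as the number of input relations grows. It is plain combinatorics, not a model of what PostgreSQL's planner actually enumerates (the planner uses dynamic programming and does not walk every tree); the function names are made up for this illustration.

```python
from math import factorial


def left_deep_orders(n: int) -> int:
    """Count left-deep join orders for n relations (one per permutation)."""
    return factorial(n)


def all_join_trees(n: int) -> int:
    """Count ordered binary join trees over n relations, bushy shapes included.

    Closed form: (2n - 2)! / (n - 1)!
    """
    return factorial(2 * n - 2) // factorial(n - 1)


if __name__ == "__main__":
    # Show how quickly the space grows with the number of relations.
    for n in (4, 8, 12, 16):
        print(f"{n:2d} relations: {left_deep_orders(n):>16,d} left-deep, "
              f"{all_join_trees(n):,d} total trees")
```

The growth is why collapsing every subquery and explicit JOIN into one big search problem costs planning time, and why not collapsing leaves the written join structure to decide much of the join order.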

In general I can't get even a little bit excited about the quality of the
plans selected for these examples, as they all involve made-up restriction
and join clauses that the planner isn't going to have the slightest clue
about. The observations boil down to "9.4 made one set of arbitrary plan
choices, while v10 made a different set of arbitrary plan choices, and on
these particular examples 9.4 got lucky and 10 didn't".

Possibly also worth noting is that running these in an empty database
is in itself kind of a worst case, because many of the tables are empty
to start with (or the restriction/join clauses pass no rows), and so
the fastest runtime tends to go to plans of the form "nestloop with
empty relation on the outside and all the expensive stuff on the
inside". (Observe all the "(never executed)" notations in the EXPLAIN
output.) This kind of plan wins only when the outer rel is actually
empty; otherwise it can easily lose big, and therefore PG's planner is
intentionally designed to discount the case entirely. We never believe
that a relation is empty, unless we can mathematically prove that, and
our cost estimates are never made with an eye to exploiting such cases.
This contributes a lot to the random-chance nature of which plan is
actually fastest; the planner isn't expecting "(never executed)" to
happen and doesn't prefer plans that will win if it does happen.
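
The "(never executed)" effect can be illustrated with a toy nested loop. This is only a sketch under simplifying assumptions: `CountingScan` and `nested_loop` are hypothetical names for this example, the inner side is rescanned in full for every outer row, and nothing here reflects the real executor or its cost model.

```python
from typing import Callable, Iterator, List, Tuple


class CountingScan:
    """A row source that records how many times it has been scanned."""

    def __init__(self, rows: List[int]) -> None:
        self.rows = rows
        self.scans = 0

    def scan(self) -> Iterator[int]:
        self.scans += 1
        return iter(self.rows)


def nested_loop(outer: CountingScan, inner: CountingScan,
                pred: Callable[[int, int], bool]) -> List[Tuple[int, int]]:
    """Naive nested-loop join: rescan the inner side once per outer row."""
    result = []
    for o in outer.scan():
        for i in inner.scan():
            if pred(o, i):
                result.append((o, i))
    return result


if __name__ == "__main__":
    expensive = CountingScan(list(range(10_000)))

    # Empty outer relation: the expensive inner side is never touched.
    nested_loop(CountingScan([]), expensive, lambda o, i: o == i)
    print("inner scans with empty outer:", expensive.scans)

    # Non-empty outer relation: the same plan shape rescans the inner
    # side once for every outer row.
    nested_loop(CountingScan(list(range(100))), expensive, lambda o, i: o == i)
    print("inner scans with 100 outer rows:", expensive.scans)
```

With an empty outer side the plan is nearly free no matter how costly the inner side is; with even a modest number of outer rows the same shape pays the full inner cost repeatedly, which is the case the planner assumes.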

regards, tom lane
