Re: Query became very slow after 9.6 -> 10 upgrade

From: Dmitry Shalashov <skaurus(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Query became very slow after 9.6 -> 10 upgrade
Date: 2017-11-23 14:58:23
Message-ID: CAKPeCUEy6HMm=Kn=V82xjycZf9KYnqzPLVQ8ngjoLXvO97zx4g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers pgsql-performance

We tried to apply the patch on 10.1 source, but something is wrong it seems:

patch -p1 < ../1.patch
(Stripping trailing CRs from patch; use --binary to disable.)
patching file src/backend/optimizer/plan/analyzejoins.c
(Stripping trailing CRs from patch; use --binary to disable.)
patching file src/backend/utils/adt/selfuncs.c
Hunk #1 succeeded at 3270 (offset -91 lines).
Hunk #2 succeeded at 3304 (offset -91 lines).
Hunk #3 succeeded at 3313 (offset -91 lines).
Hunk #4 succeeded at 3393 (offset -91 lines).
patch unexpectedly ends in middle of line
Hunk #5 succeeded at 3570 with fuzz 1 (offset -91 lines).

Dmitry Shalashov, relap.io & surfingbird.ru

2017-11-23 2:07 GMT+03:00 Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>:

> Dmitry Shalashov <skaurus(at)gmail(dot)com> writes:
> > Turns out we had not 9.6 but 9.5.
>
> I'd managed to reproduce the weird planner behavior locally in the
> regression database:
>
> regression=# create table foo (f1 int[], f2 int);
> CREATE TABLE
> regression=# explain select * from tenk1 where unique2 in (select distinct
> unnest(f1) from foo where f2=1);
> QUERY PLAN
> ------------------------------------------------------------
> -----------------------
> Nested Loop (cost=30.85..80.50 rows=6 width=244)
> -> HashAggregate (cost=30.57..30.63 rows=6 width=4)
> Group Key: (unnest(foo.f1))
> -> HashAggregate (cost=30.42..30.49 rows=6 width=4)
> Group Key: unnest(foo.f1)
> -> ProjectSet (cost=0.00..28.92 rows=600 width=4)
> -> Seq Scan on foo (cost=0.00..25.88 rows=6
> width=32)
> Filter: (f2 = 1)
> -> Index Scan using tenk1_unique2 on tenk1 (cost=0.29..8.30 rows=1
> width=244)
> Index Cond: (unique2 = (unnest(foo.f1)))
> (10 rows)
>
> Digging into it, the reason for the duplicate HashAggregate step was that
> query_supports_distinctness() punted on SRFs-in-the-targetlist, basically
> on the argument that it wasn't worth extra work to handle that case.
> Thinking a bit harder, it seems to me that the correct analysis is:
> 1. If we are proving distinctness on the grounds of a DISTINCT clause,
> then it doesn't matter whether there are any SRFs, because DISTINCT
> removes duplicates after tlist SRF expansion.
> 2. But tlist SRFs break the ability to prove distinctness on the grounds
> of GROUP BY, unless all of them are within grouping columns.
> It still seems like detecting the second case is harder than it's worth,
> but we can trivially handle the first case, with little more than some
> code rearrangement.
>
> The other problem is that the output rowcount of the sub-select (ie, of
> the HashAggregate) is being estimated as though the SRF weren't there.
> This turns out to be because estimate_num_groups() doesn't consider the
> possibility of SRFs in the grouping columns. It never has, but in 9.6 and
> before the problem was masked by the fact that grouping_planner scaled up
> the result rowcount by tlist_returns_set_rows() *after* performing
> grouping. Now we're effectively doing that in the other order, which is
> more correct, but that means estimate_num_groups() has to apply some sort
> of adjustment. I suggest that it just multiply its old estimate by the
> maximum of the SRF expansion counts. That's likely to be an overestimate,
> but it's really hard to do better without specific knowledge of the
> individual SRF's behavior.
>
> In short, I propose the attached fixes. I've checked this and it seems
> to fix Dmitry's original problem according to the test case he sent
> off-list.
>
> regards, tom lane
>
>

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Eisentraut 2017-11-23 15:01:32 documentation is now XML
Previous Message REIX, Tony 2017-11-23 14:27:17 RE:PostgreSLQ v10.1 and xlC compiler on AIX

Browse pgsql-performance by date

  From Date Subject
Next Message Tom Lane 2017-11-23 17:00:40 Re: Query became very slow after 9.6 -> 10 upgrade
Previous Message Henrik Cednert (Filmlance) 2017-11-23 09:26:16 Re: pg_dump 3 times as slow after 8.4 -> 9.5 upgrade