Re: Query with high planning time at version 11.1 compared versions 10.5 and 11.0

From: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Justin Pryzby <pryzby(at)telsasoft(dot)com>
Cc: Sanyo Moura <sanyo(dot)moura(at)tatic(dot)net>, jeff(dot)janes(at)gmail(dot)com, David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, pgsql-performance(at)lists(dot)postgresql(dot)org
Subject: Re: Query with high planning time at version 11.1 compared versions 10.5 and 11.0
Date: 2018-12-06 04:50:39
Message-ID: d74a1e08-f41a-922b-1531-884c48bca8bd@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers pgsql-performance

Hi,

On 2018/12/05 6:55, Alvaro Herrera wrote:
> I noticed another interesting thing, which is that if I modify the query
> to actually reference some partition that I do have (as opposed to the
> above, which just takes 30s to prune everything) the plan is mighty
> curious ... if only because in one of the Append nodes, partitions have
> not been pruned as they should.
>
> So, at least two bugs here,
> 1. the equivalence-class related slowness,
> 2. the lack of pruning
>
> QUERY PLAN
> ─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
> Hash Join (cost=1159.13..25423.65 rows=1 width=24)
> Hash Cond: (abs((p.plusalesprice - p_875.plusalesprice)) = (max(abs((p_877.plusalesprice - p_879.plusalesprice)))))
> -> Nested Loop (cost=1000.00..25264.52 rows=1 width=20)
> Join Filter: ((p.loccd = p_875.loccd) AND (p.fecha = p_875.fecha))
> -> Gather (cost=1000.00..25154.38 rows=875 width=16)
> Workers Planned: 2
> -> Parallel Append (cost=0.00..24066.88 rows=875 width=16)
> -> Parallel Seq Scan on precio_125 p (cost=0.00..27.50 rows=1 width=16)
> Filter: ((fecha >= '1990-05-06 00:00:00'::timestamp without time zone) AND (fecha <= '1999-05-07 00:00:00'::timestamp without time zone) AND (pluid = 2))

[ Parallel SeqScan on precio_126 to precio_998 ]

> -> Parallel Seq Scan on precio_999 p_874 (cost=0.00..27.50 rows=1 width=16)
> Filter: ((fecha >= '1990-05-06 00:00:00'::timestamp without time zone) AND (fecha <= '1999-05-07 00:00:00'::timestamp without time zone) AND (pluid = 2))

As you can see from the "Filter: " property above, the baserestrictinfo of
this Append's parent relation is:

BETWEEN '1990-05-06' AND '1999-05-07'

which selects partitions for all days from '1990-05-06' (precio_125) up to
'1992-09-26' (precio_999).

> -> Materialize (cost=0.00..79.52 rows=2 width=16)
> -> Append (cost=0.00..79.51 rows=2 width=16)
> -> Seq Scan on precio_125 p_875 (cost=0.00..39.75 rows=1 width=16)
> Filter: ((fecha >= '1990-05-06 00:00:00'::timestamp without time zone) AND (fecha <= '1990-05-07 00:00:00'::timestamp without time zone) AND (pluid = 2))
> -> Seq Scan on precio_126 p_876 (cost=0.00..39.75 rows=1 width=16)
> Filter: ((fecha >= '1990-05-06 00:00:00'::timestamp without time zone) AND (fecha <= '1990-05-07 00:00:00'::timestamp without time zone) AND (pluid = 2))

Whereas for this Append, it is BETWEEN '1990-05-06' AND '1990-05-07'.

> -> Hash (cost=159.12..159.12 rows=1 width=4)
> -> Aggregate (cost=159.10..159.11 rows=1 width=4)
> -> Nested Loop (cost=0.00..159.10 rows=1 width=8)
> Join Filter: ((p_877.loccd = p_879.loccd) AND (p_877.fecha = p_879.fecha))
> -> Append (cost=0.00..79.51 rows=2 width=16)
> -> Seq Scan on precio_125 p_877 (cost=0.00..39.75 rows=1 width=16)
> Filter: ((fecha >= '1990-05-06 00:00:00'::timestamp without time zone) AND (fecha <= '1990-05-07 00:00:00'::timestamp without time zone) AND (pluid = 2))
> -> Seq Scan on precio_126 p_878 (cost=0.00..39.75 rows=1 width=16)
> Filter: ((fecha >= '1990-05-06 00:00:00'::timestamp without time zone) AND (fecha <= '1990-05-07 00:00:00'::timestamp without time zone) AND (pluid = 2))
> -> Materialize (cost=0.00..79.52 rows=2 width=16)
> -> Append (cost=0.00..79.51 rows=2 width=16)
> -> Seq Scan on precio_125 p_879 (cost=0.00..39.75 rows=1 width=16)
> Filter: ((fecha >= '1990-05-06 00:00:00'::timestamp without time zone) AND (fecha <= '1990-05-07 00:00:00'::timestamp without time zone) AND (pluid = 2))
> -> Seq Scan on precio_126 p_880 (cost=0.00..39.75 rows=1 width=16)
> Filter: ((fecha >= '1990-05-06 00:00:00'::timestamp without time zone) AND (fecha <= '1990-05-07 00:00:00'::timestamp without time zone) AND (pluid = 2))

And also for these two Appends.

So, I don't think there's anything funny going on with pruning here, maybe
just a typo in the query (1999 looks very much like 1990 to miss the typo
maybe.) I fixed the query to change '1999-05-07' to '1990-05-07' of the
first Append's parent relation and I get the following planning time with
the patch I posted above with 2 partitions selected under each Append as
expected.

Planning Time: 536.947 ms
Execution Time: 1.304 ms
(31 rows)

Even without changing 1999 to 1990, the planning time with the patch is:

Planning Time: 4669.685 ms
Execution Time: 110.506 ms
(1777 rows)

Thanks,
Amit

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Kyotaro HORIGUCHI 2018-12-06 04:55:46 Re: Use durable_unlink for .ready and .done files for WAL segment removal
Previous Message Amit Langote 2018-12-06 04:46:34 Re: error message when subscription target is a partitioned table

Browse pgsql-performance by date

  From Date Subject
Next Message Amit Langote 2018-12-06 05:00:22 Re: Query with high planning time at version 11.1 compared versions 10.5 and 11.0
Previous Message Amit Langote 2018-12-06 02:19:24 Re: Query with high planning time at version 11.1 compared versions 10.5 and 11.0