Re: speeding up planning with partitions

From: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>
To: Floris Van Nee <florisvannee(at)Optiver(dot)com>, David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Imai Yoshikazu <yoshikazu_i443(at)live(dot)jp>, "jesper(dot)pedersen(at)redhat(dot)com" <jesper(dot)pedersen(at)redhat(dot)com>, "Imai, Yoshikazu" <imai(dot)yoshikazu(at)jp(dot)fujitsu(dot)com>, Amit Langote <amitlangote09(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: speeding up planning with partitions
Date: 2019-04-05 10:07:17
Message-ID: 0d27e0c2-0652-63e5-5ce2-5d8a399277d7@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2019/04/05 18:13, Floris Van Nee wrote:
> One unrelated thing I noticed (but I'm not sure if it's worth a separate email thread) is that the changed default of jit=on in v12 doesn't work very well with a large number of partitions at run-time, for which a large number get excluded at run-time. A query that has an estimated cost above jit_optimize_above_cost takes about 30 seconds to run (for a table with 1000 partitions), because JIT is optimizing the full plan. Without JIT it's barely 20ms (+400ms planning). I can give more details in a separate thread if it's deemed interesting.
>
> Planning Time: 411.321 ms
> JIT:
> Functions: 5005
> Options: Inlining false, Optimization true, Expressions true, Deforming true
> Timing: Generation 721.472 ms, Inlining 0.000 ms, Optimization 16312.195 ms, Emission 12533.611 ms, Total 29567.278 ms

I've noticed a similar problem but in the context of interaction with
parallel query mechanism. The planner, seeing that all partitions will be
scanned (after failing to prune with clauses containing CURRENT_TIMESTAMP
etc.), prepares a parallel plan (containing Parallel Append in this case).
As you can imagine, parallel query initialization (Gather+workers) will
take large amount of time relative to the time it will take to scan the
partitions that remain after pruning (often just one).

The problem in this case is that the planner is oblivious to the
possibility of partition pruning occurring during execution, which may be
common to both parallel query and JIT. If it wasn't oblivious, it
would've set the cost of pruning-capable Append such that parallel query
and/or JIT won't be invoked. We are going to have to fix that sooner or
later.

Thanks,
Amit

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andrey Borodin 2019-04-05 10:13:29 Re: Google Summer of Code: question about GiST API advancement project
Previous Message Amit Langote 2019-04-05 09:46:55 Re: Problem with default partition pruning