Re: Consider low startup cost in add_partial_path

From: James Coleman <jtc331(at)gmail(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Consider low startup cost in add_partial_path
Date: 2019-10-02 14:22:17
Message-ID: CAAaqYe_LUCw759Cvx_v1S1SrOOK+GCXWHirYqGAyTCyZurowNg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sat, Sep 28, 2019 at 7:21 PM James Coleman <jtc331(at)gmail(dot)com> wrote:
> Now the trick is to figure out a way to demonstrate it in test :)
>
> Basically we need:
> Path A: Can short circuit with LIMIT but has high total cost
> Path B: Can’t short circuit with LIMIT but has lower total cost
>
> (Both must be parallel aware of course.)

I'm adding one requirement, or clarifying it anyway: the above paths
must be partial paths, and can't just apply at the top level of the
parallel part of the plan. I.e., the lower startup cost has to matter
at a subtree of the parallel portion of the plan.

> Maybe ordering in B can be a sort node and A can be an index scan (perhaps with very high random page cost?) and force choosing a parallel plan?
>
> I’m trying to describe this to jog my thoughts (not in front of my laptop right now so can’t try it out).
>
> Any other ideas?

I've been playing with this a good bit, and I'm struggling to come up
with a test case. Because the issue only manifests in a subtree of the
parallel portion of the plan, a scan on a single relation won't do.
Merge join seems like a good area to look at because it requires
ordering, and that ordering can be either the result of an index scan
(short-circuit-able) or an explicit sort (not short-circuit-able). But
I've been unable to make that result in any different plans with
either 2 or 3 relations joined together, ordered, and a limit applied.

In all cases I've been starting with:

set enable_hashjoin = off;
set enable_nestloop = off;
set max_parallel_workers_per_gather = 4;
set min_parallel_index_scan_size = 0;
set min_parallel_table_scan_size = 0;
set parallel_setup_cost = 0;
set parallel_tuple_cost = 0;

I've also tried various combinations of random_page_cost,
cpu_index_tuple_cost, cpu_tuple_cost.

Interestingly I've noticed plans joining two relations that look like:

Limit
-> Merge Join
Merge Cond: (t1.pk = t2.pk)
-> Gather Merge
Workers Planned: 4
-> Parallel Index Scan using t_pkey on t t1
-> Gather Merge
Workers Planned: 4
-> Parallel Index Scan using t_pkey on t t2

Where I would have expected a Gather Merge above a parallelized merge
join. Is that reasonable to expect?

If there doesn't seem to be an obvious way to reproduce the issue
currently, but we know we have a reproduction example along with
incremental sort, what is the path forward for this? Is it reasonable
to try to commit it anyway knowing that it's a "correct" change and
been demonstrated elsewhere?

James

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Kyle Evans 2019-10-02 14:38:20 Re: My buildfarm member now giving permission denied
Previous Message Tom Lane 2019-10-02 14:11:52 Re: Is querying SPITupleTable with SQL possible?