Re: Consider low startup cost in add_partial_path

From: James Coleman <jtc331(at)gmail(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Consider low startup cost in add_partial_path
Date: 2019-09-28 23:21:33
Message-ID: CAAaqYe-jCAvfHTEt64XX0WZuZew+2VUdDvimp99icCU6f8s3mA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Saturday, September 28, 2019, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
wrote:

> On Sat, Sep 28, 2019 at 12:16:05AM -0400, Robert Haas wrote:
>
>> On Fri, Sep 27, 2019 at 2:24 PM James Coleman <jtc331(at)gmail(dot)com> wrote:
>>
>>> Over in the incremental sort patch discussion we found [1] a case
>>> where a higher cost plan ends up being chosen because a low startup
>>> cost partial path is ignored in favor of a lower total cost partial
>>> path and a limit is a applied on top of that which would normal favor
>>> the lower startup cost plan.
>>>
>>> 45be99f8cd5d606086e0a458c9c72910ba8a613d originally added
>>> `add_partial_path` with the comment:
>>>
>>> > Neither do we need to consider startup costs:
>>> > parallelism is only used for plans that will be run to completion.
>>> > Therefore, this routine is much simpler than add_path: it needs to
>>> > consider only pathkeys and total cost.
>>>
>>> I'm not entirely sure if that is still true or not--I can't easily
>>> come up with a scenario in which it's not, but I also can't come up
>>> with an inherent reason why such a scenario cannot exist.
>>>
>>
>> I think I just didn't think carefully about the Limit case.
>>
>>
> Thanks! In that case I suggest we treat it as a separate patch/fix,
> independent of the incremental sort patch. I don't want to bury it in
> that patch series, it's already pretty large.
>

Now the trick is to figure out a way to demonstrate it in test :)

Basically we need:
Path A: Can short circuit with LIMIT but has high total cost
Path B: Can’t short circuit with LIMIT but has lower total cost

(Both must be parallel aware of course.)

Maybe ordering in B can be a sort node and A can be an index scan (perhaps
with very high random page cost?) and force choosing a parallel plan?

I’m trying to describe this to jog my thoughts (not in front of my laptop
right now so can’t try it out).

Any other ideas?

James

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alvaro Herrera 2019-09-29 01:22:28 Re: [DOC] Document concurrent index builds waiting on each other
Previous Message Tom Lane 2019-09-28 23:10:42 Re: Possible bug: SQL function parameter in window frame definition