Re: On disable_cost

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Zhenghua Lyu <zlv(at)pivotal(dot)io>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: On disable_cost
Date: 2019-11-01 16:56:30
Message-ID: CA+TgmoY+Ltw7B=1FSFSN4yHcu2roWrz-ijBovj-99LZU=9h1dA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Nov 1, 2019 at 12:43 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
> Hm. That seems complicated. Is it clear that we'd always notice that we
> have no plan early enough to know which paths to reconsider? I think
> there's cases where that'd only happen a few levels up.

Yeah, there could be problems of that kind. I think if a baserel has
no paths, then we know right away that we've got a problem, but for
joinrels it might be more complicated.

> As a first step I'd be inclined to "just" adjust disable_cost up to
> something like 1.0e12. Unfortunately much higher and and we're getting
> into the area where the loss of precision starts to be significant
> enough that I'm not sure that we're always careful enough to perform
> math in the right order (e.g. 1.0e16 + 1 being 1.0e16, and 1e+20 + 1000
> being 1e+20). I've seen queries with costs above 1e10 where that costing
> wasn't insane.

We've done that before and we can do it again. But we're going to need
to have something better eventually, I think, not just keep kicking
the can down the road.

Another point to consider here is that in some cases we could really
just skip generating certain paths altogether. We already do this for
hash joins: if we're planning a join and enable_hashjoin is disabled,
we just don't generate hash joins paths at all, except for full joins,
where there might be no other legal method. As this example shows,
this cannot be applied in all cases, but maybe we could do it more
widely than we do today. I'm not sure how beneficial that technique
would be, though, because it doesn't seem like it's quite enough to
solve this problem by itself.

Yet another approach would be to divide the cost into two parts, a
"cost" component and a "violations" component. If two paths are
compared, the one with fewer violations always wins; if it's a tie,
they compare on cost. A path's violation count is the total of its
children, plus one for itself if it does something that's disabled.
This would be more principled than the current approach, but maybe
it's too costly.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Justin Pryzby 2019-11-01 16:58:43 Re: update ALTER TABLE with ATTACH PARTITION lock mode (docs)
Previous Message Onder Kalaci 2019-11-01 16:43:44 Re: Looking for a demo of extensible nodes