Re: Parameter for planner estimate of recursive queries

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Simon Riggs <simon(dot)riggs(at)enterprisedb(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Parameter for planner estimate of recursive queries
Date: 2022-03-23 17:36:29
Message-ID: 2770446.1648056989@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> On Tue, Jan 25, 2022 at 4:44 AM Peter Eisentraut
> <peter(dot)eisentraut(at)enterprisedb(dot)com> wrote:
>> On the one hand, this smells like a planner hint. But on the other
>> hand, it doesn't look like we will come up with proper graph-aware
>> selectivity estimation system any time soon, so just having all graph
>> OLTP queries suck until then because the planner hint is hardcoded
>> doesn't seem like a better solution. So I think this setting can be ok.

> I agree. It's a bit lame, but seems pretty harmless, and I can't see
> us realistically doing a lot better with any reasonable amount of
> work.

Yeah, agreed on all counts. The thing that makes it lame is that
there's no reason to expect that the same multiplier is good for
every recursive query done in an installation, or even in a session.

One could imagine dealing with that by adding custom syntax to WITH,
as we have already done once:

WITH RECURSIVE cte1 AS SCALE 1.0 (SELECT ...

But I *really* hesitate to go there, mainly because once we do
something like that we can't ever undo it. I think Simon's
proposal is a reasonable low-effort compromise.

Some nitpicks:

* The new calculation needs clamp_row_est(), since the float
GUC could be fractional or even zero.

* Do we want to prevent the GUC value from being zero? It's not
very sensible, plus I think we might want to reserve that value
to mean "use the built-in calculation", in case we ever do put
in some smarter logic here. But I'm not sure what a reasonable
non-zero lower bound would be.

* The proposed docs claim that a smaller setting works by biasing
the planner towards fast-start plans, but I don't think I believe
that explanation. I'd venture that we want text more along the
lines of "This may help the planner choose the most appropriate
method for joining the work table to the query's other tables".

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2022-03-23 17:42:03 Re: shared-memory based stats collector - v67
Previous Message Andres Freund 2022-03-23 17:35:37 ubsan