Re: Caching query plan costs

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: Bruce Momjian <bruce(at)momjian(dot)us>, Andres Freund <andres(at)anarazel(dot)de>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Caching query plan costs
Date: 2018-09-03 20:36:34
Message-ID: 5e60d1fb-36b9-17b1-adaa-7922c60a4288@2ndquadrant.com
Lists: pgsql-hackers

On 09/03/2018 08:56 PM, Bruce Momjian wrote:
> On Mon, Sep 3, 2018 at 11:42:31AM -0700, Andres Freund wrote:
>> On September 3, 2018 11:33:35 AM PDT, Bruce Momjian <bruce(at)momjian(dot)us>
>> wrote:
>>> On Mon, Sep 3, 2018 at 01:30:33PM -0400, Tom Lane wrote:
>>>> Bruce Momjian <bruce(at)momjian(dot)us> writes:
>>>>> What if we globally or locally cache the _cost_ of plans, so we
>>>>> can consult that cache before planning and enable certain
>>>>> optimizations?
>>>>
>>>> But what would you use as cache key? And how's this help if we
>>>> haven't
>>>
>>> Uh, I assume we would do what pg_stat_statements does and remove the
>>> constants and hash that.
>>
>> That's not particularly cheap... Constants heavily influence planning
>> choices, so I don't think they actually could be removed.
>
> Oh.
>

Yeah, it doesn't really tell you the cost of the plan, because a single
query string may use vastly different plans for different constants.
Which is pretty much why we have pg_stat_plans.
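
For illustration (hypothetical table and constants): pg_stat_statements
normalizes both of the calls below into a single entry, so its
per-entry statistics can't distinguish the plans behind them:

    -- both collapse to one normalized entry,
    -- roughly "SELECT count(*) FROM t WHERE a = $1"
    SELECT count(*) FROM t WHERE a = 1;
    SELECT count(*) FROM t WHERE a = 2;

    SELECT queryid, query, calls
      FROM pg_stat_statements
     WHERE query LIKE '%FROM t WHERE a%';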

Imagine a query that matches 99% of the table for one value and 1% for
another one. That's going to produce rather different plans for each
(say, seqscan vs. index scan), with very different costs.
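
A minimal sketch of that (hypothetical skewed table; the exact plans
depend on the statistics):

    -- t(a int) with an index, where a = 1 covers ~99% of the rows
    CREATE TABLE t AS
      SELECT CASE WHEN random() < 0.99 THEN 1 ELSE 2 END AS a
        FROM generate_series(1, 1000000);
    CREATE INDEX ON t (a);
    ANALYZE t;

    EXPLAIN SELECT * FROM t WHERE a = 1; -- likely seqscan, high cost
    EXPLAIN SELECT * FROM t WHERE a = 2; -- likely index/bitmap scan, low cost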

>>>> seen a similar query before in the session?
>>>
>>> Well, if it was global we could use output from another session.
>>>
>>> I guess my point is that this would only be used to turn on
>>> micro-optimizations and maybe parallelism
>>
>> What kind of micro opts are you thinking of? The cases I remember
>> are more in the vein of doing additional complex optimizations (join
>> removal, transforming ORs into UNION, more complex analysis of
>> predicates...).
>>
>> Parallelism would definitely benefit from earlier knowledge, although
>> I suspect some base rel analysis might be more realistic, because it's
>> far from guaranteed that queries are ever repeated in a similar enough
>> manner.
>
> Yes, no question that we would need something that could detect a
> sufficient percentage of previous queries.
>
>>> and JIT, so it doesn't have to be 100% accurate.
>>
>> The JIT decision is made after main planning, so we know the cost.
>
> Well, as I remember, we are considering disabling JIT in PG 11 because
> of the use of fixed costs to trigger it. Could executor information
> help decide to use JIT?
>

Isn't that issue more about what the right default threshold is, rather
than about using fixed costs in principle?
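
For reference, the trigger in PG 11 is the jit_above_cost family of
GUCs, compared against the total plan cost once planning is done:

    SHOW jit_above_cost;          -- default 100000: JIT-compile expressions
    SHOW jit_inline_above_cost;   -- default 500000: also inline functions
    SHOW jit_optimize_above_cost; -- default 500000: also optimize aggressively

    -- e.g. raising the threshold for one session:
    SET jit_above_cost = 500000;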

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
