Re: Caching query plan costs

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: Bruce Momjian <bruce(at)momjian(dot)us>, Andres Freund <andres(at)anarazel(dot)de>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Caching query plan costs
Date: 2018-09-03 20:36:34
Message-ID: 5e60d1fb-36b9-17b1-adaa-7922c60a4288@2ndquadrant.com
Lists: pgsql-hackers

On 09/03/2018 08:56 PM, Bruce Momjian wrote:
> On Mon, Sep 3, 2018 at 11:42:31AM -0700, Andres Freund wrote:
>> On September 3, 2018 11:33:35 AM PDT, Bruce Momjian <bruce(at)momjian(dot)us>
>> wrote:
>>> On Mon, Sep 3, 2018 at 01:30:33PM -0400, Tom Lane wrote:
>>>> Bruce Momjian <bruce(at)momjian(dot)us> writes:
>>>>> What if we globally or locally cache the _cost_ of plans, so we
>>>>> can consult that cache before planning and enable certain
>>>>> optimizations?
>>>>
>>>> But what would you use as cache key? And how's this help if we
>>>> haven't
>>>
>>> Uh, I assume we would do what pg_stat_statements does and remove the
>>> constants and hash that.
>>
>> That's not particularly cheap... Constants heavily influence planning
>> choices, so I don't think they actually could be removed.
>
> Oh.
>

Yeah, it doesn't really tell you the cost of the plan, because a single
query string may use vastly different plans for different constants.
Which is pretty much why we have pg_stat_plans.
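
For illustration (hypothetical table and constants): pg_stat_statements
normalizes both of the calls below into a single entry, so its
per-entry statistics can't distinguish the plans behind them:

    -- both collapse to one normalized entry,
    -- roughly "SELECT count(*) FROM t WHERE a = $1"
    SELECT count(*) FROM t WHERE a = 1;
    SELECT count(*) FROM t WHERE a = 2;

    SELECT queryid, query, calls
      FROM pg_stat_statements
     WHERE query LIKE '%FROM t WHERE a%';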

Imagine a query that matches 99% of the table for one value and 1% for
another one. That's going to produce rather different plans for each
(say, seqscan vs. index scan), with very different costs.
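
A minimal sketch of that (hypothetical skewed table; the exact plans
depend on the statistics):

    -- t(a int) with an index, where a = 1 covers ~99% of the rows
    CREATE TABLE t AS
      SELECT CASE WHEN random() < 0.99 THEN 1 ELSE 2 END AS a
        FROM generate_series(1, 1000000);
    CREATE INDEX ON t (a);
    ANALYZE t;

    EXPLAIN SELECT * FROM t WHERE a = 1; -- likely seqscan, high cost
    EXPLAIN SELECT * FROM t WHERE a = 2; -- likely index/bitmap scan, low cost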

>>>> seen a similar query before in the session?
>>>
>>> Well, if it was global we could use output from another session.
>>>
>>> I guess my point is that this would only be used to turn on
>>> micro-optimizations and maybe parallelism
>>
>> What kind of micro opts are you thinking of? The cases I remember
>> are more in the vein of doing additional complex optimizations (join
>> removal, transforming ORs into UNION, more complex analysis of
>> predicates...).
>>
>> Parallelism would definitely benefit from earlier knowledge, although
>> I suspect some base rel analysis might be more realistic, because it's
>> far from guaranteed that queries are ever repeated in a similar enough
>> manner.
>
> Yes, no question that we would need something that could detect a
> sufficient percentage of previous queries.
>
>>> and JIT, so it doesn't have to be 100% accurate.
>>
>> The JIT decision is made after main planning, so we know the cost.
>
> Well, as I remember, we are considering disabling JIT in PG 11 because
> of the use of fixed costs to trigger it. Could executor information
> help decide to use JIT?
>

Isn't that issue more about what the right default threshold is, rather
than about using fixed costs in principle?
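
For reference, the trigger in PG 11 is the jit_above_cost family of
GUCs, compared against the total plan cost once planning is done:

    SHOW jit_above_cost;          -- default 100000: JIT-compile expressions
    SHOW jit_inline_above_cost;   -- default 500000: also inline functions
    SHOW jit_optimize_above_cost; -- default 500000: also optimize aggressively

    -- e.g. raising the threshold for one session:
    SET jit_above_cost = 500000;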

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
