Quick Links

Re: improve the algorithm cached_plan_cost with real planning time?

From:	Andy Fan <zhihui(dot)fan1213(at)gmail(dot)com>
To:	PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject:	Re: improve the algorithm cached_plan_cost with real planning time?
Date:	2020-10-15 19:18:19
Message-ID:	CAKU4AWod3-Dn80dvfhoejuWobJqPgYGmB6kkpC792=eHfP2QOg@mail.gmail.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

On Thu, Oct 15, 2020 at 9:12 PM Andy Fan <zhihui(dot)fan1213(at)gmail(dot)com> wrote:

>
> In cached_plan_cost, we do consider the cost of planning, with the
> following
> algorithm.
>
> int nrelations = list_length(plannedstmt->rtable);
>
> result += 1000.0 * cpu_operator_cost * (nrelations + 1);
>
> I run into a case where 10 relations are joined, 3 of them have
> hundreds of partitions. at last nrelations = 421 for this case.
>
> | Plan Type | Estimate Cost | Real Execution Time(ms) | Real Planning
> Time(ms) |
> | Custom Plan | 100867.52 | 13 | 665.816
> |
> | Generic Plan | 104941.86 | 33(ms) | 0.76 (used
> cached plan) |
>
> At last, it chooses the custom plan all the time. so the final performance
> is
> 678ms+, however if it chooses the generic plan, it is 34ms in total. It
> looks
> to me that the planning cost is estimated improperly.
>
> Since we do know the planning time exactly for a custom plan when we call
> cached_plan_cost, if we have a way to convert the real timing to cost,
> then we
> probably can fix this issue.
>
> The cost unit is seq_page_scan, looks we know the latency of seq_page
> read, we can build such mapping. however, the correct seq_page_cost
> detection needs we clear file system cache at least which is
> something we can't do in pg kernel[1]. So any suggestion on this topic?
>

One of the simplest methods might be to just add a new GUC
seq_page_latency to the user (and we can also provide tools to user
to detect their IO latency [1]) If user set seq_page_latency, then we can
do the timing to cost translation. I got the seq_page_latency = 8us on
my local SSD environment before, if the above real case have similar
number, then the planning cost should be 83227 while the current
algorithm sets it to 1055. 83227 in this case is big enough to choose
the generic plan.

[1]
https://www.postgresql.org/message-id/flat/20191127164821.lspxyrf3c5r6zu5n%40development#cf34e9db80326709af892ac64bc4cb45

--
Best Regards
Andy Fan

In response to

improve the algorithm cached_plan_cost with real planning time? at 2020-10-15 13:12:19 from Andy Fan

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Andy Fan	2020-10-15 19:36:30	Re: Partition prune with stable Expr
Previous Message	Pavel Stehule	2020-10-15 18:51:46	Re: plan cache doesn't clean plans with references to dropped procedures