improve the algorithm cached_plan_cost with real planning time?

From: Andy Fan <zhihui(dot)fan1213(at)gmail(dot)com>
To: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: improve the algorithm cached_plan_cost with real planning time?
Date: 2020-10-15 13:12:19
Message-ID: CAKU4AWpYN-SFChsyLgAq7--5sZ1ZSKxi4nJUxcfqCa0OjvY_iw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

In cached_plan_cost, we do consider the cost of planning, with the following
algorithm.

int nrelations = list_length(plannedstmt->rtable);

result += 1000.0 * cpu_operator_cost * (nrelations + 1);

I run into a case where 10 relations are joined, 3 of them have
hundreds of partitions. at last nrelations = 421 for this case.

| Plan Type | Estimate Cost | Real Execution Time(ms) | Real Planning
Time(ms) |
| Custom Plan | 100867.52 | 13 | 665.816
|
| Generic Plan | 104941.86 | 33(ms) | 0.76 (used
cached plan) |

At last, it chooses the custom plan all the time. so the final performance
is
678ms+, however if it chooses the generic plan, it is 34ms in total. It
looks
to me that the planning cost is estimated improperly.

Since we do know the planning time exactly for a custom plan when we call
cached_plan_cost, if we have a way to convert the real timing to cost, then
we
probably can fix this issue.

The cost unit is seq_page_scan, looks we know the latency of seq_page
read, we can build such mapping. however, the correct seq_page_cost
detection needs we clear file system cache at least which is
something we can't do in pg kernel[1]. So any suggestion on this topic?

note that both plans have no plan time partition prune and have run time
partition prune, so the issue at [2] probably doesn't impact this.

[1]
https://www.postgresql.org/message-id/flat/20191127164821.lspxyrf3c5r6zu5n%40development#cf34e9db80326709af892ac64bc4cb45

[2]
https://www.postgresql.org/message-id/CAKU4AWqUJmQdu9qf_pXxBYETkiXhTaXAQ_qtX7wxeLw27phdOw@mail.gmail.com

--
Best Regards
Andy Fan

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Langote 2020-10-15 14:01:23 Re: Parallel Append can break run-time partition pruning
Previous Message Alvaro Herrera 2020-10-15 12:52:21 Re: pgsql: Restore replication protocol's duplicate command tags