|From:||"Imai, Yoshikazu" <imai(dot)yoshikazu(at)jp(dot)fujitsu(dot)com>|
|To:||'Amit Langote' <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>|
|Cc:||David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>|
|Subject:||RE: speeding up planning with partitions|
|Views:||Raw Message | Whole Thread | Download mbox|
I ran the performance tests for no prepared query and for prepared query with
plan_cache_mode='auto' and plan_cache_mode='force_custom_plan'. I also changed
number of partitions as 256 or 4096. I ran the tests on master and v9-patched.
plan_cache_mode = 'auto' or 'force_custom_plan'
max_parallel_workers = 0
max_parallel_workers_per_gather = 0
max_locks_per_transaction = 4096
[partitioning table definitions(with 4096 partitions)]
create table rt (a int, b int, c int) partition by range (a);
select 'create table rt' || x::text || ' partition of rt for values from (' ||
(x)::text || ') to (' || (x+1)::text || ');' from generate_series(1, 4096) x;
[pgbench(with 4096 partitions)]
no prepared: pgbench -n -f select4096.sql -T 60
prepared: pgbench -n -f select4096.sql -T 60 -M prepared
\set a random(1, 4096)
select a from rt where a = :a;
part_num no-prepared auto force_custom_plan (1-auto/force_custom_plan)
256 604 571 576 0.01
4096 17.5 17.5 15.1 -0.16
part_num no-prepared auto force_custom_plan
256 8614 9446 9384 -0.006
4096 7158 7165 7864 0.089
There are almost no difference between auto and force_custom_plan with 256
partitions, but there are difference between auto and force_custom_plan with
4096 partitions. While auto is faster than force_custom_plan on master,
force_custom_plan is faster than auto on patched.
I wonder why force_custom_plan is faster than auto after applied the patch.
When we use PREPARE-EXECUTE, a generic plan is created and used if its cost is
cheaper than creating and using a custom plan with plan_cache_mode='auto',
while a custom plan is always created and used with plan_cache_mode='force_custom_plan'.
So one can think the difference in above results is because of creating or
using a generic plan.
I checked how many times a generic plan is created during executing pgbench and
found a generic plan is created only once and custom plans are created at other
times with plan_cache_mode='auto'. I also checked the time of creating a
generic plan, but it didn't take so much(250ms or so with 4096 partitions). So
the time of creating a generic plan does not affect the performance.
Currently, a generic plan is created at sixth time of executing EXECUTE query.
I changed it to more later (ex. at 400,000th time of executing EXECUTE query on
master with 4096 partitions, because 7000TPS x 60sec=420,0000 transactions are
run while executing pgbench.), then there are almost no difference between auto
and force_custom_plan. I think that creation of a generic plan affects the time
of executing queries which are ordered after creating generic plan.
If my assumption is right, we can expect some additional process is occurred at
executing queries ordered after creating a generic plan, which results in auto is
slower than force_custom_plan because of additional process. But looking at
above results, on master with 4096 partitions, auto is faster than force_custom_plan.
So now I am confused.
Do you have any ideas what does affect the performance?
|Next Message||Kyotaro HORIGUCHI||2019-01-09 02:55:57||Re: Improve selectivity estimate for range queries|
|Previous Message||Noah Misch||2019-01-09 01:44:30||Re: emacs configuration for new perltidy settings|