How to make partitioning scale better for larger numbers of partitions

From: "Kato, Sho" <kato-sho(at)jp(dot)fujitsu(dot)com>
To: PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: How to make partitioning scale better for larger numbers of partitions
Date: 2018-07-13 02:58:53
Message-ID: 25C1C6B2E7BE044889E4FE8643A58BA963AA81D9@G01JPEXMBKW03
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

I benchmarked on a RANGE partitioned table with 1.1k leaf partitions and no sub-partitioned tables.
But, statement latencies on a partitioned table is much slower than on a non-partitioned table.

UPDATE latency is 210 times slower than a non-partitioned table.
SELECT latency is 36 times slower than a non-partitioned table.
Surprisingly INSERT latency is almost same.

Of course I'm sure table partitioning work well with up to a hundred partitions as written on the postgresql document.
But, my customer will use partitioned table with 1.1k leaf partitions.
So, we need to improve performance.

Any ideas?

The results of pgbench and perf are listed below.

pgbench results
---------------

transaction type: update.sql
scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
duration: 180 s
number of transactions actually processed: 648
latency average = 278.202 ms
tps = 3.594512 (including connections establishing)
tps = 3.594545 (excluding connections establishing)
statement latencies in milliseconds:
0.011 \set aid random(1, 1100 * 1)
0.004 \set delta random(-5000, 5000)
0.038 BEGIN;
277.005 UPDATE test.accounts SET abalance = abalance + :delta WHERE aid = :aid;
1.140 END;

transaction type: select.sql
scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
duration: 180 s
number of transactions actually processed: 19415
latency average = 9.281 ms
tps = 107.745068 (including connections establishing)
tps = 107.746067 (excluding connections establishing)
statement latencies in milliseconds:
0.800 \set aid random(1, 1100 * 1)
0.137 \set delta random(-5000, 5000)
1.351 BEGIN;
4.941 SELECT abalance FROM test.accounts WHERE aid = :aid;
2.052 END;

transaction type: insert.sql
scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
duration: 180 s
number of transactions actually processed: 31895
latency average = 5.654 ms
tps = 176.865541 (including connections establishing)
tps = 176.867086 (excluding connections establishing)
statement latencies in milliseconds:
2.083 \set aid random(1, 1100 * 1)
0.003 \set delta random(-5000, 5000)
0.029 BEGIN;
3.222 INSERT INTO test.accounts_history (aid, delta, mtime) VALUES (:aid, :delta, CURRENT_TIMESTAMP);
0.317 END;

Top 10 of perf report
------------

UPDATE:
21.33% postgres postgres [.] range_table_mutator
12.57% postgres postgres [.] AllocSetAlloc
4.97% postgres postgres [.] palloc
4.48% postgres postgres [.] make_one_rel
3.96% postgres postgres [.] lappend
2.74% postgres [kernel.kallsyms] [k] get_page_from_freelist
1.87% postgres postgres [.] setup_append_rel_array
1.68% postgres [kernel.kallsyms] [k] list_del
1.64% postgres [kernel.kallsyms] [k] __alloc_pages_nodemask
1.62% postgres [kernel.kallsyms] [k] unmap_vmas

SELECT:
14.72% postgres postgres [.] AllocSetAlloc
5.14% postgres postgres [.] hash_search_with_hash_value
4.23% postgres postgres [.] palloc
4.06% postgres postgres [.] MemoryContextAllocZeroAligned
2.61% postgres postgres [.] copyObjectImpl
2.34% postgres postgres [.] expression_tree_mutator
2.13% postgres [kernel.kallsyms] [k] _spin_lock
1.91% postgres postgres [.] lappend
1.59% postgres [kernel.kallsyms] [k] __link_path_walk
1.50% postgres postgres [.] set_rel_size

INSERT:
20.75% postgres postgres [.] hash_search_with_hash_value
6.03% postgres postgres [.] hash_any
4.88% postgres postgres [.] AllocSetAlloc
4.05% postgres postgres [.] LWLockRelease
4.05% postgres postgres [.] LWLockAcquire
3.27% postgres postgres [.] oid_cmp
3.06% postgres postgres [.] SearchCatCache
2.97% postgres postgres [.] LockReleaseAll
2.57% postgres postgres [.] pg_qsort
2.37% postgres postgres [.] hash_seq_search

The following is information on the environment used for the benchmark.

Server spec
-----------

Server has 16 cpu.
Memory size is 264GB.
Database directory is on SSD.

database tuning
---------------

 shared_buffers = 102GB
max_locks_per_transactions = 1000000

postgresql version
------------------

11beta2 + patch1 + patch2

patch1: Allow direct lookups of AppendRelInfo by child relid
commit 7d872c91a3f9d49b56117557cdbb0c3d4c620687

patch2: 0001-Speed-up-INSERT-and-UPDATE-on-partitioned-tables.patch
https://commitfest.postgresql.org/18/1690/

table definition
----------------

create table test.accounts(aid serial, abalance int) partition by range(aid));
create table test.accounts_history(id serial, aid int, delta int, mtime timestamp without time zone) partition by range(aid);

create table test.account_part_1 partition of test.accounts for values from (1) to (2);
create table test.account_part_2 partition of test.accounts for values from (2) to (3);
.
.
create table test.account_part_1100 partition of test.accounts for values from (1100) to (1101);

accounts_history is also partitioned in the same way.

There is only one data in each leaf partitions for UPDATE/SELECT benchmark.

regards,

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2018-07-13 03:00:01 Re: Concurrency bug in UPDATE of partition-key
Previous Message Andres Freund 2018-07-13 02:54:12 Re: requested timeline ... does not contain minimum recovery point ...