[Bug Fix] Logical replication on partition table is very slow and CPU is 99%

From: 赵锐 <875941708(at)qq(dot)com>
To: Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, Amit Langote <amitlangote09(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: [Bug Fix] Logical replication on partition table is very slow and CPU is 99%
Date: 2021-01-04 13:04:51
Message-ID: tencent_6E296D2F7D70AFC90D83353B69187C3AA507@qq.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Logical replication on partition table is very slow and CPU is 99%.

To reproduce this problem,

pg1 for publish

pgbench -i -s 1000

create table pgbench_accounts_copy(aid integer, bid integer, abalance integer, filler character(84)) partition by range (aid);

create table pgbench_accounts_copy_a0 partition of pgbench_accounts_copy for values from (0) to (10000000);

create table pgbench_accounts_copy_a1 partition of pgbench_accounts_copy for values from (10000000) to (20000000);

create table pgbench_accounts_copy_a2 partition of pgbench_accounts_copy for values from (20000000) to (30000000);

create table pgbench_accounts_copy_a3 partition of pgbench_accounts_copy for values from (30000000) to (40000000);

create table pgbench_accounts_copy_a4 partition of pgbench_accounts_copy for values from (40000000) to (50000000);

create table pgbench_accounts_copy_a5 partition of pgbench_accounts_copy for values from (50000000) to (60000000);

create table pgbench_accounts_copy_a6 partition of pgbench_accounts_copy for values from (60000000) to (70000000);

create table pgbench_accounts_copy_a7 partition of pgbench_accounts_copy for values from (70000000) to (80000000);

create table pgbench_accounts_copy_a8 partition of pgbench_accounts_copy for values from (80000000) to (90000000);

create table pgbench_accounts_copy_a9 partition of pgbench_accounts_copy for values from (90000000) to (100000000);

create publication my_publication for table pgbench_accounts_copy with (publish_via_partition_root = true);

pg2 for subscribe

create table pgbench_accounts_copy(aid integer, bid integer, abalance integer, filler character(84)) partition by range (aid);

create table pgbench_accounts_copy_a0 partition of pgbench_accounts_copy for values from (0) to (10000000);

create table pgbench_accounts_copy_a1 partition of pgbench_accounts_copy for values from (10000000) to (20000000);

create table pgbench_accounts_copy_a2 partition of pgbench_accounts_copy for values from (20000000) to (30000000);

create table pgbench_accounts_copy_a3 partition of pgbench_accounts_copy for values from (30000000) to (40000000);

create table pgbench_accounts_copy_a4 partition of pgbench_accounts_copy for values from (40000000) to (50000000);

create table pgbench_accounts_copy_a5 partition of pgbench_accounts_copy for values from (50000000) to (60000000);

create table pgbench_accounts_copy_a6 partition of pgbench_accounts_copy for values from (60000000) to (70000000);

create table pgbench_accounts_copy_a7 partition of pgbench_accounts_copy for values from (70000000) to (80000000);

create table pgbench_accounts_copy_a8 partition of pgbench_accounts_copy for values from (80000000) to (90000000);

create table pgbench_accounts_copy_a9 partition of pgbench_accounts_copy for values from (90000000) to (100000000);

create subscription sub1 CONNECTION 'host=xxxx port=xxxx user=postgres dbname=postgres' publication my_publication;

On pg1

insert into pgbench_accounts_copy select * from pgbench_accounts where aid&gt;0 and aid<100000000;

The replication will be very slow, it lasts for 5 days in my environment and will continue. What's more, the CPU is 99% but actually do nothing.&nbsp;

In fact, there is a hidden bug when replication on partition table. When we publish partition tables via root table, the reference of root table on current owner is added, but the decrement of reference is missed. When the reference is large enough, It takes so much time and CPU to re-hash and resolve hash collision.&nbsp;

This patch adds the missed decrement to resolve the problem.

Previous discussion is here:&nbsp;https://www.postgresql.org/message-id/flat/CA+HiwqH=Y85vRK3mOdjEkqFK+E=ST=eQiHdpj43L=_eJMOOznQ(at)mail(dot)gmail(dot)com
And I believe patch #83fd453 introduce this problem.

Thanks,

Mark Zhao

Attachment Content-Type Size
0001-add-missed-RelationClose-after-RelationIdGetRelation.patch application/octet-stream 2.5 KB

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amine Tengilimoglu 2021-01-04 13:12:34 Re: pg_rewind restore_command issue in PG12
Previous Message Bharath Rupireddy 2021-01-04 12:47:48 Re: Parallel Inserts in CREATE TABLE AS