BUG #16260: Repetitive accessing to partitioned table inside transaction causes server process crash

From: PG Bug reporting form <noreply(at)postgresql(dot)org>
To: pgsql-bugs(at)lists(dot)postgresql(dot)org
Cc: mingjuwu0505(at)gmail(dot)com
Subject: BUG #16260: Repetitive accessing to partitioned table inside transaction causes server process crash
Date: 2020-02-16 09:15:40
Message-ID: 16260-ceaef56a5acbf449@postgresql.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

The following bug has been logged on the website:

Bug reference: 16260
Logged by: Michael
Email address: mingjuwu0505(at)gmail(dot)com
PostgreSQL version: 12.2
Operating system: Mint Linux 18.3
Description:

Hello,

Accessing to a partitioned table with large partitions (20K) inside a
transaction can cause postgres to use large amount of memory (>30GB) and
ultimately leading to server process crash. The system that I am using has
48GB physical memory and 72GB swap.

Setup and stress testings steps are attached below. With default
configuration of "max_locks_per_transaction", it will show:

ERROR: out of shared memory
HINT: You might need to increase max_locks_per_transaction.

After increasing "max_locks_per_transaction" to 6400, while stress() is
running, the postgres process will continue to consume more memory.
Ultimately, in pgAdmin 4, console will show "Connection to the server has
been lost." The following messages can be found in postgres log:

2020-02-16 17:01:35.449 CST [22652] LOG: server process (PID 23149) was
terminated by signal 9: Killed
2020-02-16 17:01:35.449 CST [22652] DETAIL: Failed process was running:
call stress();
2020-02-16 17:01:35.449 CST [22652] LOG: terminating any other active
server processes
2020-02-16 17:01:35.458 CST [22659] WARNING: terminating connection because
of crash of another server process
2020-02-16 17:01:35.458 CST [22659] DETAIL: The postmaster has commanded
this server process to roll back the current transaction and exit, because
another server process exited abnormally and possibly corrupted shared
memory.
2020-02-16 17:01:35.458 CST [22659] HINT: In a moment you should be able to
reconnect to the database and repeat your command.
2020-02-16 17:01:35.487 CST [22664] postgres(at)17ptt_test_db WARNING:
terminating connection because of crash of another server process
2020-02-16 17:01:35.487 CST [22664] postgres(at)17ptt_test_db DETAIL: The
postmaster has commanded this server process to roll back the current
transaction and exit, because another server process exited abnormally and
possibly corrupted shared memory.
2020-02-16 17:01:35.487 CST [22664] postgres(at)17ptt_test_db HINT: In a
moment you should be able to reconnect to the database and repeat your
command.
2020-02-16 17:01:35.491 CST [22652] LOG: all server processes terminated;
reinitializing
2020-02-16 17:01:35.652 CST [4757] LOG: database system was interrupted;
last known up at 2020-02-16 16:56:26 CST
2020-02-16 17:01:35.939 CST [4758] postgres(at)17ptt_test_db FATAL: the
database system is in recovery mode
2020-02-16 17:01:35.940 CST [4759] postgres(at)17ptt_test_db FATAL: the
database system is in recovery mode
2020-02-16 17:01:38.622 CST [4757] LOG: database system was not properly
shut down; automatic recovery in progress
2020-02-16 17:01:38.630 CST [4757] LOG: redo starts at 9/D5353028
2020-02-16 17:01:38.630 CST [4757] LOG: invalid record length at
9/D5353110: wanted 24, got 0
2020-02-16 17:01:38.630 CST [4757] LOG: redo done at 9/D53530D8
2020-02-16 17:01:38.684 CST [22652] LOG: database system is ready to accept
connections

In the stress() procedure, replacing DELETE with UPDATE statement has same
effect.

Best Regards,
Michael

---------
-- SETUP
---------

CREATE TABLE test_table (
id bigint NOT NULL,
category bigint NOT NULL
) PARTITION BY LIST (category);

CREATE OR REPLACE PROCEDURE create_test_table_paritions()
AS $$
DECLARE
counter INTEGER := 1;
BEGIN
WHILE counter <= 20000 LOOP
EXECUTE 'CREATE TABLE test_table_' || counter || ' PARTITION OF test_table
FOR VALUES IN (' || counter || ')';
counter := counter + 1;
END LOOP;
END
$$
LANGUAGE plpgsql;

call create_test_table_paritions();

--------------
-- STRESS TEST
--------------

CREATE OR REPLACE PROCEDURE stress()
AS $$
DECLARE
counter INTEGER := 1;
BEGIN
WHILE counter <= 20000 LOOP
DELETE FROM test_table WHERE category = counter;
raise notice '%', counter;
counter := counter + 1;
END LOOP;
END
$$
LANGUAGE plpgsql;

call stress();

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Tom Lane 2020-02-16 17:21:17 Re: BUG #16259: Cannot Use "pg_ctl start -l logfile" on Clean Install on Windows Server 2012/2016
Previous Message Heath Lord 2020-02-16 00:23:11 Re: BUG #16259: Cannot Use "pg_ctl start -l logfile" on Clean Install on Windows Server 2012/2016