Recovery mode with partitioned tables and ANY(NULL) in WHERE clause

From: Piotr Włodarczyk <piotrwlodarczyk89(at)gmail(dot)com>
To: pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Recovery mode with partitioned tables and ANY(NULL) in WHERE clause
Date: 2019-08-09 16:28:37
Message-ID: CAP-dhMr+vRpwizEYjUjsiZ1vwqpohTm+3Pbdt6Pr7FEgPq9R0Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

Hello everyone,

We're going to introduce in production environment solution based on table
partitioning.

I think we found BUG in version from 11.4 to 11.5 (maybe and previous).

We have one big table, let's say "book". To improve manage and maintenance
we've decided to partition the table in two partitions levels. First based
on ranges of one column and level on another one. After release our changes
on develop stage we've met unexpected behaviour: database randomly went
into recovery mode. After a little investigation we found query that cause
database crash. We've noticed before that in good practices in partitioning
was point saying about limit for one hundred of partitions. We have +/- 400
so we thought it can be problem. For example with memory or something like
that. We've tried to "explain" problematic query to show planner but
explain fails too (without analyze). Finally we've discovered that problem
occures likewise when we operate on empty tables with two ranged partitions
and one default on one level (probably error occurs too on smaller count of
partitions). Below reproduction path:

PostgreSQL 11.5 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5
20150623 (Red Hat 4.8.5-36), 64-bit, or
PostgreSQL 11.4 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5
20150623 (Red Hat 4.8.5-36), 64-bit, or
PostgreSQL 11.5, compiled by Visual C++ build 1914, 64-bit.

Table create:
CREATE TABLE book (
id SERIAL,
id_publishing_house BIGINT,
date_of_publish DATE
) PARTITION BY RANGE (id_publishing_house);

-- First level
CREATE TABLE book_default PARTITION OF book DEFAULT;
CREATE TABLE book_10000000 PARTITION OF book FOR VALUES FROM (1) TO
(10000001);
CREATE TABLE book_20000000 PARTITION OF book FOR VALUES FROM (10000001) TO
(20000001);

QUERY:
SELECT * FROM book WHERE id_publishing_house = ANY(NULL::BIGINT[]);

This query is crashing database. What is more interesting:

SELECT * FROM book WHERE id_publishing_house IN (NULL);

works fine.

Logs:

2019-08-09 18:02:29.560 CEST [13500] LOG: database system was shut down at
2019-08-09 18:02:20 CEST
2019-08-09 18:02:29.591 CEST [18256] LOG: database system is ready to
accept connections
2019-08-09 18:02:53.566 CEST [18256] LOG: server process (PID 2076) was
terminated by exception 0xC0000005
2019-08-09 18:02:53.566 CEST [18256] DETAIL: Failed process was running:
SELECT * FROM book WHERE id_publishing_house = ANY(NULL::BIGINT[]);
2019-08-09 18:02:53.566 CEST [18256] HINT: See C include file "ntstatus.h"
for a description of the hexadecimal value.
2019-08-09 18:02:53.566 CEST [18256] LOG: terminating any other active
server processes
2019-08-09 18:02:53.571 CEST [16400] WARNING: terminating connection
because of crash of another server process
2019-08-09 18:02:53.571 CEST [16400] DETAIL: The postmaster has commanded
this server process to roll back the current transaction and exit, because
another server process exited abnormally and possibly corrupted shared
memory.
2019-08-09 18:02:53.571 CEST [16400] HINT: In a moment you should be able
to reconnect to the database and repeat your command.
2019-08-09 18:02:53.594 CEST [4980] WARNING: terminating connection
because of crash of another server process
2019-08-09 18:02:53.594 CEST [4980] DETAIL: The postmaster has commanded
this server process to roll back the current transaction and exit, because
another server process exited abnormally and possibly corrupted shared
memory.
2019-08-09 18:02:53.594 CEST [4980] HINT: In a moment you should be able
to reconnect to the database and repeat your command.
2019-08-09 18:02:53.634 CEST [18256] LOG: all server processes terminated;
reinitializing
2019-08-09 18:02:53.756 CEST [14972] FATAL: the database system is in
recovery mode
2019-08-09 18:02:53.769 CEST [13816] LOG: database system was interrupted;
last known up at 2019-08-09 18:02:29 CEST
2019-08-09 18:02:53.778 CEST [15388] FATAL: the database system is in
recovery mode
2019-08-09 18:02:54.640 CEST [12560] FATAL: the database system is in
recovery mode
2019-08-09 18:02:54.904 CEST [18008] FATAL: the database system is in
recovery mode
2019-08-09 18:02:55.637 CEST [16120] FATAL: the database system is in
recovery mode
2019-08-09 18:02:55.965 CEST [13816] LOG: database system was not properly
shut down; automatic recovery in progress
2019-08-09 18:02:55.972 CEST [13816] LOG: redo starts at 0/168E160
2019-08-09 18:02:55.973 CEST [13816] LOG: redo done at 0/168E160
2019-08-09 18:02:56.037 CEST [18256] LOG: database system is ready to
accept connections

--

Pozdrawiam
Piotr Włodarczyk

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Tom Lane 2019-08-09 16:45:42 Re: Recovery mode with partitioned tables and ANY(NULL) in WHERE clause
Previous Message Merlin Moncure 2019-08-09 12:59:31 Re: BUG #15940: json_populate_recordset fails with ERROR: record type has not been registered