replay of CREATE TABLESPACE eats data at wal_level=minimal

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Noah Misch <noah(at)leadboat(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: replay of CREATE TABLESPACE eats data at wal_level=minimal
Date: 2021-08-09 17:08:42
Message-ID: CA+TgmoaLO9ncuwvr2nN-J4VEP5XyAcy=zKiHxQzBbFRxxGxm0w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

To reproduce, initialize a cluster with wal_level=minimal and
max_wal_senders=0. Then from psql:

\! mkdir /tmp/goose

CHECKPOINT;
CREATE TABLESPACE goose LOCATION '/tmp/goose';
SET wal_skip_threshold=0;
BEGIN;
CREATE TABLE wild (a int, b text) TABLESPACE goose;
INSERT INTO wild VALUES (1, 'chase');
COMMIT;
SELECT * FROM wild;

As expected, you will see one row in table 'wild'. Now perform an
immediate shutdown. Restart the server. Table 'wild' is now empty. The
problem appears to be that tblspc_redo() calls
create_tablespace_directories(), which says:

/*
* Our theory for replaying a CREATE is to forcibly drop the target
* subdirectory if present, and then recreate it. This may be more
* work than needed, but it is simple to implement.
*/

Unfortunately, this theory (which dates to
c86f467d18aa58e18fd85b560b46d8de014e6017, vintage 2010, by Bruce) is
correct only with wal_level>minimal. At wal_level='minimal', we can
replay the record to recreate the relfilenode, but not the records
that would have created the contents. However, note that if the table
is smaller than wal_skip_threshold, then we'll log full-page images of
the contents at commit time even at wal_level='minimal' after which we
have no problem. As far as I can see, this bug has "always" existed,
but before c6b92041d38512a4176ed76ad06f713d2e6c01a8 (2020, Noah) you
would have needed a different test case. Specifically, you would have
needed to use COPY to put the row in the table, and you would have
needed to omit setting wal_skip_threshold since it didn't exist yet.

I don't presently have a specific idea about how to fix this.

--
Robert Haas
EDB: http://www.enterprisedb.com

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alvaro Herrera 2021-08-09 17:30:00 Re: Worth using personality(ADDR_NO_RANDOMIZE) for EXEC_BACKEND on linux?
Previous Message kuroda.hayato@fujitsu.com 2021-08-09 17:01:57 RE: ECPG bug fix: DECALRE STATEMENT and DEALLOCATE, DESCRIBE