Re: BUG #17731: Server doesn't start after abnormal shutdown while creating unlogged tables

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: Karina Litskevich <litskevichkarina(at)gmail(dot)com>
Cc: pgsql-bugs(at)lists(dot)postgresql(dot)org, kyzevan23(at)mail(dot)ru
Subject: Re: BUG #17731: Server doesn't start after abnormal shutdown while creating unlogged tables
Date: 2023-05-01 07:33:31
Message-ID: ZE9rSxi0BCHfUH0x@paquier.xyz
Lists: pgsql-bugs

On Mon, Apr 24, 2023 at 03:59:38PM +0300, Karina Litskevich wrote:
> For unlogged tables and indexes init forks are created to simulate truncate on
> server startup. In StartupXLOG() every main fork, for which corresponding init
> fork exists, is deleted before replaying WAL, and then new main fork is created
> by copying init fork:
>
> ResetUnloggedRelations(UNLOGGED_RELATION_CLEANUP);
> ...
> PerformWalRecovery();
> ...
> ResetUnloggedRelations(UNLOGGED_RELATION_INIT);
>
> So if the main fork exists before WAL recovery but the init fork doesn't, and
> the init fork is created during recovery, we get this problem. The second
> ResetUnloggedRelations() call sees the just-created init fork and tries to
> create a main fork from it, expecting that the old main fork was already
> deleted by the first ResetUnloggedRelations() call, but it wasn't, because the
> main fork didn't have a corresponding init fork at that moment yet.
>
> If you try to start the server again, it will start successfully, as this time
> both the init and main forks will be present from the beginning.

So, from what I read, what you basically mean is a sequence like this:
1) create unlogged table.
2) drop it.
3) Stop the server in immediate mode before the next checkpoint has
had time to finish cleaning up the main fork still lying around. At
this point the server has the truncated main fork, but not the init
fork, as it has already been removed.
4) Restart server, recovery begins.
5) ResetUnloggedRelations(UNLOGGED_RELATION_CLEANUP) happens, sees
only what looks like a main fork, thinks there is nothing to do
because there is no init fork.
6) Begin WAL redo.
7) Replay the record that created the init fork.
8) Finish recovery.
9) ResetUnloggedRelations(UNLOGGED_RELATION_INIT) sees both the init
fork and the main fork. We would do a copy_file() from the init fork
to the main fork, which fails on EEXIST.
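
The sequence above can be sketched as a toy model. This is only an illustration of the control flow described in steps 4 to 9, not PostgreSQL code: the function names, the set-of-forks representation, and the record names "create_init" and "drop_all" are all made up for the example, and the exclusive-create behavior of the copy is assumed from the EEXIST failure mentioned above.

```python
import errno

# Toy model: a relation is represented by the set of fork names on disk.
# All names below are hypothetical; none of this is PostgreSQL's real code.

def reset_cleanup(forks):
    """First pass: remove the main fork only if an init fork exists."""
    if "init" in forks:
        forks.discard("main")

def reset_init(forks):
    """Second pass: copy init to main, refusing to overwrite an existing file."""
    if "init" in forks:
        if "main" in forks:
            raise FileExistsError(errno.EEXIST, "main fork already exists")
        forks.add("main")

def simulate_startup(forks, replayed_records):
    """Run the cleanup pass, replay toy WAL records in order, then the init pass."""
    reset_cleanup(forks)
    for record in replayed_records:
        if record == "create_init":
            forks.add("init")
        elif record == "drop_all":
            forks.clear()
    reset_init(forks)
    return forks

if __name__ == "__main__":
    try:
        # Only a stale main fork on disk; redo recreates the init fork.
        simulate_startup({"main"}, ["create_init"])
    except FileExistsError as exc:
        print("startup failed:", exc)
    # A second start sees both forks from the beginning and succeeds.
    print(simulate_startup({"main", "init"}, []))
```

In this model the first start fails because the cleanup pass had no init fork to act on, while a second start succeeds because both forks are present before the cleanup pass runs.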

Between points 7 and 8, there is something I am not really following,
though. The deletion of all the forks of an unlogged table should be
replayed as well until we reach consistency, no? At redo, the cleanup
of the forks is done when the COMMIT record of the transaction that
dropped the table is replayed, rather than delayed at checkpoint as a
sync request. Hence, the init fork previously created should not
exist to begin with, no? Am I missing something?
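
To make the question concrete, here is a toy sketch of the expectation, under the assumption that redo replays both the creation of the init fork and the COMMIT record of the dropping transaction. The record names and the replay() helper are hypothetical and only model the reasoning, not the actual redo routines.

```python
def replay(records, forks=None):
    """Apply toy WAL records in order and return the forks left on disk."""
    forks = set() if forks is None else set(forks)
    for kind in records:
        if kind == "create_init":
            # Redo of the record that created the init fork.
            forks.add("init")
        elif kind == "commit_drop":
            # Redo of the COMMIT of the DROP: all forks are removed.
            forks.clear()
        else:
            raise ValueError(f"unknown record kind: {kind}")
    return forks

# Stale main fork on disk, full replay through the COMMIT of the drop.
full = replay(["create_init", "commit_drop"], {"main"})
# Same starting point, but the COMMIT record is never replayed.
partial = replay(["create_init"], {"main"})

if __name__ == "__main__":
    print(sorted(full), sorted(partial))
```

If the COMMIT record is replayed, nothing is left for the final reset pass to trip over; the failing state only appears in this model when replay stops before that record.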
--
Michael
