Re: BUG #17731: Server doesn't start after abnormal shutdown while creating unlogged tables

From: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
To: litskevichkarina(at)gmail(dot)com
Cc: pgsql-bugs(at)lists(dot)postgresql(dot)org, kyzevan23(at)mail(dot)ru
Subject: Re: BUG #17731: Server doesn't start after abnormal shutdown while creating unlogged tables
Date: 2023-04-25 03:33:32
Message-ID: 20230425.123332.1657429858787993644.horikyota.ntt@gmail.com
Lists: pgsql-bugs

At Mon, 24 Apr 2023 15:59:38 +0300, Karina Litskevich <litskevichkarina(at)gmail(dot)com> wrote in
> So in case the main fork exists and the init fork doesn't before WAL recovery,
> and the init fork is created during recovery, we get this problem. The second
> ResetUnloggedRelations() call sees the just-created init fork and tries to create a
> main fork from it, expecting that the old main fork was already deleted by the
> first ResetUnloggedRelations() call, but it wasn't, because the main fork had no
> corresponding init fork at that moment yet.

Seems right.
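
To make the sequence above concrete, here is a toy model of it in
Python. It is only a sketch under stated assumptions, not the server
code: a data directory is modeled as a set of file names, "<rel>_init"
stands for an init fork, and the names cleanup_pass, init_pass,
crash_recovery and MainForkExists are all hypothetical. The one
behavior it assumes from the server is that the second pass refuses to
create a main fork that already exists.

```python
class MainForkExists(Exception):
    """Raised when the init pass finds a main fork already in place."""


def cleanup_pass(files):
    """Toy model of the first reset pass (before WAL replay):
    drop every main fork that currently has an init fork."""
    suffix = "_init"
    init_rels = {n[:-len(suffix)] for n in files if n.endswith(suffix)}
    return {n for n in files if n not in init_rels}


def init_pass(files):
    """Toy model of the second reset pass (after WAL replay):
    create a main fork from each init fork, failing if one exists."""
    suffix = "_init"
    result = set(files)
    for name in sorted(files):
        if name.endswith(suffix):
            main = name[:-len(suffix)]
            if main in result:
                # Assumption: the copy does not overwrite an existing file.
                raise MainForkExists(main)
            result.add(main)
    return result


def crash_recovery(files_on_disk, init_forks_created_by_replay):
    """Run both passes with a simulated WAL replay in between."""
    files = cleanup_pass(files_on_disk)
    files |= set(init_forks_created_by_replay)  # replay creates init forks
    return init_pass(files)
```

With both forks on disk before recovery, crash_recovery({"16384",
"16384_init"}, []) succeeds, because the first pass removes the old
main fork. With only the main fork on disk and the init fork created
by replay, crash_recovery({"16384"}, ["16384_init"]) raises
MainForkExists, which corresponds to the reported failure.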

> Theoretically, this applies to all versions, but the script somehow doesn't lead
> to an error on REL_11_STABLE. I haven't investigated it yet.
>
> I see two solutions: 1) keep init fork files until the next checkpoint as well
> as main fork files, 2) ignore (rewrite if exists) presence of an empty main
> fork file when copying from init fork. I found the latter less elegant so I
> implemented the first one. The patch is attached.

The init-fork related code has some other issues with crash-restart. A
minor one is that a crash of the transaction creating an unlogged
relation leaves orphaned init fork files. I haven't fully chased down
the specific issue raised here, but I think the common cause in these
cases is that the file operations around unlogged files are not fully
transactional. There is a proposed patchset [1], the first patch of
which makes storage file creation and deletion transactional and
crash-safe. As far as I can see, it seems to fix this case, too.

The latest version of it that was posted to this ML [2] needs a rebase
and some fixes for now, though. (I'll post a rebased version soon.)

As for the proposed patch, I haven't looked closely, but I don't think
delaying init-file removal is the right approach. The reason for the
delay, as mentioned, is that someone might be accessing the file
(causing deletion failure on some platforms). Init-fork files don't
fall into that category.

regards.

[1] https://commitfest.postgresql.org/43/3461/

[2] https://www.postgresql.org/message-id/20230317.151634.1038632016265639446.horikyota.ntt%40gmail.com

--
Kyotaro Horiguchi
NTT Open Source Software Center
