Re: The same 2PC data maybe recovered twice

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: "CAI, Mengjuan" <mengjuan(dot)cmj(at)alibaba-inc(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: The same 2PC data maybe recovered twice
Date: 2025-07-14 08:14:43
Message-ID: aHS8c8-dPyTLijOW@paquier.xyz
Lists: pgsql-bugs pgsql-hackers

On Mon, Jul 14, 2025 at 02:46:25PM +0800, CAI, Mengjuan wrote:
> Thank you for your reply. I reviewed the thread you mentioned, and
> it seems that the issue needing to be fixed is not the same as the
> one I previously raised.
> I am considering whether the 2PC file check logic in
> PrepareRedoAdd() can be modified. Currently, each time a
> XLOG_XACT_PREPARE WAL entry is replayed, it checks for the
> corresponding 2PC file using access(). Each access operation creates
> a dentry in the OS, and in most cases, the file being accessed does
> not exist. When there are many 2PC transactions, this logic may lead
> to an increase in OS slab memory. In worse scenarios, it could cause
> the reference count of the parent directory's dentry to overflow,
> potentially leading to an OS crash. Typically, when accessing
> existing files, the disk will fill up before the dentry reference
> count overflows.
> Therefore, I would like to propose a modification. Attached is my
> patch for your review, and I hope you can take a look at it.

@@ -2520,8 +2520,16 @@ PrepareRedoAdd(FullTransactionId fxid, char *buf,
[...]
- if (!XLogRecPtrIsInvalid(start_lsn))
+ if (!XLogRecPtrIsInvalid(start_lsn) && !reachedConsistency)

Actually, what you are doing looks incorrect, because we could miss some
ERRORs, for example if a base backup was incorrect and some files were
already present in pg_twophase.

It's not really true that what you are changing here has no
interaction with the beginning of recovery. The other thread is about
the fact that reading the 2PC files from disk when !reachedConsistency
is a bad concept that we should avoid, and that impacts the assumption
the code path you are changing here relies on. In the end, it may be
possible to remove this check entirely.
--
Michael
