Re: False "pg_serial": apparent wraparound” in logs

From: "Imseih (AWS), Sami" <simseih(at)amazon(dot)com>
To: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: False "pg_serial": apparent wraparound” in logs
Date: 2023-09-29 23:16:03
Message-ID: CCEF0BF0-2A3E-45E5-934C-3DD3A9ED8C82@amazon.com
Lists: pgsql-hackers

> I don't really understand what exactly the problem is, or how this fixes
> it. But this doesn't feel right:

As the repro shows, false '"pg_serial": apparent wraparound' messages
are possible. On a very busy system that checkpoints frequently
and makes heavy use of serializable isolation, these messages will flood
the error logs and falsely alarm the user. The false detection also
prevents the SLRU from being truncated.

In my repro, I end up seeing the following message even though the SLRU
does not wrap around:

    LOG: could not truncate directory "pg_serial": apparent wraparound

> Firstly, isn't headPage == 0 also a valid value? We initialize headPage
> to -1 when it's not in use.

Yes. You are correct. This is wrong.

> Secondly, shouldn't we set it to the page corresponding to headXid
> rather than tailXid.

> Thirdly, I don't think this code should have any business setting
> latest_page_number directly. latest_page_number is set in
> SimpleLruZeroPage().

Correct, after checking again, I do realize the patch is wrong.

> Are we missing a call to SimpleLruZeroPage() somewhere?

That is a good point.

The initial idea was to advance latest_page_number
during SerialSetActiveSerXmin, but that approach is
obviously wrong.

When SerialSetActiveSerXmin is called for a new active
serializable xmin, and at that point we don't need to keep any
earlier transactions, should SimpleLruZeroPage be called
to ensure there is a target page for the xid?

I tried something like the patch below, which fixes my repro by calling
SimpleLruZeroPage at the end of SerialSetActiveSerXmin.

@@ -953,6 +953,8 @@ SerialGetMinConflictCommitSeqNo(TransactionId xid)
 static void
 SerialSetActiveSerXmin(TransactionId xid)
 {
+    int targetPage = SerialPage(xid);
+
     LWLockAcquire(SerialSLRULock, LW_EXCLUSIVE);

     /*
@@ -992,6 +994,9 @@ SerialSetActiveSerXmin(TransactionId xid)

     serialControl->tailXid = xid;

+    if (serialControl->headPage != targetPage)
+        SimpleLruZeroPage(SerialSlruCtl, targetPage);
+
     LWLockRelease(SerialSLRULock);
 }
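
To make the failure mode easier to reason about, here is a small standalone model of the check that, as far as I understand it, produces the message: truncation is refused when the SLRU's latest_page_number appears to logically precede the cutoff page. This is only a sketch, not the PostgreSQL source. All sketch_* names are hypothetical, the value of 1024 entries per page assumes the default 8 kB block size with 8-byte entries, and the comparison is a simplified circular one rather than the real page-precedes callback.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 8 kB block size / 8-byte commit sequence numbers. */
#define SKETCH_ENTRIES_PER_PAGE 1024u

/* Page holding the entry for a given xid (modeled on SerialPage()). */
static int
sketch_serial_page(uint32_t xid)
{
    return (int) (xid / SKETCH_ENTRIES_PER_PAGE);
}

/*
 * Simplified circular comparison: does page1 logically precede page2?
 * Each page is mapped back to its first xid, and the two are compared
 * modulo 2^32, in the spirit of TransactionIdPrecedes().
 */
static bool
sketch_page_precedes(int page1, int page2)
{
    uint32_t xid1 = (uint32_t) page1 * SKETCH_ENTRIES_PER_PAGE;
    uint32_t xid2 = (uint32_t) page2 * SKETCH_ENTRIES_PER_PAGE;

    return (int32_t) (xid1 - xid2) < 0;
}

/*
 * Model of the truncation guard: if the latest page looks older than the
 * cutoff page, the caller assumes wraparound and refuses to truncate.
 */
static bool
sketch_truncate_refused(int latest_page_number, int cutoff_page)
{
    return sketch_page_precedes(latest_page_number, cutoff_page);
}
```

In this model, a stale latest page of 0 combined with a cutoff page of 4 (for example, sketch_serial_page(5000)) makes sketch_truncate_refused() return true even though nothing wrapped around, while a latest page at or beyond the cutoff does not. That is why zeroing the target page, which advances latest_page_number, makes the message go away in my repro.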

Regards,

Sami
