Re: Track Oldest Initialized WAL Buffer Page

From: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Nathan Bossart <nathandbossart(at)gmail(dot)com>
Subject: Re: Track Oldest Initialized WAL Buffer Page
Date: 2023-07-03 13:27:27
Lists: pgsql-hackers

On 07/02/2023 16:00, Bharath Rupireddy wrote:
> Hi,
> While working on [1], I was looking for a quick way to tell if a WAL
> record is present in the WAL buffers array without scanning but I
> couldn't find one.

/* The end-ptr of the page that contains the record */
expectedEndPtr = recptr + (XLOG_BLCKSZ - recptr % XLOG_BLCKSZ);

/* get the buffer where the record is, if it's in WAL buffers at all */
idx = XLogRecPtrToBufIdx(recptr);

/* prevent the WAL buffer from being evicted while we look at it */
LWLockAcquire(WALBufMappingLock, LW_SHARED);

/* Check if the page we're interested in is in the buffer */
found = XLogCtl->xlblocks[idx] == expectedEndPtr;

LWLockRelease(WALBufMappingLock);
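
To make the arithmetic in the snippet above concrete, here is a minimal stand-alone sketch. The helper names page_end_ptr and buf_idx and the N_WAL_BUFFERS constant are made up for illustration; buf_idx only mirrors what XLogRecPtrToBufIdx is assumed to compute (page number modulo the number of WAL buffers).

```c
#include <assert.h>
#include <stdint.h>

#define XLOG_BLCKSZ   8192   /* default WAL block size */
#define N_WAL_BUFFERS 4      /* hypothetical, tiny on purpose */

typedef uint64_t XLogRecPtr;

/* End pointer of the page that contains recptr (hypothetical helper). */
static XLogRecPtr
page_end_ptr(XLogRecPtr recptr)
{
    return recptr + (XLOG_BLCKSZ - recptr % XLOG_BLCKSZ);
}

/* Buffer slot for recptr: page number modulo the buffer count (assumed mapping). */
static int
buf_idx(XLogRecPtr recptr)
{
    assert(N_WAL_BUFFERS > 0);
    return (int) ((recptr / XLOG_BLCKSZ) % N_WAL_BUFFERS);
}
```

For a record at byte 100 of the fourth page (recptr = 3 * 8192 + 100), the page ends at 4 * 8192 and the record maps to slot 3. A recptr that sits exactly on a page boundary belongs to the page that starts there, so its end pointer is one full block later.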

> Hence, I put up a patch that basically tracks the
> oldest initialized WAL buffer page, named OldestInitializedPage, in
> XLogCtl. With OldestInitializedPage, we can easily illustrate WAL
> buffers array properties:
> 1) At any given point of time, pages in the WAL buffers array are
> sorted in an ascending order from OldestInitializedPage till
> InitializedUpTo. Note that we verify this property for assert-only
> builds, see IsXLogBuffersArraySorted() in the patch for more details.
> 2) OldestInitializedPage is monotonically increasing (by virtue of how
> postgres generates WAL records), that is, its value never decreases.
> This property lets someone read its value without a lock. There's no
> problem even if its value is slightly stale i.e. concurrently being
> updated. One can still use it for finding if a given WAL record is
> available in WAL buffers. At worst, one might get false positives
> (i.e. OldestInitializedPage may tell that the WAL record is available
> in WAL buffers, but when one actually looks at it, it isn't really
> available). This is more efficient and performant than acquiring a
> lock for reading. Note that we may not need a lock to read
> OldestInitializedPage but we need to update it holding
> WALBufMappingLock.

You actually hint at the above solution here, so I'm confused. If you're
OK with slightly stale results, you can skip the WALBufMappingLock
above too, and perform an atomic read of xlblocks[idx] instead.
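
A rough sketch of that lock-free variant, written as a stand-alone model rather than PostgreSQL code: the xlblocks array here is a local stand-in built on C11 atomics, and install_page, page_maybe_in_buffers and N_WAL_BUFFERS are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define XLOG_BLCKSZ   8192
#define N_WAL_BUFFERS 4      /* hypothetical */

typedef uint64_t XLogRecPtr;

/* Stand-in for XLogCtl->xlblocks: end pointer of the page held in each slot. */
static _Atomic uint64_t xlblocks[N_WAL_BUFFERS];

/* Model of a writer initializing a buffer slot for the page at page_start. */
static void
install_page(XLogRecPtr page_start)
{
    int idx = (int) ((page_start / XLOG_BLCKSZ) % N_WAL_BUFFERS);

    assert(page_start % XLOG_BLCKSZ == 0);
    atomic_store(&xlblocks[idx], page_start + XLOG_BLCKSZ);
}

/* Lock-free check; the answer can be stale by the time the caller acts on it. */
static bool
page_maybe_in_buffers(XLogRecPtr recptr)
{
    XLogRecPtr expected_end = recptr + (XLOG_BLCKSZ - recptr % XLOG_BLCKSZ);
    int        idx = (int) ((recptr / XLOG_BLCKSZ) % N_WAL_BUFFERS);

    return atomic_load(&xlblocks[idx]) == expected_end;
}
```

Because the slot can be reused right after the load, a caller that goes on to copy data out of the page would presumably need to re-check xlblocks[idx] after the copy and discard the data if the value changed.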

> 3) One can start traversing WAL buffers from OldestInitializedPage
> till InitializedUpTo to list out all valid WAL records and stats, and
> expose them via SQL-callable functions to users, for instance, as
> pg_walinspect functions.
> 4) WAL buffers array is inherently organized as a circular, sorted and
> rotated array with OldestInitializedPage as pivot/first element of the
> array with the property where LSN of previous buffer page (if valid)
> is greater than OldestInitializedPage and LSN of the next buffer page
> (if valid) is greater than OldestInitializedPage.

These properties are true; maybe we should document them explicitly in a
comment. But I don't see the point of tracking OldestInitializedPage. It
seems cheap enough that we could, if there's a need for it, but I don't
see the need.
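
For illustration, the "circular, sorted and rotated" property could be checked along these lines. This is only a sketch under assumptions: ring_is_sorted is a made-up name (it is not the patch's IsXLogBuffersArraySorted), the array holds page end pointers with 0 meaning "never initialized", and oldest_page is the start LSN of the oldest page.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define XLOG_BLCKSZ 8192

typedef uint64_t XLogRecPtr;

/*
 * Returns true if the valid (nonzero) entries of xlblocks[0..n-1], visited in
 * ring order starting at the slot of oldest_page, are strictly ascending.
 */
static bool
ring_is_sorted(const XLogRecPtr *xlblocks, int n, XLogRecPtr oldest_page)
{
    assert(n > 0);

    int        start = (int) ((oldest_page / XLOG_BLCKSZ) % (uint64_t) n);
    XLogRecPtr prev = 0;

    for (int i = 0; i < n; i++)
    {
        XLogRecPtr cur = xlblocks[(start + i) % n];

        if (cur == 0)
            continue;           /* slot never initialized */
        if (cur <= prev)
            return false;       /* order broken relative to the pivot */
        prev = cur;
    }
    return true;
}
```

With four slots holding pages 4, 5, 2 and 3 (in slot order), the walk succeeds when it starts from page 2 and fails when it starts from slot 0, which is the sense in which the oldest page acts as the pivot.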

Heikki Linnakangas
Neon (
