O(n^2) system calls in RemoveOldXlogFiles()

From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: O(n^2) system calls in RemoveOldXlogFiles()
Date: 2021-01-11 03:35:56
Message-ID: CA+hUKG+DRiF9z1_MU4fWq+RfJMxP7zjoptfcmuCFPeO4JM2iVg@mail.gmail.com
Lists: pgsql-hackers

Hi,

I noticed that RemoveXlogFile() has this code:

    /*
     * Before deleting the file, see if it can be recycled as a future log
     * segment. Only recycle normal files, pg_standby for example can create
     * symbolic links pointing to a separate archive directory.
     */
    if (wal_recycle &&
        endlogSegNo <= recycleSegNo &&
        lstat(path, &statbuf) == 0 && S_ISREG(statbuf.st_mode) &&
        InstallXLogFileSegment(&endlogSegNo, path,
                               true,
                               recycleSegNo, true))
    {
        ereport(DEBUG2,
                (errmsg("recycled write-ahead log file \"%s\"",
                        segname)));
        CheckpointStats.ckpt_segs_recycled++;
        /* Needn't recheck that slot on future iterations */
        endlogSegNo++;
    }

I didn't check the history of this code, but it seems that
endlogSegNo doesn't currently have the right scope to achieve the
goal of that last comment, so checkpoints end up repeatedly searching
for the next free slot, starting at the low end each time, like so:

stat("pg_wal/00000001000000000000004F", {st_mode=S_IFREG|0600, st_size=16777216, ...}) = 0
...
stat("pg_wal/000000010000000000000073", 0x7fff98b9e060) = -1 ENOENT (No such file or directory)

stat("pg_wal/00000001000000000000004F", {st_mode=S_IFREG|0600, st_size=16777216, ...}) = 0
...
stat("pg_wal/000000010000000000000074", 0x7fff98b9e060) = -1 ENOENT (No such file or directory)

... and so on until we've recycled all our recyclable segments. Ouch.
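
To make the cost concrete, here is a small standalone model of the pattern, not PostgreSQL code. Every name in it (FakeWalDir, slot_exists, install_segment, the two recycle_one_* helpers) is hypothetical, and the by-pointer variant is only one possible way to keep the cursor alive across calls. It counts simulated stat() probes when the caller's cursor is lost after each call versus when it is preserved.

```c
#include <stdbool.h>

#define NSLOTS 64               /* size of the simulated segment range */

/* Hypothetical stand-in for pg_wal: occupied[i] means segment i exists. */
typedef struct
{
    bool occupied[NSLOTS];
    long probes;                /* number of simulated stat() calls */
} FakeWalDir;

/* Simulated stat(): every existence check costs one probe. */
static bool
slot_exists(FakeWalDir *dir, int segno)
{
    dir->probes++;
    return dir->occupied[segno];
}

/*
 * Hypothetical analogue of the install step: starting at *segno, find the
 * first free slot not beyond max_segno and claim it.  On success *segno is
 * left pointing at the claimed slot.
 */
static bool
install_segment(FakeWalDir *dir, int *segno, int max_segno)
{
    while (*segno <= max_segno && slot_exists(dir, *segno))
        (*segno)++;
    if (*segno > max_segno)
        return false;
    dir->occupied[*segno] = true;
    return true;
}

/* Cursor passed by value: the increment is lost when the function returns. */
static void
recycle_one_by_value(FakeWalDir *dir, int endlog_segno, int recycle_segno)
{
    if (install_segment(dir, &endlog_segno, recycle_segno))
        endlog_segno++;         /* has no effect on the caller's copy */
}

/* Cursor passed by pointer: the next call resumes after the claimed slot. */
static void
recycle_one_by_pointer(FakeWalDir *dir, int *endlog_segno, int recycle_segno)
{
    if (install_segment(dir, endlog_segno, recycle_segno))
        (*endlog_segno)++;
}

/* Recycle nfiles files into an empty range; return the total probe count. */
long
probes_by_value(int nfiles)
{
    FakeWalDir dir = {0};
    int endlog_segno = 0;

    for (int i = 0; i < nfiles; i++)
        recycle_one_by_value(&dir, endlog_segno, NSLOTS - 1);
    return dir.probes;
}

long
probes_by_pointer(int nfiles)
{
    FakeWalDir dir = {0};
    int endlog_segno = 0;

    for (int i = 0; i < nfiles; i++)
        recycle_one_by_pointer(&dir, &endlog_segno, NSLOTS - 1);
    return dir.probes;
}
```

In this model, recycling n files into an initially empty range costs n(n+1)/2 probes when the cursor is reset on every call, and n probes when it is carried forward. For example, probes_by_value(32) returns 528 while probes_by_pointer(32) returns 32.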
