Re: pgsql: Prevent invalidation of newly synced replication slots.

From: Andres Freund <andres(at)anarazel(dot)de>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Amit Kapila <akapila(at)postgresql(dot)org>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: pgsql: Prevent invalidation of newly synced replication slots.
Date: 2026-01-28 16:53:59
Message-ID: j3fa57s3im2wbuhz33cmbg56lgpbrtt25qq7irou336pawd2jo@lloejwmt3sxs
Lists: pgsql-committers pgsql-hackers

Hi,

On 2026-01-28 18:05:10 +0530, Amit Kapila wrote:
> I noticed that the previous test didn't quit the background psql
> session used for the concurrent checkpoint. By quitting that background
> session, the test passed for me consistently. See attached. It is
> written in comments atop background_psql: "Be sure to "quit" the
> returned object when done with it.". Now, this background session
> doesn't directly access the backup_label file but it could be
> accessing one of the parent directories where backup_label is present.

Hm. I've seen (and complained about [1]) weird errors when not shutting down
IPC::Run processes - mostly the test hanging at the end though.

> One of the gen-AI tools says the following: "In Windows, MoveFileEx (Error 32:
> ERROR_SHARING_VIOLATION) can fail if a process is accessing the file's
> parent directory in a way that creates a lock. While the error message
> usually points to the file itself, the parent folder is a critical
> part of the operation.".

I don't see how that could plausibly be the reason - after all, we have a lot
of other files open in the relevant directories. But: it seems to fix the
problem for you, so it's worth going for, as it's the right thing to do
anyway.

Separately from committing the workaround, I think it'd be worth trying to
figure out what's holding the file open. Andrey observed that the tests pass
for him with a much longer timeout. If you can reproduce it locally, I'd try
to use something like [2] to see what has handles open to the relevant files
while waiting for the timeout.
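
To make the "inspect while waiting" idea concrete, here is a minimal sketch in
Python (not the Perl the TAP tests use, and not PostgreSQL's actual rename
code). It retries a rename until a deadline and calls a hook after every
failed attempt, which is where one could run a handle-listing tool such as [2].
The names rename_with_retry and on_retry are hypothetical, and it assumes the
sharing violation surfaces as PermissionError, as I'd expect from Python on
Windows.

```python
import os
import time


def rename_with_retry(src, dst, timeout=10.0, interval=0.1,
                      rename=os.replace, on_retry=None):
    """Retry a rename until it succeeds or `timeout` seconds have passed.

    Assumption: a Windows sharing violation is reported as PermissionError.
    `on_retry(attempt, exc)` is a hypothetical hook called after each failed
    attempt; it is the place to record which processes hold handles open.
    Returns the number of attempts that were needed.
    """
    deadline = time.monotonic() + timeout
    attempt = 0
    while True:
        attempt += 1
        try:
            rename(src, dst)
            return attempt
        except PermissionError as exc:
            if on_retry is not None:
                on_retry(attempt, exc)
            if time.monotonic() >= deadline:
                # Out of time: let the caller see the original error.
                raise
            time.sleep(interval)
```

If the hook's output shows the same process holding the file across all
attempts, that would point at the actual culprit rather than at the timeout
merely being too short.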

Greetings,

Andres Freund

[1] https://postgr.es/m/20240619030727.ldp3mcrjbd5fqwj5%40awork3.anarazel.de
[2] https://learn.microsoft.com/en-us/sysinternals/downloads/handle
