| From: | Andres Freund <andres(at)anarazel(dot)de> |
|---|---|
| To: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> |
| Cc: | Robert Haas <robertmhaas(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Amit Kapila <akapila(at)postgresql(dot)org>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
| Subject: | Re: pgsql: Prevent invalidation of newly synced replication slots. |
| Date: | 2026-01-28 16:53:59 |
| Message-ID: | j3fa57s3im2wbuhz33cmbg56lgpbrtt25qq7irou336pawd2jo@lloejwmt3sxs |
| Lists: | pgsql-committers pgsql-hackers |
Hi,
On 2026-01-28 18:05:10 +0530, Amit Kapila wrote:
> I noticed that the previous test didn't quit the background psql
> session used for concurrent checkpoint. By quitting that background
> session, the test passed for me consistently. See attached. It is
> written in comments atop background_psql: "Be sure to "quit" the
> returned object when done with it.". Now, this background session
> doesn't directly access the backup_label file but it could be
> accessing one of the parent directories where backup_label is present.
Hm. I've seen (and complained about [1]) weird errors when not shutting down
IPC::Run processes - mostly the test hanging at the end though.
> One of gen-AI says as follows: "In Windows, MoveFileEx (Error 32:
> ERROR_SHARING_VIOLATION) can fail if a process is accessing the file's
> parent directory in a way that creates a lock. While the error message
> usually points to the file itself, the parent folder is a critical
> part of the operation.".
I don't see how that could plausibly be the reason - after all, we have a lot
of other files open in the relevant directories. But: It seems to fix
the problem for you, so it's worth going for it, as it's the right thing to do
anyway.
I think it'd be worthwhile, separately from committing the workaround, to try to
figure out what's holding the file open. Andrey observed that the tests pass
for him with a much longer timeout. If you can reproduce it locally, I'd try
to use something like [2] to see what has handles open to the relevant files,
while waiting for the timeout.
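To make that concrete, a rough sketch of such a polling loop follows. It is in Python, and it assumes the Sysinternals handle.exe from [2] is on PATH and that it is run from an elevated prompt. The helper names and the "backup_label" search string are purely illustrative and not part of any existing test infrastructure.

```python
import subprocess
import time
from typing import Callable, List


def handle_command(name_fragment: str, exe: str = "handle.exe") -> List[str]:
    """Build a handle.exe command line that searches open handles by name.

    -accepteula avoids the first-run license dialog and -nobanner drops the
    startup banner. The executable name and location are assumptions.
    """
    return [exe, "-accepteula", "-nobanner", name_fragment]


def run_handle(cmd: List[str]) -> str:
    """Run the command and return its stdout, without raising on a nonzero exit."""
    result = subprocess.run(cmd, capture_output=True, text=True, check=False)
    return result.stdout


def poll_handles(
    name_fragment: str,
    duration_s: float = 60.0,
    interval_s: float = 1.0,
    run: Callable[[List[str]], str] = run_handle,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> List[str]:
    """Take one snapshot per interval until duration_s has elapsed.

    Each snapshot is the raw text output of one handle.exe invocation, so it
    can be compared by eye against the time at which the rename failed.
    """
    snapshots: List[str] = []
    deadline = clock() + duration_s
    while True:
        snapshots.append(run(handle_command(name_fragment)))
        if clock() >= deadline:
            break
        sleep(interval_s)
    return snapshots
```

Something like `poll_handles("backup_label", duration_s=30)` started while the test is waiting for its timeout, with the snapshots printed or written to a file, should show which process, if any, still holds a handle matching that name.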
Greetings,
Andres Freund
[1] https://postgr.es/m/20240619030727.ldp3mcrjbd5fqwj5%40awork3.anarazel.de
[2] https://learn.microsoft.com/en-us/sysinternals/downloads/handle