Re: logical replication: could not create file "state.tmp": File exists

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: Grigory Smolkin <g(dot)smolkin(at)postgrespro(dot)ru>
Cc: Pg Bugs <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: logical replication: could not create file "state.tmp": File exists
Date: 2019-12-02 04:35:47
Message-ID: 20191202043547.GE1696@paquier.xyz
Lists: pgsql-bugs

On Sat, Nov 30, 2019 at 03:09:39PM +0300, Grigory Smolkin wrote:
> I've dug into this problem a bit, and it turns out that in
> SaveSlotToPath() the temp file for the replication slot is opened with
> 'O_CREAT | O_EXCL' flags, which makes this routine not very reentrant.

Which I/O problem did you see before hitting the error reported
here? Was it ENOSPC, an fsync failure, or a failure to close the
fd? My guess is the first. An fsync failure would not trigger a
PANIC here (though it really should!), but I am raising that issue
in a separate thread.

> Since an exclusive lock is taken before temp file creation, I think it
> should be safe to replace O_EXCL with O_TRUNC.
> Script to reproduce and patch are attached.

Agreed. I prefer the O_TRUNC option because it means less code churn.
Also, since it can still be useful to look at the temporary state
file after a crash or a failure, calling unlink() in the error code
paths is not a good option IMO.
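
To make the difference concrete, here is a minimal standalone sketch. It is
not the PostgreSQL code: save_state() and leave_stale_tmp() are hypothetical
helpers, and the write, fsync, close, rename sequence is only assumed to
resemble what SaveSlotToPath() does. It shows that once a failed attempt has
left a temporary file behind, a retry with O_EXCL fails with EEXIST, while a
retry with O_TRUNC reuses the file.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper, not PostgreSQL code: write buf to tmppath, fsync it,
 * then rename it over path.  excl_or_trunc is either O_EXCL or O_TRUNC.
 * Returns 0 on success, or the errno of the step that failed.  On failure
 * the temporary file is deliberately left in place (no unlink()).
 */
static int
save_state(const char *tmppath, const char *path,
           const char *buf, int excl_or_trunc)
{
    size_t  len = strlen(buf);
    ssize_t n;
    int     fd;
    int     err;

    fd = open(tmppath, O_CREAT | excl_or_trunc | O_WRONLY, 0600);
    if (fd < 0)
        return errno;       /* EEXIST if O_EXCL meets a stale temp file */

    n = write(fd, buf, len);
    if (n < 0 || (size_t) n != len)
    {
        /* Assumption of this sketch: a short write means out of space. */
        err = (n < 0) ? errno : ENOSPC;
        close(fd);
        return err;
    }

    if (fsync(fd) != 0)
    {
        err = errno;
        close(fd);
        return err;
    }

    if (close(fd) != 0)
        return errno;

    if (rename(tmppath, path) != 0)
        return errno;

    return 0;
}

/* Simulate an earlier failed save by leaving a stale temp file behind. */
static int
leave_stale_tmp(const char *tmppath)
{
    int fd = open(tmppath, O_CREAT | O_TRUNC | O_WRONLY, 0600);

    if (fd < 0)
        return errno;
    if (write(fd, "partial", 7) != 7)
    {
        close(fd);
        return EIO;
    }
    return (close(fd) != 0) ? errno : 0;
}
```

After leave_stale_tmp("state.tmp"), calling save_state("state.tmp", "state",
data, O_EXCL) returns EEXIST, whereas the same call with O_TRUNC overwrites
the leftover file and completes the rename. The sketch assumes, as the quoted
analysis does, that an exclusive lock already serializes writers, so O_EXCL
is not needed to guard against a concurrent creator.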

Do others have thoughts or objections to share?
--
Michael
