From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Andres Freund <andres(at)anarazel(dot)de> |
Cc: | Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: subscriptionCheck failures on nightjar |
Date: | 2019-02-13 17:37:35 |
Message-ID: | 29708.1550079455@sss.pgh.pa.us |
Lists: | pgsql-hackers |

Andres Freund <andres(at)anarazel(dot)de> writes:
> On 2019-02-13 11:57:32 -0500, Tom Lane wrote:
>> I've managed to reproduce this locally, and obtained this PANIC:
> Cool. How exactly?

Andrew told me that nightjar is actually running in a qemu VM,
so I set up FreeBSD 9.0 in a qemu VM, and boom. It took a bit
of fiddling with qemu parameters, but for such a timing-sensitive
problem, that's not surprising.

>> Anyway, I think we might be able to fix this along the lines of
>> [ fsync the data before renaming not after ]
> Hm, but that's not the same? On some filesystems one needs the directory
> fsync, on some the file fsync, and I think both in some cases.

Now that I look at it, there's a pg_fsync() just above this, so
I wonder why we need a second fsync on the file at all. fsync'ing
the directory is needed to ensure the directory entry is on disk;
but the file data should be out already, or else the kernel is
simply failing to honor fsync.

>> The existing code here seems simply wacky/unsafe to me regardless
>> of this race condition: couldn't it potentially result in a corrupt
>> snapshot file appearing with a valid name, if the system crashes
>> after persisting the rename but before it's pushed the data out?
> What do you mean precisely with "before it's pushed the data out"?

Given the previous pg_fsync, this isn't an issue.

>> I also wonder why bother with the directory sync just before the
>> rename.
> Because on some FS/OS combinations the size of the renamed-into-place
> file isn't guaranteed to be durable unless the directory was
> fsynced.

Bleah. But in any case, the rename should not create a situation
in which we need to fsync the file data again.

regards, tom lane
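
The ordering discussed in this message can be sketched in plain POSIX C as follows. This is only an illustration under stated assumptions, not code from this thread or from PostgreSQL: `fsync_dir` and `write_file_durably` are hypothetical helper names, the fixed-size path buffers are a simplification, and raw `fsync()` stands in for `pg_fsync()`. The file data is fsynced once while the file still has its temporary name, the directory is fsynced before the rename (the case Andres describes), and only the directory is fsynced after the rename, since the file data should already be on disk.

```c
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper (not a PostgreSQL function): fsync a directory by path. */
static int
fsync_dir(const char *dirpath)
{
    int fd = open(dirpath, O_RDONLY);
    int rc;

    if (fd < 0)
        return -1;
    rc = fsync(fd);             /* persists the directory entries */
    close(fd);
    return rc;
}

/*
 * Hypothetical helper: write buf to dir/tmpname, make it durable, then
 * rename it to dir/finalname and make the new directory entry durable.
 * Returns 0 on success, -1 on any failure.
 */
static int
write_file_durably(const char *dir, const char *tmpname,
                   const char *finalname, const void *buf, size_t len)
{
    char tmppath[1024];
    char finalpath[1024];
    int  fd;

    snprintf(tmppath, sizeof(tmppath), "%s/%s", dir, tmpname);
    snprintf(finalpath, sizeof(finalpath), "%s/%s", dir, finalname);

    fd = open(tmppath, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;

    /* 1. Write the data and fsync the file under its temporary name. */
    if (write(fd, buf, len) != (ssize_t) len || fsync(fd) != 0)
    {
        close(fd);
        unlink(tmppath);
        return -1;
    }
    if (close(fd) != 0)
        return -1;

    /* 2. fsync the directory before the rename (needed on some FS/OS combos). */
    if (fsync_dir(dir) != 0)
        return -1;

    /* 3. Atomically move the fully written file into place. */
    if (rename(tmppath, finalpath) != 0)
        return -1;

    /* 4. fsync the directory again so the rename itself is durable.
     *    No second fsync of the file data is issued here. */
    return fsync_dir(dir);
}
```

A caller might use it as `write_file_durably("/some/dir", "snap.tmp", "snap", data, datalen)`, with all names hypothetical. Note that fsync on a directory descriptor is not supported everywhere, so portable code may need to tolerate errors such as EBADF or EINVAL from step 2 and step 4.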