Re: subscriptionCheck failures on nightjar

From: Andres Freund <andres(at)anarazel(dot)de>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: subscriptionCheck failures on nightjar
Date: 2019-02-13 17:41:51
Message-ID: 20190213174151.mfylkessxmapt4io@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2019-02-13 12:37:35 -0500, Tom Lane wrote:
> Andres Freund <andres(at)anarazel(dot)de> writes:
> > On 2019-02-13 11:57:32 -0500, Tom Lane wrote:
> >> I've managed to reproduce this locally, and obtained this PANIC:
>
> > Cool. How exactly?
>
> Andrew told me that nightjar is actually running in a qemu VM,
> so I set up freebsd 9.0 in a qemu VM, and boom. It took a bit
> of fiddling with qemu parameters, but for such a timing-sensitive
> problem, that's not surprising.

Ah.

> >> I also wonder why bother with the directory sync just before the
> >> rename.
>
> > Because on some FS/OS combinations the size of the renamed-into-place
> > file isn't guaranteed to be durable unless the directory was
> > fsynced.
>
> Bleah. But in any case, the rename should not create a situation
> in which we need to fsync the file data again.

Well, it's not super well defined which of either you need to make the
rename durable, and it appears to differ between OSs. Any argument
against fixing it up like I suggested, by using an fd from before the
rename?

Greetings,

Andres Freund

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2019-02-13 17:46:50 Re: reducing isolation tests runtime
Previous Message Tom Lane 2019-02-13 17:41:41 Re: reducing isolation tests runtime