Re: subscriptionCheck failures on nightjar

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: subscriptionCheck failures on nightjar
Date: 2019-02-13 17:59:19
Message-ID: 30608.1550080759@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Andres Freund <andres(at)anarazel(dot)de> writes:
> On 2019-02-13 12:37:35 -0500, Tom Lane wrote:
>> Bleah. But in any case, the rename should not create a situation
>> in which we need to fsync the file data again.

> Well, it's not super well defined which of either you need to make the
> rename durable, and it appears to differ between OSs. Any argument
> against fixing it up like I suggested, by using an fd from before the
> rename?

I'm unimpressed. You're speculating about the filesystem having random
deviations from POSIX behavior, and using that weak argument to justify a
totally untested technique having its own obvious portability hazards.
Who's to say that an fsync on a file opened before a rename is going to do
anything good after the rename? (On, eg, NFS there are obvious reasons
why it might not.)

Also, I wondered why this is coming out as a PANIC. I thought originally
that somebody must be causing this code to run in a critical section,
but it looks like the real issue is just that fsync_fname() uses
data_sync_elevel, which is

int
data_sync_elevel(int elevel)
{
return data_sync_retry ? elevel : PANIC;
}

I really really don't want us doing questionably-necessary fsyncs with a
PANIC as the result. Perhaps more to the point, the way this was coded,
the PANIC applies to open() failures in fsync_fname_ext() not just fsync()
failures; that's painting with too broad a brush isn't it?

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2019-02-13 18:01:52 Re: reducing isolation tests runtime
Previous Message Andres Freund 2019-02-13 17:46:50 Re: reducing isolation tests runtime