Re: subscriptionCheck failures on nightjar

From: Andres Freund <andres(at)anarazel(dot)de>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: subscriptionCheck failures on nightjar
Date: 2019-02-13 18:12:25
Message-ID: 20190213181225.fathyapig4sm4exa@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2019-02-13 12:59:19 -0500, Tom Lane wrote:
> Andres Freund <andres(at)anarazel(dot)de> writes:
> > On 2019-02-13 12:37:35 -0500, Tom Lane wrote:
> >> Bleah. But in any case, the rename should not create a situation
> >> in which we need to fsync the file data again.
>
> > Well, it's not super well defined which of either you need to make the
> > rename durable, and it appears to differ between OSs. Any argument
> > against fixing it up like I suggested, by using an fd from before the
> > rename?
>
> I'm unimpressed. You're speculating about the filesystem having random
> deviations from POSIX behavior, and using that weak argument to justify a
> totally untested technique having its own obvious portability hazards.

Uhm, we've reproduced failures due to the lack of such fsyncs at some
point. And not some fringe OS, but ext4 (albeit with data=writeback).

I don't think POSIX has yet figured out what they actually think is
required:
http://austingroupbugs.net/view.php?id=672

I guess we could just ignore ENOENT in this case, that ought to be just
as safe as using the old fd.

> Also, I wondered why this is coming out as a PANIC. I thought originally
> that somebody must be causing this code to run in a critical section,
> but it looks like the real issue is just that fsync_fname() uses
> data_sync_elevel, which is
>
> int
> data_sync_elevel(int elevel)
> {
> return data_sync_retry ? elevel : PANIC;
> }
>
> I really really don't want us doing questionably-necessary fsyncs with a
> PANIC as the result.

Well, given the 'failed fsync throws dirty data away' issue, I don't
quite see what we can do otherwise. But:

> Perhaps more to the point, the way this was coded, the PANIC applies
> to open() failures in fsync_fname_ext() not just fsync() failures;
> that's painting with too broad a brush isn't it?

That indeed seems wrong. Thomas? I'm not quite sure how to best fix
this though - I guess we could rename fsync_fname_ext's eleval parameter
to fsync_failure_elevel? It's not visible outside fd.c, so that'd not be
to bad?

Greetings,

Andres Freund

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2019-02-13 18:24:03 Re: subscriptionCheck failures on nightjar
Previous Message Tom Lane 2019-02-13 18:01:52 Re: reducing isolation tests runtime