Re: BUG #15636: PostgreSQL 11.1 pg_basebackup backup to a CIFS destination throws fsync error at end of backup

From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
Cc: John Klann <jk7255(at)gmail(dot)com>, PostgreSQL mailing lists <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: Re: BUG #15636: PostgreSQL 11.1 pg_basebackup backup to a CIFS destination throws fsync error at end of backup
Date: 2019-02-18 16:23:39
Message-ID: 20190218162338.GG6197@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

Greetings,

* Thomas Munro (thomas(dot)munro(at)enterprisedb(dot)com) wrote:
> On Sat, Feb 16, 2019 at 4:39 AM John Klann <jk7255(at)gmail(dot)com> wrote:
> > I was able to test the patch without issue and it appears to have worked see below. Let me know if there is further testing I can do or logging you would like.
>
> Thanks. So that leaves the question for the list: should we accept
> this as a bug and consider back-patching it? Our relationship with
> fsync() is ... complicated at the moment.

Yes, we should accept this as a bug and back-patch it, imv.

> Considering that we already
> tolerated EBADF for directories on some other (forgotten?) operating
> system, I think I would vote +1 for doing that ... if I didn't know
> that they'd fixed this on the Linux side, and I see now that they've
> back-patched that even to long term kernel 3.16:
>
> https://cdn.kernel.org/pub/linux/kernel/v3.x/ChangeLog-3.16.60

Which seems like it's just another good reason why we should also
back-patch it.

> PS I found an earlier discussion of this topic that didn't go
> anywhere: https://www.postgresql.org/message-id/flat/E1RhkjA-0005bq-B6%40wrigleys.postgresql.org

As does this.

I would be much more concerned about a change like this if we either
didn't know it was a directory that we got fsync() EINVAL for. Since we
know it's a directory and we know there's filesystems out there which do
synchronous metadata changes and we know they return EINVAL in those
cases, I'm inclined to say we should allow it. In the above thread, Tom
mentioned the standard, which pretty clearly distinguishes between "this
thing you gave to fsync() doesn't support synchronization" and "an I/O
error occured" and it's really the latter that we care about here.

Further, saying "just use --no-sync" is *not* a workaround. That's
throwing the baby out with the bathwater. A workaround would be an
option like "--do-not-complain-about-directory-fsync-einval", but
suggesting that all fsync's be removed is terrible.

Thanks!

Stephen

In response to

Browse pgsql-bugs by date

  From Date Subject
Next Message Stephen Frost 2019-02-18 16:26:38 Re: BUG #15636: PostgreSQL 11.1 pg_basebackup backup to a CIFS destination throws fsync error at end of backup
Previous Message Tom Lane 2019-02-18 15:25:50 Re: BUG #15640: FATAL: XX000: cannot read pg_class without having selected a database