Re: BUG #15636: PostgreSQL 11.1 pg_basebackup backup to a CIFS destination throws fsync error at end of backup

From: John Klann <jk7255(at)gmail(dot)com>
To: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, PostgreSQL mailing lists <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: Re: BUG #15636: PostgreSQL 11.1 pg_basebackup backup to a CIFS destination throws fsync error at end of backup
Date: 2019-02-14 22:52:14
Message-ID: CAHyX5+VO-UTE-pK2i1rWvweU2CpVtHuhXnok29Z66EpLHmiHvw@mail.gmail.com
Lists: pgsql-bugs

+adding back to thread

On Thu, Feb 14, 2019 at 5:49 PM John Klann <jk7255(at)gmail(dot)com> wrote:

> Thanks Thomas, that certainly makes sense considering the commit comment as
> well: "Directory operations under CIFS/SMB2/SMB3 are synchronous, so
> fsync()".
>
> I think I will probably not use the --no-sync option then, and will handle
> the EINVAL messages instead.
>
>
> On Thu, Feb 14, 2019 at 5:31 PM Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com> wrote:
>
>> Hi John,
>> Unfortunately that applies only to directories, not to regular files
>> inside them.
>> --no-sync should get you past the problem as a workaround for now, but
>> then if your server(s) crash/lose power in the following seconds the
>> data might not be on disk. Fsync just means that the program doesn't
>> return until the data is flushed. Within some short period of time
>> the data will be flushed anyway, and a power failure in the short window
>> before that is really unlikely; it's more a theoretical issue that we
>> database hackers like to worry about.
>>
>> On Fri, Feb 15, 2019 at 11:22 AM John Klann <jk7255(at)gmail(dot)com> wrote:
>> >
>> > Ah thank you this all makes sense.
>> >
>> > If CIFS is synchronous and fsync'ing is not necessary, then running
>> > with the --no-sync option should be safe and possibly more performant,
>> > correct?
>> >
>> > John
>> >
>> > On Thu, Feb 14, 2019 at 5:06 PM Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com> wrote:
>> >>
>> >> On Fri, Feb 15, 2019 at 10:15 AM PG Bug reporting form
>> >> <noreply(at)postgresql(dot)org> wrote:
>> >> > pg_basebackup: could not fsync file
>> >> > "/cifs/backups/<backupDirectoryName>/basebkp/base/1": Invalid argument
>> >> > pg_basebackup: could not fsync file
>> >>
>> >> Hmm, it looks like your system gives EINVAL when you try to fsync a
>> >> directory. Perhaps we should teach fsync_fname() about that here:
>> >>
>> >>     /*
>> >>      * Some OSes don't allow us to fsync directories at all, so we can ignore
>> >>      * those errors. Anything else needs to be reported.
>> >>      */
>> >>     if (returncode != 0 && !(isdir && errno == EBADF))
>> >>     {
>> >>         fprintf(stderr, _("%s: could not fsync file \"%s\": %s\n"),
>> >>                 progname, fname, strerror(errno));
>> >>         (void) close(fd);
>> >>         return -1;
>> >>     }
>> >>
>> >> EINVAL actually makes more sense to me than EBADF for a filesystem
>> >> that can't fsync directories. From POSIX: EINVAL = "The fildes
>> >> argument does not refer to a file on which this operation is
>> >> possible." vs EBADF "The fildes argument is not a valid descriptor."
>> >> It *is* a valid descriptor, it's just not a valid operation
>> >> (apparently).
>> >>
>> >> Quick googling on the topic tells me that CIFS directory operations
>> >> are "synchronous", so fsync'ing isn't necessary. However, they only
>> >> made it silently do nothing in a recent version:
>> >>
>> >>
>> >> https://github.com/torvalds/linux/commit/6e70c267e68d77679534dcf4aaf84e66f2cf1425
>> >>
>> >> Presumably before that you get EINVAL because there is no handler
>> >> registered. The commit message even mentions that this was breaking
>> >> stuff like us.
>> >>
>> >> --
>> >> Thomas Munro
>> >> http://www.enterprisedb.com
>>
>>
>>
>> --
>> Thomas Munro
>> http://www.enterprisedb.com
>>
>
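
For reference, the suggestion quoted above (teaching fsync_fname() to
tolerate EINVAL as well as EBADF when the target is a directory) could look
roughly like the sketch below. This is a minimal standalone illustration, not
the PostgreSQL source and not a committed patch. The function names
fsync_error_is_ignorable() and fsync_path() are hypothetical, and the sketch
assumes, as the thread does, that a filesystem which cannot fsync directories
reports either EBADF or EINVAL.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper (not PostgreSQL code): report whether an fsync()
 * failure can be ignored.  Only directory fsyncs failing with EBADF or
 * EINVAL are tolerated; any error on a regular file must be reported.
 */
int
fsync_error_is_ignorable(int isdir, int err)
{
    return isdir && (err == EBADF || err == EINVAL);
}

/*
 * Hypothetical stand-in for the check discussed in the thread: open the
 * path, fsync it, and return 0 on success or -1 on a reportable failure.
 */
int
fsync_path(const char *path, int isdir)
{
    int fd;

    /* Directories can only be opened read-only; files are opened read-write. */
    fd = open(path, isdir ? O_RDONLY : O_RDWR);
    if (fd < 0)
    {
        fprintf(stderr, "could not open file \"%s\": %s\n",
                path, strerror(errno));
        return -1;
    }

    if (fsync(fd) != 0 && !fsync_error_is_ignorable(isdir, errno))
    {
        /* Save errno before calling anything that might overwrite it. */
        int save_errno = errno;

        fprintf(stderr, "could not fsync file \"%s\": %s\n",
                path, strerror(save_errno));
        (void) close(fd);
        return -1;
    }

    if (close(fd) != 0)
    {
        fprintf(stderr, "could not close file \"%s\": %s\n",
                path, strerror(errno));
        return -1;
    }
    return 0;
}
```

Called with isdir set for a directory on such a mount, fsync_path() would
return 0 instead of failing. Errors on regular files are still reported,
which is consistent with the point made in the thread that the synchronous
behaviour of CIFS applies only to directories, not to the files inside them.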
