Re: Cygwin cleanup

From: Andres Freund <andres(at)anarazel(dot)de>
To: Justin Pryzby <pryzby(at)telsasoft(dot)com>
Cc: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Andrew Dunstan <andrew(at)dunslane(dot)net>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: Cygwin cleanup
Date: 2023-01-24 00:36:22
Message-ID: 20230124003622.rg6trhbfhnxmnprj@awork3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2023-01-23 17:28:14 -0600, Justin Pryzby wrote:
> On Thu, Jan 12, 2023 at 10:17:55PM -0600, Justin Pryzby wrote:
> > On Thu, Jan 12, 2023 at 06:43:54PM -0800, Andres Freund wrote:
> > > > It looks like logical decoding may be the "most wrong" place that
> > > > wal_sync_method is being used, so maybe my change is reasonable to
> > > > consider, and not just a workaround.
> > >
> > > I don't follow. What does using fsync_fname() vs fsync_fname_ext() have to do
> > > with pg_fsync() using wal_sync_method? fsync_fname() is just a wrapper around
> > > fsync_fname_ext(). Both end up in pg_fsync().
> >
> > My patch used fsync_fname_ext() which would cause an ERROR rather than a
> > PANIC when failing to fsync the logical decoding pathname.
> >
> > > Are you actually proposing that we don't PANIC after an fsync for the category
> > > of files that you list here, even with data_sync_retry set?
> >
> > Yes, but I'm referring only to my change to SnapBuildSerialize().
> >
> > The rest of the verbage was me trying to figure out the
> > history/evolution of pg_fsync usage.
>
> Also note the existing comment (originating from Thomas' "fsync-gate"
> commit, which introduced data_sync_retry):
>
> + * It's safe to just ERROR on fsync() here because we'll retry the whole
> + * operation including the writes.
>
> Also, it seems to work fine if one calls pg_fsync() again, rather than
> calling fsync_fname(), which re-opens the file.

I don't think that'd achieve the same thing necessarily. But it's notoriously
hard to know what which OS requires in this area.

> It also seems to work fine if one omits the initial call to
> fsync_fname("pg_logical/snapshots", true);

I don't think it's a good idea randomly weaken individual fsyncs just because
it somehow, without any theory as to how, fixes tests on cygwin.

> Since SnapBuildSerialize() isn't atomic (the system could crash at any
> point), I'm not seeing why these wouldn't be adequately safe.

I'm not sure what you mean by that. It's atomic from a crash safety view: It
first writes into a tempfile, fsyncs that + directory, then renames the file
into place, fsyncs new filename + directory again. Tempfiles are removed after
a crash. In case of a crash you can either end up with an "old" or a "new"
file.

Greetings,

Andres Freund

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Kyotaro Horiguchi 2023-01-24 00:45:19 Re: Time delayed LR (WAS Re: logical replication restrictions)
Previous Message Peter Smith 2023-01-24 00:28:28 Re: Logical replication timeout problem