Re: .ready and .done files considered harmful

From: Dipesh Pandit <dipesh(dot)pandit(at)gmail(dot)com>
To: "Bossart, Nathan" <bossartn(at)amazon(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, Jeevan Ladhe <jeevan(dot)ladhe(at)enterprisedb(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, Andres Freund <andres(at)anarazel(dot)de>, Hannu Krosing <hannuk(at)google(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: .ready and .done files considered harmful
Date: 2021-08-19 12:39:53
Message-ID: CAN1g5_EgkiSAd2Hh_T_bC9Z+GmdQTEhR=eb=edh8aKAiLDnysQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

Thanks for the feedback.

> Should we have XLogArchiveNotify(), writeTimeLineHistory(), and
> writeTimeLineHistoryFile() enable the directory scan instead? Else,
> we have to exhaustively cover all such code paths, which may be
> difficult to maintain. Another reason I am bringing this up is that
> my patch for adjusting .ready file creation [0] introduces more
> opportunities for .ready files to be created out-of-order.

XLogArchiveNotify() notifies Archiver when a log segment is ready for
archival by creating a .ready file. This function is being called for each
log segment and placing a call to enable directory scan here will result
in directory scan for each log segment.

We can have writeTimeLineHistory() and writeTimeLineHistoryFile() to
enable directory scan to handle the scenarios related to timeline switch.

However, in other scenarios, I think we have to explicitly call
PgArchEnableDirScan()
to enable directory scan. PgArchEnableDirScan() takes care of waking up
archiver so that the caller of this function need not have to nudge the
archiver.

> + /*
> + * This is a fall-back path, check if we are here due to the
unavailability
> + * of next anticipated log segment or the archiver is being forced to
> + * perform a full directory scan. Reset the flag in shared memory
only if
> + * it has been enabled to force a full directory scan and then
proceed with
> + * directory scan.
> + */
> + if (PgArch->dirScan)
> + PgArch->dirScan = false;

> Why do we need to check that the flag is set before we reset it? I
> think we could just always reset it since we are about to do a
> directory scan anyway

Yes, I agree.

> > If there is a race condition here with setting the flag, then an
> > alternative design would be to use a counter - either a plain old
> > uint64 or perhaps pg_atomic_uint64 - and have the startup process
> > increment the counter when it wants to trigger a scan. In this design,
> > the archiver would never modify the counter itself, but just remember
> > the last value that it saw. If it later sees a different value it
> > knows that a full scan is required. I think this kind of system is
> > extremely robust against the general class of problems that you're
> > talking about here, but I'm not sure whether we need it, because I'm
> > not sure whether there is a race with just the bool.

> I'm not sure, either. Perhaps it would at least be worth adding a
> pg_memory_barrier() after setting dirScan to false to avoid the
> scenario I mentioned (which may or may not be possible). IMO this
> stuff would be much easier to reason about if we used a lock instead,
> even if the synchronization was not strictly necessary. However, I
> don't want to hold this patch up too much on this point.

There is one possible scenario where it may run into a race condition. If
archiver has just finished archiving all .ready files and the next
anticipated
log segment is not available then in this case archiver takes the fall-back
path to scan directory. It resets the flag before it begins directory scan.
Now, if a directory scan is enabled by a timeline switch or .ready file
created
out of order in parallel to the event that the archiver resets the flag
then this
might result in a race condition. But in this case also archiver is
eventually
going to perform a directory scan and the desired file will be archived as
part
of directory scan. Apart of this I can't think of any other scenario which
may
result into a race condition unless I am missing something.

I have incorporated the suggestions and updated a new patch. PFA patch v9.

Thanks,
Dipesh

Attachment Content-Type Size
v9-0001-mitigate-directory-scan-for-WAL-archiver.patch text/x-patch 10.8 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Matthias van de Meent 2021-08-19 12:47:42 Re: NAMEDATALEN increase because of non-latin languages
Previous Message Andres Freund 2021-08-19 12:20:52 Re: elog.c query_id support vs shutdown