Re: .ready and .done files considered harmful

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: .ready and .done files considered harmful
Date: 2021-05-04 16:42:04
Message-ID: CA+TgmoZHBb-TmwYtAzoPYu2FW_NWyLOnABNz_U-=59wTJuz4ig@mail.gmail.com
Lists: pgsql-hackers

On Tue, May 4, 2021 at 11:54 AM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
> I agree that if we continue to archive one file using the archive
> command then Robert's solution of checking the existence of the next
> WAL segment (N+1) has an advantage. But currently, if you notice,
> pgarch_readyXlog always considers any history file to be the oldest file,
> but that will not be true if we try to predict the next WAL segment
> name. For example, if we have archived 000000010000000000000004 then
> next we will look for 000000010000000000000005 but after generating
> segment 000000010000000000000005, if there is a timeline switch then
> we will have the below files in the archive status
> (000000010000000000000005.ready, 00000002.history file). Now, the
> existing archiver will archive 00000002.history first whereas our code
> will archive 000000010000000000000005 first. That said, I don't see
> any problem with that because before archiving any segment file from
> TL 2 we will definitely archive the 00000002.history file: we
> will not find 000000010000000000000006.ready, so we will scan the
> full directory and find 00000002.history as the oldest file.

OK, that makes sense and is good to know.
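
For readers following along, the ordering described above can be sketched as a small Python model. This is not PostgreSQL code: next_segment_name, full_scan and archive_order are hypothetical names, the 16 MB segment size is the assumed default, the scan is a simplified stand-in for pgarch_readyXlog (history files first, then the lowest name), and the timeline-2 segment name in the usage example is made up for illustration. File names are given without the .ready suffix.

```python
WAL_SEGMENT_SIZE = 16 * 1024 * 1024  # assumption: default 16 MB segments


def next_segment_name(name, seg_size=WAL_SEGMENT_SIZE):
    """Return the WAL file name that follows `name` on the same timeline."""
    segs_per_id = 0x100000000 // seg_size
    tli = int(name[0:8], 16)
    # Combine the two trailing 8-digit fields into one segment number, then add 1.
    segno = int(name[8:16], 16) * segs_per_id + int(name[16:24], 16) + 1
    return "%08X%08X%08X" % (tli, segno // segs_per_id, segno % segs_per_id)


def full_scan(ready):
    """Simplified stand-in for the directory scan: history files sort first."""
    if not ready:
        return None
    return min(ready, key=lambda f: (not f.endswith(".history"), f))


def archive_order(last_archived, ready):
    """Archive every file in `ready`, trying the predicted segment before scanning."""
    ready = set(ready)
    order = []
    current = last_archived
    while ready:
        # Only a 24-character WAL segment name can be used to predict a successor.
        guess = next_segment_name(current) if len(current) == 24 else None
        target = guess if guess in ready else full_scan(ready)
        ready.remove(target)
        order.append(target)
        current = target
    return order


order = archive_order(
    "000000010000000000000004",
    [
        "000000010000000000000005",
        "00000002.history",
        "000000020000000000000005",  # hypothetical first segment on TL 2
    ],
)
print(order)
```

In this model the timeline-1 segment is archived before the history file, but the history file still goes out before any timeline-2 segment, because the prediction for segment 6 on timeline 1 fails and the fallback scan picks the history file.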

> > > > However, that's still pretty wasteful. Every time we have to wait for
> > > > the next file to be ready for archiving, we'll basically fall back to
> > > > repeatedly scanning the whole directory, waiting for it to show up.
>
> Is this true? That only when we have to wait for the next file to be
> ready do we go for scanning? If I read the code in
> "pgarch_ArchiverCopyLoop", for every single file to archive it
> calls "pgarch_readyXlog", which scans the directory every time.
> So I did not understand your point that only when it needs to wait for
> the next .ready file does it need to scan the full directory. It appears
> to always scan the full directory after archiving each WAL segment.
> What am I missing?

It's not true now, but my proposal would make it true.
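
The difference between the two behaviours can be illustrated with a toy count of directory scans. This is only a sketch under simplifying assumptions: segments are plain integers, count_scans is a hypothetical helper, and no new files become ready while the loop runs.

```python
def count_scans(ready_segments, predict):
    """Count full directory scans needed to archive all ready segments.

    With predict=False every file is found by a scan (the behaviour described
    in the quoted text). With predict=True the successor of the last archived
    segment is tried first, and a scan happens only when that guess fails.
    """
    ready = set(ready_segments)
    scans = 0
    last = None
    while True:
        if predict and last is not None and last + 1 in ready:
            nxt = last + 1          # predicted segment is ready: no scan needed
        else:
            scans += 1              # fall back to scanning the whole directory
            if not ready:
                break               # scan found nothing left to archive
            nxt = min(ready)        # oldest ready segment
        ready.remove(nxt)
        last = nxt
    return scans


backlog = range(5, 105)  # 100 consecutive ready segments
print(count_scans(backlog, predict=False))
print(count_scans(backlog, predict=True))
```

In this model a backlog of 100 consecutive segments costs one scan per file plus a final empty scan without prediction, and only an initial scan plus a final empty scan with it. Each gap in the sequence adds one more scan in the predicted case.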

--
Robert Haas
EDB: http://www.enterprisedb.com
