Re: archive status ".ready" files may be created too early

From: "Bossart, Nathan" <bossartn(at)amazon(dot)com>
To: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
Cc: "masao(dot)fujii(at)gmail(dot)com" <masao(dot)fujii(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: archive status ".ready" files may be created too early
Date: 2020-03-26 18:50:24
Message-ID: 45E14987-0E25-4E5D-BDA5-B94EDA28A778@amazon.com
Lists: pgsql-hackers

Sorry for the long delay.

I've finally arrived at a new approach that I think is promising. My
previous attempts to fix this within XLogWrite() or the associated
code paths all seemed either to miss corner cases or to add far too
much complexity. The attached proof-of-concept patch takes a very
different approach. Instead of trying to adjust the ready-for-archive
logic in the XLogWrite() code paths, I propose relocating that logic
to a separate process.

The v3 patch moves the ready-for-archive logic to the WAL writer
process. We mark files as ready-for-archive when the WAL flush
pointer has advanced beyond a known WAL record boundary. In this
patch, I am using the WAL insert location as the known WAL record
boundary. The main idea is that it should be safe to archive a
segment once we know that the last WAL record for that segment, which
may overflow into the following segment, has been completely written
to disk.
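
To make the idea concrete, here is a minimal standalone sketch of the check. It is not code from the patch: WalPos, SEG_SIZE, ArchiveTracker, seg_of() and tracker_advance() are all hypothetical names, the 16 MB segment size is assumed, and I am assuming that a flush pointer equal to the sampled boundary counts as having reached it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins; none of these names come from the patch. */
typedef uint64_t WalPos;                        /* byte position in the WAL stream */
#define SEG_SIZE ((WalPos) 16 * 1024 * 1024)    /* assumed 16 MB segments */

typedef struct ArchiveTracker
{
    WalPos  boundary;        /* sampled insert location, a known record boundary */
    bool    has_boundary;    /* false until a sample has been taken */
    int64_t last_ready_seg;  /* highest segment marked ready, -1 if none */
} ArchiveTracker;

/* Number of the segment that contains the given position. */
static int64_t
seg_of(WalPos pos)
{
    return (int64_t) (pos / SEG_SIZE);
}

/*
 * Called periodically (e.g. from a background process) with the current
 * insert and flush positions.  Returns the highest segment number that may
 * be marked ready-for-archive, or -1 if there is none yet.
 */
static int64_t
tracker_advance(ArchiveTracker *t, WalPos insert_pos, WalPos flush_pos)
{
    /* The flush pointer can never be ahead of the insert pointer. */
    assert(flush_pos <= insert_pos);

    if (t->has_boundary && flush_pos >= t->boundary)
    {
        /*
         * Everything up to the sampled record boundary is on disk, so every
         * segment that ends at or before that boundary holds only complete,
         * flushed records, including a record that spilled into the
         * following segment.
         */
        int64_t candidate = seg_of(t->boundary) - 1;

        if (candidate > t->last_ready_seg)
            t->last_ready_seg = candidate;
        t->has_boundary = false;
    }

    /* Take a new sample only after the previous one has been consumed. */
    if (!t->has_boundary)
    {
        t->boundary = insert_pos;
        t->has_boundary = true;
    }

    return t->last_ready_seg;
}
```

In this sketch, if the boundary is sampled 100 bytes into the second segment and the flush pointer is only 50 bytes into it, the first segment is not yet reported as ready even though every byte of it is on disk. It becomes ready once the flush pointer reaches the sampled boundary.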

There are many things missing from this proof-of-concept patch that
will need to be handled if this approach seems reasonable. For
example, I haven't looked into any adjustments needed for the
archive_timeout parameter, I haven't added a way to persist the
"latest segment marked ready-for-archive" through crashes, I haven't
tried reducing the frequency of retrieving the WAL locations, and I'm
not sure the WAL writer process is even the right location for this
logic. However, these remaining problems all seem tractable to me.

I would appreciate your feedback on whether you believe this approach
is worth pursuing.

Nathan

Attachment Content-Type Size
v3-0001-Avoid-marking-WAL-segments-as-ready-for-archive-t.patch application/octet-stream 3.2 KB
