RE: WAL archive is lost

From: "matsumura(dot)ryo(at)fujitsu(dot)com" <matsumura(dot)ryo(at)fujitsu(dot)com>
To: Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Cc: 'Tomas Vondra' <tomas(dot)vondra(at)2ndquadrant(dot)com>, 'Jeff Janes' <jeff(dot)janes(at)gmail(dot)com>
Subject: RE: WAL archive is lost
Date: 2019-11-29 01:44:39
Message-ID: OSAPR01MB5027DB8A8C66313E27FFBCD3E8460@OSAPR01MB5027.jpnprd01.prod.outlook.com
Lists: pgsql-hackers

Tomas-san and Jeff-san

I'm very sorry for my slow response.

Tomas-san wrote:
> That does indeed seem like a bug. We should certainly archive all WAL
> segments, irrespectedly of primary shutdowns/restarts/whatever.

I think so, too.

Tomas-san wrote:
> I guess we should make sure the archiver is properly notified before
> the exit.

Just an idea.
If walrcv_receive (libpqrcv_receive) returned an error value when a
socket error occurs, instead of reporting ERROR, walreceiver could take
the end-of-WAL route, which calls XLogArchiveNotify() at the end of
walreceiver's outer loop.

593 XLogArchiveNotify(xlogfname);
594 }
595 recvFile = -1;
596
597 elog(DEBUG1, "walreceiver ended streaming and awaits new instructions");
598 Wal
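
To make the idea concrete, here is a toy sketch in C. It is not PostgreSQL code: ToyConn, ToyState, toy_receive() and toy_stream_once() are hypothetical stand-ins, and the single "notified" flag stands for whichever of XLogArchiveNotify() or XLogArchiveForceDone() would apply. It only shows that when the receive call reports a socket error as a return value, control falls through to the cleanup at the end of the outer loop instead of leaving the process.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; none of these are PostgreSQL APIs. */
typedef struct {
    const int *events;   /* >0: bytes received, 0: nothing yet, <0: socket error */
    size_t     n;        /* number of scripted events */
    size_t     pos;      /* next event to deliver */
} ToyConn;

typedef struct {
    bool file_open;      /* models recvFile >= 0 */
    bool notified;       /* models the archiver having been told about the segment */
} ToyState;

/* Models a receive call that reports a socket error as a value, not as ERROR. */
static int toy_receive(ToyConn *c)
{
    return (c->pos < c->n) ? c->events[c->pos++] : -1;
}

/* One pass of the outer loop: stream until the connection ends, then clean up. */
static void toy_stream_once(ToyConn *c, ToyState *s)
{
    assert(c != NULL && s != NULL);

    for (;;) {
        int len = toy_receive(c);

        if (len < 0)
            break;                  /* error as a value: fall through to cleanup */
        if (len > 0)
            s->file_open = true;    /* data was written into the current segment */
    }

    /* End-of-WAL route: close the open segment and notify the archiver. */
    if (s->file_open) {
        s->notified = true;
        s->file_open = false;
    }
}
```

With a scripted connection that delivers some data and then a socket error, toy_stream_once() leaves the segment closed and notified. If toy_receive() aborted the process instead, the cleanup after the inner loop would never run.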

Jeff-san wrote:
> Will it not archive 000000010000000000000002 eventually, like at the
> conclusion of the next restartpoint? or does it get recycled/removed
> without ever being archived? Or does it just hang out forever in pg_wal?

000000010000000000000002 hangs out forever in pg_wal.
It will never be archived, recycled, or removed.

I found that even if archive_mode is not set to 'always',
it will never be recycled or removed.

Jeff-san wrote:
> Do you have a trick for reliably achieving this last step?

If possible, stop walsender just after it sends the last record of a
WAL segment file or SWITCH_LOG, and then stop the primary immediately.

There are two patterns that cause this issue.

Pattern 1.
If the primary is shut down immediately after walreceiver has received the
last record of a WAL segment file and is waiting for the next record in
walrcv_receive(), walreceiver exits without calling XLogArchiveNotify() or
XLogArchiveForceDone() in XLogWalRcvWrite(), because walrcv_receive()
reports ERROR.
The startup process then restarts walreceiver and requests that it start
from the top of the next segment file. Walreceiver receives that data and
writes it with XLogWalRcvWrite(), but it does not walk the route to
XLogArchiveNotify() because it has not opened any file (recvFile == -1).
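
Pattern 1 can be sketched with a toy model in C. This is a simplification under stated assumptions, not the real walreceiver: ToyWalReceiver and the toy_* functions are hypothetical names, a segment holds only TOY_SEG_RECORDS records, and "notified" stands for either XLogArchiveNotify() or XLogArchiveForceDone(). The one behaviour it copies from the description above is that the write path notifies the archiver about a segment only when it switches away from a segment that is currently open.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define TOY_SEG_RECORDS 4   /* toy segment size in records; not a PostgreSQL value */
#define TOY_MAX_SEGS    8

/* Hypothetical model of the receiver; names are for illustration only. */
typedef struct {
    int  recv_seg;                  /* open segment, -1 if none (recvFile == -1) */
    bool notified[TOY_MAX_SEGS];    /* true once the archiver was told about it */
} ToyWalReceiver;

static void toy_init(ToyWalReceiver *r)
{
    memset(r->notified, 0, sizeof(r->notified));
    r->recv_seg = -1;
}

/* Models exit on ERROR plus restart: the new process has no open file,
 * and nobody notified the archiver about the segment that was open. */
static void toy_restart_after_error(ToyWalReceiver *r)
{
    r->recv_seg = -1;
}

/* Models the write path: notify only when leaving a segment that is open. */
static void toy_write(ToyWalReceiver *r, int recno)
{
    int seg = recno / TOY_SEG_RECORDS;

    assert(recno >= 0 && seg < TOY_MAX_SEGS);

    if (r->recv_seg != seg) {
        if (r->recv_seg >= 0)
            r->notified[r->recv_seg] = true;   /* previous segment is complete */
        r->recv_seg = seg;                     /* "open" the new segment */
    }
}
```

Writing records 0 to 3 fills segment 0. Calling toy_restart_after_error() and then writing records 4 to 8 marks segment 1 as notified when the model moves on to segment 2, but segment 0 stays unnotified for good, because no open file was present when segment 1 began.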

Pattern 2.
Only the trigger is different.
If the primary is shut down immediately after walreceiver has received
SWITCH_LOG and is waiting for the next record in walrcv_receive(),
walreceiver exits without notifying the archiver.
The startup process will then tell walreceiver to start receiving from
the top of the next segment file.

Regards
Ryo Matsumura
