Question about .partial WAL file

From: "Matsumura, Ryo" <matsumura(dot)ryo(at)jp(dot)fujitsu(dot)com>
To: "'pgsql-hackers(at)postgresql(dot)org'" <pgsql-hackers(at)postgresql(dot)org>
Subject: Question about .partial WAL file
Date: 2019-04-11 00:32:21
Message-ID: 03040DFF97E6E54E88D3BFEE5F5480F737AE5C91@G01JPEXMBYT04
Lists: pgsql-hackers

Hi, Hackers

I noticed something strange. Could it cause any problems?
I have not detected any so far, but it makes me uneasy.

Steps:
- Two standbys are connected to the primary.
- Kill the primary and promote one standby.
- Restart the other standby after resetting primary_conninfo so that it connects to the new primary.

I expected the latest WAL segment file on the old timeline to be renamed with the .partial suffix,
but it is not renamed on the restarted standby.

The comment in xlog.c says the following, but I do not understand the bad situation described in the lines marked with -->.

* the archive. It's physically present in the new file with new TLI,
* but recovery won't look there when it's recovering to the older
--> * timeline. On the other hand, if we archive the partial segment, and
--> * the original server on that timeline is still running and archives
--> * the completed version of the same segment later, it will fail. (We
* used to do that in 9.4 and below, and it caused such problems).
*
* As a compromise, we rename the last segment with the .partial
* suffix, and archive it. Archive recovery will never try to read
* .partial segments, so they will normally go unused. But in the odd
* PITR case, the administrator can copy them manually to the pg_wal
* directory (removing the suffix). They can be useful in debugging,
* too.
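
To make the file names in question concrete, here is a minimal Python sketch. It is an illustration only, not code from PostgreSQL: the helper names (wal_file_name, partial_name, restore_partial) are hypothetical, and the default 16 MB wal_segment_size is assumed. It builds a segment name from a timeline ID and segment number, derives the .partial name, and performs the manual copy into pg_wal with the suffix removed that the comment mentions.

```python
import os
import shutil

# Assumption: the default wal_segment_size of 16 MB.
WAL_SEGMENT_SIZE = 16 * 1024 * 1024


def wal_file_name(tli, segno, seg_size=WAL_SEGMENT_SIZE):
    """Return the 24-hex-digit segment file name: timeline ID,
    then the high and low parts of the segment number."""
    segs_per_id = 0x100000000 // seg_size
    return "%08X%08X%08X" % (tli, segno // segs_per_id, segno % segs_per_id)


def partial_name(segment_name):
    """Name of the last old-timeline segment after it is renamed."""
    return segment_name + ".partial"


def restore_partial(partial_path, pg_wal_dir):
    """Copy a .partial segment into pg_wal_dir under its plain name
    (suffix removed), as an administrator might do by hand."""
    base = os.path.basename(partial_path)
    if not base.endswith(".partial"):
        raise ValueError("not a .partial segment: " + base)
    target = os.path.join(pg_wal_dir, base[: -len(".partial")])
    shutil.copyfile(partial_path, target)
    return target
```

For example, under these assumptions partial_name(wal_file_name(1, 3)) gives 000000010000000000000003.partial, which is the kind of file I expected to find on the restarted standby.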

Regards
Ryo Matsumura
