| From: | Mats Kindahl <mats(dot)kindahl(at)gmail(dot)com> |
|---|---|
| To: | Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, japinli(at)hotmail(dot)com |
| Cc: | suryapoondla4(at)gmail(dot)com, pgsql-hackers(at)lists(dot)postgresql(dot)org |
| Subject: | Re: pg_rewind does not rewind diverging timelines |
| Date: | 2026-06-02 02:13:45 |
| Message-ID: | deb64462-9259-4e1b-af96-8b4678b773a4@gmail.com |
| Lists: | pgsql-hackers |
On 6/1/26 08:30, Kyotaro Horiguchi wrote:
> Sorry, I only just noticed this thread.
>
> I may be missing something, but UUID feels somewhat heavyweight to me
> for this problem.
>
> I wonder whether strengthening the history-based matching would be
> sufficient instead. If timelines with the same TLI but different
> histories can be treated as distinct and pg_rewind continues walking
> the history chain until it finds a common ancestor, that seems like a
> fairly natural fit with the existing timeline model.
> UUIDs would certainly make identification straightforward, although
> they would also introduce longer identifiers that are a bit less
> convenient for humans to work with. My initial thought is that it may
> be worth exploring how far we can get with the existing history
> information before introducing a new identifier.
It is a good idea, but unfortunately there are positions in the timeline
history that have the same TLI and the same LSN but still belong to
different timelines, because they originate from different promotions.
Just to summarize the situation: the timeline history file contains a
TLI (which is a number), and a switchpoint (which is an LSN). Each time
pg_promote is called, a new timeline is created based on the previous
TLI (it is increased by 1) and the LSN at that point. (The actual
history file is written by StartupXLOG, not by pg_promote, but
pg_promote triggers the process by writing a marker file.)
If two servers go through the same sequence, e.g., they start on the same
timeline, each does a promote, and each writes data of the same length but
with different contents (e.g., adds a row to a table, but with different
values), they might end up with the same TLI and the same LSN, but with
different pg_promote calls and different database contents. Hence it is
not possible to distinguish them from the history alone.
LSNs are usually different, so it is not a very likely scenario, but it
is still possible.
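
To make the collision concrete, here is a minimal sketch (in Python, for brevity) of history-based matching. It assumes history file lines of the form `<parentTLI><TAB><switchpoint><TAB><reason>`; the parser is deliberately simplified and the names `HistoryEntry`, `parse_lsn` and `parse_history` are made up for this illustration, they are not PostgreSQL functions. The LSN value is arbitrary.

```python
from typing import List, NamedTuple


class HistoryEntry(NamedTuple):
    parent_tli: int  # timeline that was left at the switchpoint
    switch_lsn: int  # switchpoint as a 64-bit LSN


def parse_lsn(text: str) -> int:
    """Convert the textual 'X/X' hexadecimal form to a 64-bit integer."""
    hi, lo = text.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def parse_history(content: str) -> List[HistoryEntry]:
    """Parse history lines, skipping blank lines and '#' comments."""
    entries = []
    for line in content.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        entries.append(HistoryEntry(int(fields[0]), parse_lsn(fields[1])))
    return entries


# Two servers that were promoted independently from timeline 1 and
# happened to switch at the same LSN (illustrative value).
history_a = "1\t0/3000158\tno recovery target specified\n"
history_b = "1\t0/3000158\tno recovery target specified\n"

# The parsed histories compare equal although the promotions differ.
same = parse_history(history_a) == parse_history(history_b)
```

With only (TLI, LSN) pairs to compare, the two histories are indistinguishable, so walking the chain further back cannot reveal that the servers diverged at this entry.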
The UUID is only generated and written when pg_promote is called, which
does not happen very often, so the cost to the server and to replication
is incurred only rarely. Note that the UUID is _not_ in the EOR
(EndOfRecovery) record, only in the timeline history file.
Best wishes,
Mats Kindahl