|From:||Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>|
|To:||Michael Paquier <michael(at)paquier(dot)xyz>|
|Cc:||TipTop Labs <office(at)tiptop-labs(dot)com>, Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, PostgreSQL mailing lists <pgsql-bugs(at)postgresql(dot)org>|
|Subject:||Re: BUG #14999: pg_rewind corrupts control file global/pg_control|
Michael Paquier <michael(at)paquier(dot)xyz> writes:
> So after that I fell back to your patch and began testing it, which is
> where I noticed that we can *never* guarantee recovery of a data
> folder on which an error has happened in the middle of a pg_rewind. The
> reason for that is quite simple: even if the truncation has been moved
> down to the moment where the first chunk of a file is received, you may
> have already done work on some relation files. Particularly, some of
> them may have been truncated down to a given size without a new range of
> blocks fetched from the source. So the data folder would be in an
> inconsistent state if trying to rewind it again.

Yes, we certainly cannot guarantee that failure partway through pg_rewind
leaves a consistent state of the target data directory. It is likely
worth pointing that out in the documentation. Whether we can or should
do anything about it is a different question.

When I first started looking at this thread, I wondered if maybe somebody
had had in mind to create an active defense against starting a postmaster
in an inconsistent target cluster, by dint of intentionally truncating
pg_control before the transfer starts and not making it valid again till
the very end. It's now clear from looking at the code that that's not
what's going on :-(. But I wonder how hard it would be to make it so,
and whether that'd be worth doing if it's not too hard.

Actually, probably a safer way to attack that would be to remove or
rename the topmost PG_VERSION file, and then put it back afterwards.
That'd be far easier to recover from manually, if need be, than
a truncated pg_control.

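
To make the PG_VERSION idea concrete, here is a minimal sketch in Python. It is not what pg_rewind does today, and none of these names exist in the PostgreSQL source: `version_file_guard` and the `.rewind-in-progress` suffix are made up for illustration. It assumes only that a data directory lacking its topmost PG_VERSION will not be accepted at server start. The file is renamed before the work begins and renamed back only if the work completes.

```python
import os
from contextlib import contextmanager

# Hypothetical suffix, chosen for this sketch; pg_rewind uses no such name.
GUARD_SUFFIX = ".rewind-in-progress"


@contextmanager
def version_file_guard(datadir):
    """Hide PG_VERSION while the body runs; restore it only on success."""
    version_path = os.path.join(datadir, "PG_VERSION")
    guard_path = version_path + GUARD_SUFFIX

    # A leftover guard file means an earlier run failed partway through.
    if os.path.exists(guard_path):
        raise RuntimeError("unfinished earlier run: " + guard_path)

    os.rename(version_path, guard_path)

    # Deliberately no try/finally: if the body raises, the exception
    # propagates from this yield and the rename below never happens,
    # so the directory stays marked as unusable.
    yield guard_path

    os.rename(guard_path, version_path)
```

Usage would be `with version_file_guard(datadir): ...` around the file transfer. Manual recovery after a failure is a single rename back, which is the attraction of this approach. A real implementation would also need to fsync the directory after each rename, which this sketch omits.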
In any case, that seems separate from the question of what to do with
read-only files in the data directory. Should we push forward with
committing Michael's previous patch, and leave that issue for later?

regards, tom lane