Re: Sketch of a fix for that truncation data corruption issue

From:	Andres Freund <andres(at)anarazel(dot)de>
To:	Robert Haas <robertmhaas(at)gmail(dot)com>
Cc:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject:	Re: Sketch of a fix for that truncation data corruption issue
Date:	2018-12-12 01:54:15
Message-ID:	20181212015415.5pphghl3buuz2hob@alap3.anarazel.de
Lists:	pgsql-hackers

Hi,

On 2018-12-12 10:49:59 +0900, Robert Haas wrote:
> Just thinking about this a bit, the problem with truncating first and
> then writing the WAL record is that if the WAL record never makes it
> to disk, any physical standbys will end up out of sync with the
> master, leading to disaster. But the problem with writing the WAL
> record first is that the actual operation might fail, and then
> standbys will end up out of sync with the master, leading to disaster.
> The obvious way to finesse that latter problem is just PANIC if
> ftruncate() fails -- then we'll crash restart and retry, and if we
> still can't do it, well, the DBA will have to fix that before the
> system can come on line. I'm not sure that's really all that bad --
> if we can't truncate, we're kinda hosed. How, other than a
> permissions problem, does that even happen?

I think it's correct to PANIC in that situation. As you say, it's really
unlikely for that to happen in normal circumstances (as long as we
handle obvious stuff like EINTR), and any added complexity to avoid it
seems very unlikely to be tested.
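
To make that concrete, here is a minimal sketch of "retry on EINTR,
otherwise give up hard". It is not PostgreSQL's actual code: the helper
names ftruncate_retry() and truncate_or_panic() are hypothetical, and
abort() merely stands in for a real PANIC followed by crash restart.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: call ftruncate(), retrying for as long as the call
 * is interrupted by a signal (EINTR). Returns 0 on success, or -1 with
 * errno set for any other failure.
 */
static int
ftruncate_retry(int fd, off_t length)
{
    int         rc;

    do
    {
        rc = ftruncate(fd, length);
    } while (rc < 0 && errno == EINTR);

    return rc;
}

/*
 * Hypothetical helper: truncate or die. abort() is an assumption standing
 * in for a PANIC-level error; the idea is that the process goes down and
 * the truncation is retried after restart instead of being skipped.
 */
static void
truncate_or_panic(int fd, off_t length)
{
    if (ftruncate_retry(fd, length) < 0)
    {
        fprintf(stderr, "PANIC: could not truncate file to %lld bytes: %s\n",
                (long long) length, strerror(errno));
        abort();
    }
}
```

The point of keeping it this small is that there is no recovery branch
beyond the EINTR loop, so there is no rarely-exercised error path left
to go untested.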

Greetings,

Andres Freund

In response to

Re: Sketch of a fix for that truncation data corruption issue at 2018-12-12 01:49:59 from Robert Haas