Re: [bug fix] Cascaded standby cannot start after a clean shutdown

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: "Tsunakawa, Takayuki" <tsunakawa(dot)takay(at)jp(dot)fujitsu(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [bug fix] Cascaded standby cannot start after a clean shutdown
Date: 2018-02-23 02:26:31
Message-ID: 20180223022631.GA15131@paquier.xyz
Lists: pgsql-hackers

On Thu, Feb 22, 2018 at 04:55:38PM +0900, Michael Paquier wrote:
> I am definitely ready to buy that it can be possible to have garbage
> being read in the length field, which can cause allocate_recordbuf to
> fail, as that's the only code path in xlogreader.c which does such an
> allocation. Still, it seems to me that we should first try to see if
> there are strange allocation patterns that happen and see if it is
> possible to have a reproducible test case or a pattern which gives us
> confidence that we are on the right track. One idea I have is to
> monitor those allocations like the following:
> --- a/src/backend/access/transam/xlogreader.c
> +++ b/src/backend/access/transam/xlogreader.c
> @@ -162,6 +162,10 @@ allocate_recordbuf(XLogReaderState *state, uint32 reclength)
> newSize += XLOG_BLCKSZ - (newSize % XLOG_BLCKSZ);
> newSize = Max(newSize, 5 * Max(BLCKSZ, XLOG_BLCKSZ));
>
> +#ifndef FRONTEND
> + elog(LOG, "Allocation for xlogreader increased to %u", newSize);
> +#endif

So, I have been playing a bit more with that and have defined the
following strategy to see if it is possible to create inconsistencies:
- Use a primary and a standby.
- Set max_wal_size and min_wal_size to a minimum of 80MB so that
segment recycling takes effect more quickly.
- Create a single table with a UUID column to increase the likelihood of
random data in INSERT records and FPWs, and insert enough data to
trigger a full WAL recycling.
- Every 5 seconds, insert a set of tuples into the table; 110 to 120
tuples generate enough data for a bit more than a full WAL page (a
sketch of this workload is below).
Then restart the primary.  This normally causes the standby to catch up
with a streamed page which is not completely initialized, as it fetches
the page mid-way through being filled.
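
For reference, here is a minimal sketch of that insert workload using
libpq (the connection string, the table name uuid_load, and the use of
pgcrypto's gen_random_uuid() are assumptions of mine; a psql loop would
do the job just as well):

#include <stdio.h>
#include <unistd.h>
#include <libpq-fe.h>

int
main(void)
{
	/* hypothetical connection string, adjust for the test primary */
	PGconn	   *conn = PQconnectdb("host=localhost port=5432 dbname=postgres");
	PGresult   *res;

	if (PQstatus(conn) != CONNECTION_OK)
	{
		fprintf(stderr, "connection failed: %s", PQerrorMessage(conn));
		return 1;
	}

	res = PQexec(conn, "CREATE EXTENSION IF NOT EXISTS pgcrypto");
	PQclear(res);
	res = PQexec(conn, "CREATE TABLE IF NOT EXISTS uuid_load (id uuid)");
	PQclear(res);

	/* 1440 batches at 5s intervals, roughly two hours of load */
	for (int i = 0; i < 1440; i++)
	{
		/* ~115 random UUIDs, a bit more than one full WAL page of data */
		res = PQexec(conn,
					 "INSERT INTO uuid_load "
					 "SELECT gen_random_uuid() FROM generate_series(1, 115)");
		if (PQresultStatus(res) != PGRES_COMMAND_OK)
			fprintf(stderr, "insert failed: %s", PQerrorMessage(conn));
		PQclear(res);
		sleep(5);
	}

	PQfinish(conn);
	return 0;
}

The primary restart itself is done separately with pg_ctl while the
loop keeps running.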

With the monitoring mentioned in the quoted block above, I have let the
whole thing run for a couple of hours, but I have not been able to catch
any problems, except the usual "invalid record length at 0/XXX: wanted
24, got 0".  The allocation for recordbuf never got higher than 40960
bytes either, which matches 5 WAL pages.
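
As a sanity check of that number, here is the sizing arithmetic from the
code path quoted above, extracted into a standalone snippet (the rest of
allocate_recordbuf is omitted): with BLCKSZ and XLOG_BLCKSZ at their
default of 8192 bytes, any small record length rounds up to the
40960-byte floor, which is exactly what the monitoring showed.

#include <stdio.h>

#define BLCKSZ		8192	/* default build values assumed */
#define XLOG_BLCKSZ	8192
#define Max(a, b)	((a) > (b) ? (a) : (b))

int
main(void)
{
	unsigned int reclength = 24;	/* e.g. the "wanted 24" record above */
	unsigned int newSize = reclength;

	/* same two sizing steps as in the quoted hunk */
	newSize += XLOG_BLCKSZ - (newSize % XLOG_BLCKSZ);
	newSize = Max(newSize, 5 * Max(BLCKSZ, XLOG_BLCKSZ));

	printf("recordbuf allocation: %u bytes\n", newSize);	/* prints 40960 */
	return 0;
}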

Another, more evil, idea that I have on top of all those things is to
directly hexedit the standby's WAL segment just at the boundary where it
would receive a record from the primary, and insert garbage data there
which would make the validation checks in xlogreader.c blow up at record
allocation time.
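
Roughly, such a manual corruption could look like the snippet below; the
segment name, the byte offset of the record's length field, and the
garbage value are all hypothetical and would need to be worked out from
pg_waldump output before touching anything:

#include <stdio.h>
#include <stdint.h>

int
main(void)
{
	/* hypothetical segment and offset of the record's length field */
	const char *segment = "pg_wal/000000010000000000000003";
	long		offset = 0x125000;
	uint32_t	garbage = 0x7fffffff;	/* absurd length to feed the reader */
	FILE	   *f = fopen(segment, "r+b");

	if (f == NULL)
	{
		perror("fopen");
		return 1;
	}
	if (fseek(f, offset, SEEK_SET) != 0 ||
		fwrite(&garbage, sizeof(garbage), 1, f) != 1)
	{
		perror("write garbage");
		fclose(f);
		return 1;
	}
	fclose(f);
	printf("wrote bogus length at offset 0x%lx in %s\n", offset, segment);
	return 0;
}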
--
Michael
