Re: [GENERAL] 9.4.1 -> 9.4.2 problem: could not access status of transaction 1

From:	Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
To:	Noah Misch <noah(at)leadboat(dot)com>
Cc:	Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Steve Kehlet <steve(dot)kehlet(at)gmail(dot)com>, Forums postgresql <pgsql-general(at)postgresql(dot)org>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: [GENERAL] 9.4.1 -> 9.4.2 problem: could not access status of transaction 1
Date:	2015-06-03 23:08:40
Message-ID:	CAEepm=1_KbHGbmPVmkUGE5qTP+B4efoCJYS0unGo-Mc5NV=UDg@mail.gmail.com
Lists:	pgsql-general pgsql-hackers

On Mon, Jun 1, 2015 at 4:55 PM, Noah Misch <noah(at)leadboat(dot)com> wrote:
> While testing this (with inconsistent-multixact-fix-master.patch applied,
> FWIW), I noticed a nearby bug with a similar symptom. TruncateMultiXact()
> omits the nextMXact==oldestMXact special case found in each other
> find_multixact_start() caller, so it reads the offset of a not-yet-created
> MultiXactId. The usual outcome is to get rangeStart==0, so we truncate less
> than we could. This can't make us truncate excessively, because
> nextMXact==oldestMXact implies no table contains any mxid. If nextMXact
> happens to be the first of a segment, an error is possible. Procedure:
>
> 1. Make a fresh cluster.
> 2. UPDATE pg_database SET datallowconn = true
> 3. Consume precisely 131071 mxids. Number of offsets per mxid is unimportant.
> 4. vacuumdb --freeze --all
>
> Expected state after those steps:
> $ pg_controldata | grep NextMultiXactId
> Latest checkpoint's NextMultiXactId: 131072
>
> Checkpoint will fail like this:
> 26699 2015-05-31 17:22:33.134 GMT LOG: statement: checkpoint
> 26661 2015-05-31 17:22:33.134 GMT DEBUG: performing replication slot checkpoint
> 26661 2015-05-31 17:22:33.136 GMT ERROR: could not access status of transaction 131072
> 26661 2015-05-31 17:22:33.136 GMT DETAIL: Could not open file "pg_multixact/offsets/0002": No such file or directory.
> 26699 2015-05-31 17:22:33.234 GMT ERROR: checkpoint request failed
> 26699 2015-05-31 17:22:33.234 GMT HINT: Consult recent messages in the server log for details.
> 26699 2015-05-31 17:22:33.234 GMT STATEMENT: checkpoint
>
> This does not block startup, and creating one mxid hides the problem again.
> Thus, it is not a top-priority bug like some other parts of this thread. I
> mention it today mostly so it doesn't surprise hackers testing other fixes.
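[Editorial sketch] The special case described in the quoted text can be illustrated with a toy model. This is not PostgreSQL code: `lookup_start_offset`, `truncation_point`, and the dictionary standing in for the offsets SLRU are hypothetical names, and falling back to the next offset when no multixact exists is an assumption about what the guard does in the other callers.

```python
def lookup_start_offset(offsets, mxid):
    """Toy stand-in for reading the start offset recorded for mxid.

    `offsets` is a plain dict playing the role of pg_multixact/offsets.
    """
    if mxid not in offsets:
        # Mirrors the symptom in the thread: the entry was never written.
        raise LookupError("could not access status of transaction %d" % mxid)
    return offsets[mxid]


def truncation_point(offsets, oldest_mxact, next_mxact, next_offset):
    """Return the offset below which member data is no longer needed."""
    if oldest_mxact == next_mxact:
        # No multixact has been created at oldest_mxact yet, so there is
        # nothing to look up. Assumption: use the next offset to be assigned.
        return next_offset
    return lookup_start_offset(offsets, oldest_mxact)
```

Without the equality check, the lookup is attempted for an mxid that has not been created, which is the situation the checkpoint error above reports.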

Thanks. As mentioned elsewhere in the thread, I discovered that the
same problem exists for page boundaries, with a different error
message. I've tried the attached repro scripts on 9.3.0, 9.3.5, 9.4.1
and master with the same results:

FATAL: could not access status of transaction 2048
DETAIL: Could not read from file "pg_multixact/offsets/0000" at offset 8192: Undefined error: 0.

FATAL: could not access status of transaction 131072
DETAIL: Could not open file "pg_multixact/offsets/0002": No such file or directory.
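
[Editorial sketch] The two mxids in these messages sit exactly on a page boundary and a segment boundary of the offsets SLRU. The arithmetic can be checked with a few lines of Python, assuming a default build: 8192-byte blocks, 4-byte offset entries, and 32 pages per segment file. The function name `locate_mxid` is made up for illustration.

```python
# Constants assumed from a default PostgreSQL build.
BLCKSZ = 8192            # bytes per SLRU page
OFFSET_SIZE = 4          # bytes per offsets entry (32-bit)
PAGES_PER_SEGMENT = 32   # SLRU pages per segment file

OFFSETS_PER_PAGE = BLCKSZ // OFFSET_SIZE                      # 2048
OFFSETS_PER_SEGMENT = OFFSETS_PER_PAGE * PAGES_PER_SEGMENT    # 65536


def locate_mxid(mxid):
    """Return (segment file name, byte offset of the page, entry index)."""
    page = mxid // OFFSETS_PER_PAGE
    segment = page // PAGES_PER_SEGMENT
    byte_offset = (page % PAGES_PER_SEGMENT) * BLCKSZ
    entry = mxid % OFFSETS_PER_PAGE
    return "%04X" % segment, byte_offset, entry


for mxid in (2048, 131072):
    print(mxid, locate_mxid(mxid))
```

Under those assumptions, mxid 2048 is the first entry of the second page of segment 0000 (byte offset 8192), and mxid 131072 is the first entry of segment 0002, which matches the file names and offset in the two DETAIL lines.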

But, yeah, this isn't the bug we're looking for.

--
Thomas Munro
http://www.enterprisedb.com

Attachment	Content-Type	Size
checkpoint-page-boundary.sh	application/x-sh	672 bytes
checkpoint-segment-boundary.sh	application/x-sh	674 bytes

In response to

Re: [GENERAL] 9.4.1 -> 9.4.2 problem: could not access status of transaction 1 at 2015-06-01 04:55:34 from Noah Misch

Responses

Re: [GENERAL] 9.4.1 -> 9.4.2 problem: could not access status of transaction 1 at 2015-06-16 18:58:44 from Alvaro Herrera
