Re: [HACKERS] Re: 9.4.1 -> 9.4.2 problem: could not access status of transaction 1

From: Andres Freund <andres(at)anarazel(dot)de>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>, Steve Kehlet <steve(dot)kehlet(at)gmail(dot)com>, Forums postgresql <pgsql-general(at)postgresql(dot)org>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] Re: 9.4.1 -> 9.4.2 problem: could not access status of transaction 1
Date: 2015-06-02 15:36:56
Message-ID: 20150602153656.GR30287@alap3.anarazel.de
Lists: pgsql-general pgsql-hackers

On 2015-06-02 11:29:24 -0400, Robert Haas wrote:
> On Tue, Jun 2, 2015 at 8:56 AM, Andres Freund <andres(at)anarazel(dot)de> wrote:
> > But what *definitely* looks wrong to me is that a TruncateMultiXact() in
> > this scenario now (since a couple weeks ago) does a
> > SimpleLruReadPage_ReadOnly() in the members slru via
> > find_multixact_start(). That just won't work acceptably when we're not
> > yet consistent. There very well could not be a valid members segment at
> > that point? Am I missing something?
>
> Yes: that code isn't new.

Good point.

> TruncateMultiXact() called SimpleLruReadPage_ReadOnly() directly in
> 9.3.0 and every subsequent release until 9.3.7/9.4.2.

But back then TruncateMultiXact() wasn't called during recovery. You're
right, though, that it doesn't seem to have produced attributable bug
reports since 9.3.2, where vacuuming during recovery was introduced. So
it indeed doesn't seem as urgent as fixing the new callsites.

> That would be a departure from the behavior of every existing release
> that includes this code based on, to my knowledge, zero trouble
> reports.

On the other hand, we're now at about bug #5 attributable to the odd way
truncation works for multixacts. It's obviously complex and hard to get
right, and it makes it harder to cope with the wrong values left in
datminxid etc. So I'm still wondering whether fixing this for good isn't
the better approach.

Greetings,

Andres Freund
