Re: [GENERAL] 9.4.1 -> 9.4.2 problem: could not access status of transaction 1

From: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Noah Misch <noah(at)leadboat(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>, Steve Kehlet <steve(dot)kehlet(at)gmail(dot)com>, Forums postgresql <pgsql-general(at)postgresql(dot)org>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [GENERAL] 9.4.1 -> 9.4.2 problem: could not access status of transaction 1
Date: 2015-06-05 10:51:53
Message-ID: CAEepm=37r3J0ZCGfOHPGF+qyp08d+LPv_4bnTG1sgnHR8RqHvw@mail.gmail.com
Lists: pgsql-general pgsql-hackers

On Fri, Jun 5, 2015 at 1:47 PM, Thomas Munro
<thomas(dot)munro(at)enterprisedb(dot)com> wrote:
> On Fri, Jun 5, 2015 at 11:47 AM, Thomas Munro
> <thomas(dot)munro(at)enterprisedb(dot)com> wrote:
>> On Fri, Jun 5, 2015 at 9:29 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>>> Here's a new version with some more fixes and improvements:
>>> [...]
>>
>> With this patch, when I run the script
>> "checkpoint-segment-boundary.sh" from
>> http://www.postgresql.org/message-id/CAEepm=1_KbHGbmPVmkUGE5qTP+B4efoCJYS0unGo-Mc5NV=UDg@mail.gmail.com
>> I see the following during shutdown checkpoint:
>>
>> LOG: could not truncate directory "pg_multixact/offsets": apparent wraparound
>>
>> That message comes from SimpleLruTruncate.
>
> Suggested patch attached.

Is it a problem that we don't drop/forget page buffers from the
members SLRU (unlike SimpleLruTruncate, which is used for the offsets
SLRU)?

I may be missing something, but it seems to me that it isn't, because
(1) CheckPointMultiXact is called to flush any dirty pages to disk
before TruncateMultiXact is called, and (2) no pages older than the one
holding the oldest offset should be dirtied after CheckPointMultiXact
runs (member space is 'append only', at least until it is recycled).
So any pages still in the SLRU whose underlying file has been truncated
are clean and should just naturally fall out of the LRU slots; they
can't create problems by being written to disk after the unlink.
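
To make the argument concrete, here is a toy model in Python. It is only a
sketch and not PostgreSQL code: the class ToySlru, its methods, and the
simulate() helper are all hypothetical names, and the eviction policy is a
plain LRU over a handful of slots. It assumes truncation removes files but
leaves the page buffers alone, as described for the members SLRU, and it
records any write that reaches a page whose file is already gone.

```python
class ToySlru:
    """Toy model of an SLRU: a few page slots in front of on-disk pages.

    Hypothetical illustration only; it does not mirror slru.c.
    """

    def __init__(self, num_slots):
        self.num_slots = num_slots
        self.slots = {}        # page -> dirty flag, oldest entry first
        self.disk = set()      # pages that currently exist on disk
        self.cutoff = 0        # pages below this have been truncated away
        self.late_writes = []  # writes to pages whose file was removed

    def _write_if_dirty(self, page):
        if self.slots[page]:
            if page < self.cutoff:
                self.late_writes.append(page)
            self.disk.add(page)
            self.slots[page] = False

    def touch(self, page, dirty):
        """Access a page, evicting the least recently used slot if full."""
        if page in self.slots:
            dirty = dirty or self.slots.pop(page)
        elif len(self.slots) >= self.num_slots:
            victim = next(iter(self.slots))
            self._write_if_dirty(victim)
            del self.slots[victim]
        self.slots[page] = dirty

    def checkpoint(self):
        """Flush every dirty page (the role CheckPointMultiXact plays)."""
        for page in list(self.slots):
            self._write_if_dirty(page)

    def truncate_before(self, cutoff):
        """Remove old pages from disk but do NOT forget their buffers."""
        self.cutoff = cutoff
        self.disk = {p for p in self.disk if p >= cutoff}


def simulate(redirty_old_page=False):
    """Return the list of pages written after their file was removed."""
    slru = ToySlru(num_slots=4)
    for page in range(4):            # append-only writes to pages 0..3
        slru.touch(page, dirty=True)
    slru.checkpoint()                # point (1): flush before truncating
    slru.truncate_before(2)          # oldest offset now lives on page 2
    if redirty_old_page:             # violates point (2) on purpose
        slru.touch(0, dirty=True)
    for page in range(4, 8):         # new appends push old pages out
        slru.touch(page, dirty=True)
    return slru.late_writes
```

With both conditions holding, simulate() reports no late writes: the stale
buffers for pages 0 and 1 are clean when they are evicted. Calling
simulate(redirty_old_page=True) breaks condition (2) and the model then
records a write to page 0 after its file was removed, which is the failure
the argument above rules out.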

--
Thomas Munro
http://www.enterprisedb.com
