Re: txid failed epoch increment, again, aka 6291

From: Daniel Farina <daniel(at)heroku(dot)com>
To: Noah Misch <noah(at)leadboat(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, uncleandyv(at)gmail(dot)com
Subject: Re: txid failed epoch increment, again, aka 6291
Date: 2012-09-07 18:47:11
Message-ID: CAAZKuFaBBLVPEWntfKZDEDEO5DKY5U2qj2C7JxS-BvhWj=VNkg@mail.gmail.com
Lists: pgsql-hackers

On Fri, Sep 7, 2012 at 5:49 AM, Noah Misch <noah(at)leadboat(dot)com> wrote:
> On Fri, Sep 07, 2012 at 01:37:57AM -0700, Daniel Farina wrote:
>> On Thu, Sep 6, 2012 at 3:04 AM, Noah Misch <noah(at)leadboat(dot)com> wrote:
>> > On Tue, Sep 04, 2012 at 09:46:58AM -0700, Daniel Farina wrote:
>> >> I might try to find the segments leading up to the overflow point and
>> >> try xlogdumping them to see what we can see.
>> >
>> > That would be helpful to see.
>> >
>> > Just to grasp at yet-flimsier straws, could you post (URL preferred, else
>> > private mail) the output of "objdump -dS" on your "postgres" executable?
>>
>> https://dl.dropbox.com/s/444ktxbrimaguxu/txid-wrap-objdump-dS-postgres.txt.gz
>
> Thanks. Nothing looks amiss there.
>
> I've attached the test harness I used to try reproducing this. It worked
> through over 500 epoch increments without a hitch; clearly, it fails to
> reproduce an essential aspect of your system. Could you attempt to modify it
> in the direction of better-resembling your production workload until it
> reproduces the problem?

Sure, I can mess around with it in our exact environment as well
(compilers, Xen, et al.). We have not seen consistent reproduction
either -- most epochs seem to fail to increment (sample size: small,
but more than three), but the epoch has definitely incremented at
least once.
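
For readers following along, here is a toy model of the symptom, not
PostgreSQL's actual code. It assumes the 64-bit txid is the epoch in the
upper 32 bits and the 32-bit xid in the lower 32 bits; the names
make_txid, split_txid and advance are made up for illustration. It shows
why a missed epoch increment at xid wraparound makes the reported txid
jump backwards instead of forwards.

```python
# Toy model only: assumes txid = (epoch << 32) | xid, with hypothetical
# helper names. This is not how the server is implemented internally.
XID_BITS = 32
XID_MASK = (1 << XID_BITS) - 1


def make_txid(epoch: int, xid: int) -> int:
    """Combine an epoch counter and a 32-bit xid into one 64-bit value."""
    return (epoch << XID_BITS) | (xid & XID_MASK)


def split_txid(txid: int) -> tuple[int, int]:
    """Return (epoch, xid) for a 64-bit txid."""
    return txid >> XID_BITS, txid & XID_MASK


def advance(epoch: int, xid: int, next_xid: int, bump_epoch: bool = True) -> tuple[int, int]:
    """Move to next_xid. If the 32-bit counter wrapped (next_xid < xid),
    the epoch should go up by one; bump_epoch=False simulates the bug."""
    wrapped = next_xid < xid
    if wrapped and bump_epoch:
        epoch += 1
    return epoch, next_xid


before = make_txid(1, 0xFFFFFFF0)
good = make_txid(*advance(1, 0xFFFFFFF0, 0x10))
bad = make_txid(*advance(1, 0xFFFFFFF0, 0x10, bump_epoch=False))

print(good > before)  # healthy wraparound: txid keeps increasing
print(bad < before)   # missed increment: txid appears to go backwards
```

Under these assumptions, anything that relies on txids being monotonic
would see values that look roughly 2^32 too small after a wraparound
with a missed epoch increment.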

I wonder if we can rope in this guy, whose report is the only other
one I've seen of this:

http://lists.pgfoundry.org/pipermail/skytools-users/2012-March/001601.html

So I'm CCing him.

He seems to have reproduced it on 9.1, but I didn't see his operating
system information in my very brief skim of that thread.

--
fdr
