From: | Michael Paquier <michael(dot)paquier(at)gmail(dot)com> |
---|---|
To: | Andres Freund <andres(at)anarazel(dot)de> |
Cc: | Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: standby server crashes hard on out-of-disk-space in HEAD |
Date: | 2017-08-29 06:34:24 |
Message-ID: | CAB7nPqQKFvRY4A9jhmKGsmWgR7qF-_kEBfbeRpPBypwOCuy+kw@mail.gmail.com |
Lists: | pgsql-hackers |
On Mon, Aug 28, 2017 at 9:06 PM, Michael Paquier
<michael(dot)paquier(at)gmail(dot)com> wrote:
> On Tue, Jun 13, 2017 at 4:21 AM, Andres Freund <andres(at)anarazel(dot)de> wrote:
>> On 2017-06-12 15:12:23 -0400, Robert Haas wrote:
>>> Commit 4b4b680c3d6d8485155d4d4bf0a92d3a874b7a65 (Make backend local
>>> tracking of buffer pins memory efficient., vintage 2014) seems like a
>>> likely culprit here, but I haven't tested.
>>
>> I'm not that sure. As written above, the Assert isn't new, and given
>> this hasn't been reported before, I'm a bit doubtful that it's a general
>> refcount tracking bug. The FPI code has been whacked around more
>> heavily, so it could well be a bug in it somewhere.
>
> Someone doing a bisect could just use a VM that puts the standby on
> a tiny partition. I remember seeing this assertion failure some time
> ago on a test deployment, and that was really surprising. I think that
> this may be hiding something, so we should really try to investigate
> more what's wrong here.
I have been experimenting with various builds and the attached script,
which triggers out-of-space errors on a standby (adapt it to your
environment). While looking for a good commit, I found that this
problem is older than the 2014 vintage: even at the merge-base of
REL9_4_STABLE and REL9_3_STABLE, the assertion failure is still the
same.
The failure is older than even 9.2. For example, testing at the
merge-base of 9.2 and 9.3 gives:
CONTEXT: xlog redo insert(init): rel 1663/16384/16385; tid 181441/1
TRAP: FailedAssertion("!(PrivateRefCount[i] == 0)", File: "bufmgr.c", Line: 1788)
Note, however, that this assertion was changed in dcafdbcd.
--
Michael
Attachment | Content-Type | Size |
---|---|---|
crash_standby.bash | application/octet-stream | 2.5 KB |