Re: 9.4 failure on skink in _bt_newroot/XLogCheckBuffer

From: Andres Freund <andres(at)anarazel(dot)de>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: 9.4 failure on skink in _bt_newroot/XLogCheckBuffer
Date: 2016-05-23 00:32:22
Message-ID: 20160523003222.hhzbplll3nfj55ix@alap3.anarazel.de
Lists: pgsql-hackers

Hi Tom,

On 2016-05-21 17:18:14 -0400, Tom Lane wrote:
> Andres Freund <andres(at)anarazel(dot)de> writes:
> > The valgrind animal just reported a large object related failure on 9.4:
>
> The proximate cause seems to be that _bt_newroot isn't bothering to
> fill the buffer_std field here:
>
> /* Make a full-page image of the left child if needed */
> rdata[2].data = NULL;
> rdata[2].len = 0;
> rdata[2].buffer = lbuf;
> rdata[2].next = NULL;
>
> which is indeed an actual bug, but the only consequence would be poor
> compression of the full-page image (if the value chanced to be zero),
> so it's not much of a problem.

Thanks for fixing that one!

> What remains unclear is how come this only fails once in a blue moon.
> Seems like any valgrind run of the regression tests should have caught it.

Looks like a timing issue. The relevant access to the uninitialized
buffer_std field only happens inside the
    if (*lsn <= RedoRecPtr)
    {
branch of XLogCheckBuffer, i.e. when the page hasn't been modified since
the current redo pointer, which presumably is not that likely to be hit.
Even under valgrind the individual tests are likely to finish within a
single checkpoint timeout.
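
To make the timing dependence concrete, here is a rough, simplified model
of that check. It is only a sketch and not the actual xlog.c code: the
names ModelLsn, ModelRData, std_was_read and model_check_buffer are made
up for illustration, and everything except the LSN comparison is left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for XLogRecPtr; not the real PostgreSQL type. */
typedef uint64_t ModelLsn;

/* Hypothetical, cut-down stand-in for an rdata entry. */
typedef struct ModelRData
{
    bool buffer_std;    /* the field _bt_newroot did not fill in */
    bool std_was_read;  /* instrumentation for this sketch only */
} ModelRData;

/*
 * Returns true when a full-page image would be needed, i.e. when the
 * page LSN is not newer than the redo pointer. buffer_std is consulted
 * only on that path, so an unset value is invisible (to valgrind too)
 * unless this branch is actually taken.
 */
static bool
model_check_buffer(ModelRData *rdata, ModelLsn page_lsn, ModelLsn redo_rec_ptr)
{
    rdata->std_was_read = false;

    if (page_lsn <= redo_rec_ptr)
    {
        /* Only here does the code look at buffer_std at all. */
        rdata->std_was_read = true;
        (void) rdata->buffer_std;
        return true;
    }

    return false;
}
```

With a page LSN above the redo pointer the function returns false without
touching buffer_std, which matches the observation that most test runs
never notice the missing initialization.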

Greetings,

Andres Freund
