Re: New WAL code dumps core trivially on replay of bad data

From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Amit kapila <amit(dot)kapila(at)huawei(dot)com>
Subject: Re: New WAL code dumps core trivially on replay of bad data
Date: 2012-08-20 14:07:48
Message-ID: 201208201607.48821.andres@2ndquadrant.com
Lists: pgsql-hackers

On Monday, August 20, 2012 04:04:52 PM Tom Lane wrote:
> Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com> writes:
> > On 18.08.2012 08:52, Amit kapila wrote:
> >> I think that missing check of total length has caused this problem.
> >> However now this check will be different.
> >
> > That check still exists, in ValidXLogRecordHeader(). However, we now
> > allocate the buffer for the whole record before that check, based on
> > xl_tot_len, if the record header is split across pages. The theory in
> > allocating the buffer is that a bogus xl_tot_len field will cause the
> > malloc() to fail, returning NULL, and we treat that the same as a broken
> > header.
>
> Uh, no, you misread it. xl_tot_len is *zero* in this example. The
> problem is that RecordIsValid believes xl_len (and backup block size)
> even when it exceeds xl_tot_len.
>
> > I think we need to delay the allocation of the record buffer. We need to
> > read and validate the whole record header first, like we did before,
> > before we trust xl_tot_len enough to call malloc() with it. I'll take a
> > shot at doing that.
>
> I don't believe this theory at all. Overcommit applies to writing on
> pages that were formerly shared with the parent process --- it should
> not have anything to do with malloc'ing new space. But anyway, this
> is not what happened in my example.

If the allocation is big enough (128 kB, the default mmap threshold in
glibc), malloc() will mmap it into place. In that case overcommitting
applies before the pages have been brought in.
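
To illustrate why a NULL return from malloc() cannot be relied on to catch a bogus length: if a large request can succeed under overcommit, the length fields have to be cross-checked before allocating. Below is a minimal sketch of that ordering, not the actual xlog.c code. The struct, the field names, the size limit, and both functions are hypothetical and only stand in for xl_tot_len and xl_len as discussed above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, simplified header; NOT the real XLogRecord layout. */
typedef struct DemoRecordHeader
{
    uint32_t tot_len;   /* claimed total record length, header included */
    uint32_t data_len;  /* claimed length of the main data payload */
} DemoRecordHeader;

/* Assumed upper bound, chosen only for this illustration. */
#define DEMO_MAX_RECORD_LEN (16u * 1024u * 1024u)

/* Cross-check the length fields against each other before trusting them. */
static bool
demo_header_is_sane(const DemoRecordHeader *hdr)
{
    /* A total length of zero (or anything below the header size) is bogus. */
    if (hdr->tot_len < sizeof(DemoRecordHeader))
        return false;

    /* Reject absurd sizes explicitly instead of hoping malloc() fails. */
    if (hdr->tot_len > DEMO_MAX_RECORD_LEN)
        return false;

    /* The payload must fit inside the total length. */
    if (hdr->data_len > hdr->tot_len - sizeof(DemoRecordHeader))
        return false;

    return true;
}

/* Allocate the record buffer only after the header has been validated. */
static void *
demo_alloc_record_buffer(const DemoRecordHeader *hdr)
{
    if (!demo_header_is_sane(hdr))
        return NULL;            /* treated the same as a broken header */

    return malloc(hdr->tot_len);
}
```

With this ordering, a header claiming a total length of zero and a non-zero payload length is rejected before any buffer is allocated or read.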

Greetings,

Andres
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
