Re: BUG #6200: standby bad memory allocations on SELECT

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Bridget Frey <bridget(dot)frey(at)redfin(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Michael Brauwerman <michael(dot)brauwerman(at)redfin(dot)com>, Peter Geoghegan <peter(at)2ndquadrant(dot)com>, "pgsql-bugs(at)postgresql(dot)org" <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: BUG #6200: standby bad memory allocations on SELECT
Date: 2012-01-31 04:59:16
Message-ID: 25365.1327985956@sss.pgh.pa.us
Lists: pgsql-bugs

Bridget Frey <bridget(dot)frey(at)redfin(dot)com> writes:
> Thanks for the reply, we appreciate your time on this. The alloc error
> queries all seem to be selects from a btree primary index. I gave an
> example in my initial post from the logins table. Usually for us it
> is logins but sometimes we have seen it on a few other tables, and
> it's always a btree primary key index, very simple type of select.

Hm. The stack trace is definitive that it's finding the bad data in a
tuple that it's trying to print to the client, not in an index.
That tuple might've been straight from disk, or it could have been
constructed inside the backend ... but if it's a simple SELECT FROM
single-table WHERE index-condition then the tuple should be raw data
found in a shared buffer.

> The queries have been showing up in the logs which is how we know, but
> we could also confirm in the core dump. If the problem is data
> corruption, it is transient. We replay the same queries and get no
> errors.

The idea that comes to mind is that somehow btree index updates are
reaching the standby in advance of the heap updates they refer to.
But how could that be? And even more to the point, if we did follow
a bogus TID pointer from an index, how come it's failing there? You'd
expect it to usually notice such a problem much earlier, while examining
the heap tuple header. (Invalid xmin complaints are the typical symptom
from that, since the xmin is one of the first fields we look at that
can be sanity-checked to any extent.)

Still baffled here.

regards, tom lane
