Re: Fix Error Message for allocate_recordbuf() Failure

From: Shoaib Lari <slari(at)pivotal(dot)io>
To: Craig Ringer <craig(at)2ndquadrant(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
Cc: PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Fix Error Message for allocate_recordbuf() Failure
Date: 2016-07-12 17:23:42
Message-ID: CAMGd8b9uMhv_20y5+jysCqvCbdg0h3ceUcsdMET4fvvfc3-X5Q@mail.gmail.com
Lists: pgsql-hackers

Thank you for the feedback.

We encountered this error message while reading a 344-byte record. The
message "record length 344 at %X/%X too long", together with the comment
/* We treat this as a "bogus data" condition */, masked the OOM condition
and implied an error in the log record size calculation logic instead.

The allocation that actually failed was for 5 * Max(BLCKSZ, XLOG_BLCKSZ)
bytes, a much larger request than the record length suggests.

Shoaib

On Sun, Jul 10, 2016 at 10:58 PM, Craig Ringer <craig(at)2ndquadrant(dot)com>
wrote:

>
>
> On 11 July 2016 at 12:04, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
> wrote:
>
>> On Sat, Jul 9, 2016 at 2:58 AM, Shoaib Lari <slari(at)pivotal(dot)io> wrote:
>> > Besides making the error message more informative, we had to modify
>> > allocate_recordbuf() to return the actual number of bytes that were
>> > being allocated.
>>
>> - report_invalid_record(state, "record length %u at %X/%X too long",
>> -                        total_len,
>> -                        (uint32) (RecPtr >> 32), (uint32) RecPtr);
>> + report_invalid_record(state,
>> +                        "cannot allocate %u bytes for record length %u at %X/%X",
>> +                        newSizeOut, total_len, (uint32) (RecPtr >> 32),
>> +                        (uint32) RecPtr);
>>
>> It does not look like a good idea to me to complicate the interface of
>> allocate_recordbuf() just to make one error message more verbose. This
>> is basically a complaint that palloc_extended(MCXT_ALLOC_NO_OOM) does
>> not report to the user the size it tried to allocate before returning
>> NULL. Do you have a use case where this extra piece of information
>> would prove useful? It seems to me that we don't really care how much
>> memory it failed to allocate; we only want to know that it *has*
>> failed, and to take action based on that alone.
>>
>
> I actually find details of how much memory we tried to allocate really
> useful in other places that do emit them. Sometimes it's been an
> important clue when trying to figure out what's going on on a remote
> system with no or limited access. Even if it just tells me "we were
> probably incrementally allocating lots of small values, since this
> failure is small" vs. "this allocation is huge, look for something
> unusually large" or "this allocation is impossibly huge / a suspicious
> value, look for memory corruption".
>
> I tend to agree with your suggestion about a better approach but do think
> emitting details on allocation failures is useful. At least when people
> turn VM overcommit off ...
>
> --
> Craig Ringer http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Training & Services
>
