Re: DBT-3 with SF=20 got failed

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>, Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: DBT-3 with SF=20 got failed
Date: 2015-09-08 21:02:27
Message-ID: 55EF4CE3.80108@2ndquadrant.com
Lists: pgsql-hackers

On 09/08/2015 08:44 PM, Robert Haas wrote:
> On Tue, Sep 8, 2015 at 8:28 AM, Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com> wrote:
>>> Hello KaiGai-san,
>>>
>>> I've discovered a bug in the proposed patch - when resetting the hash
>>> table after the first batch, it does this:
>>>
>>> memset(hashtable->buckets, 0, sizeof(nbuckets * sizeof(HashJoinTuple)));
>>>
>>> The first 'sizeof' is bogus, so this only zeroes the first 8 bytes of
>>> the array (usually resulting in crashes due to invalid pointers).
>>>
>>> I fixed it to
>>>
>>> memset(hashtable->buckets, 0, nbuckets * sizeof(HashJoinTuple));
>>>
>>> Fixed patch attached (marked as v2).
>>>
>> Thanks, it was my bug - just an oversight.
>>
>> I want committer to push this fix.
>
> I'm not in agreement with this fix, and would prefer to instead
> proceed by limiting the initial allocation to 1GB. Otherwise, as
> KaiGai has mentioned, we might end up trying to allocate completely
> unreasonable amounts of memory if the planner gives a bad estimate.
> Of course, it's true (as Tomas points out) that this issue already
> exists today to some degree, and it's also true (as he also points
> out) that 1GB is an arbitrary limit. But it's also true that we use
> that same arbitrary 1GB limit in a lot of places, so it's hardly
> without precedent.

AFAIK the effective limit is not 1GB but 512MB (because we size the bucket
array to a power of two, and the actual allocation limit is 1GB - 1 byte).
But that's a minor detail.

Also, I'm not sure what other places you have in mind (could you list
some examples?), but I'd bet we limit those allocations to 1GB because of
the palloc() limit, not out of fear of over-estimates.
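To make that concrete, here is a small standalone sketch (not PostgreSQL
code; MAX_ALLOC_SIZE mirrors the 1GB-1 MaxAllocSize from memutils.h, and
the bucket size assumes a 64-bit build where HashJoinTuple is a pointer)
showing why the effective cap on the bucket array works out to 512MB:

    #include <stdio.h>

    #define MAX_ALLOC_SIZE ((size_t) 0x3fffffff)   /* 1GB - 1 */

    int
    main(void)
    {
        size_t  bucket_size = sizeof(void *);   /* stand-in for HashJoinTuple */
        size_t  nbuckets = 1;

        /* largest power-of-two bucket count whose array fits under the limit */
        while (nbuckets * 2 * bucket_size <= MAX_ALLOC_SIZE)
            nbuckets *= 2;

        printf("max nbuckets = %zu (%zu MB)\n",
               nbuckets, nbuckets * bucket_size / (1024 * 1024));
        /* on 64-bit: max nbuckets = 67108864 (512 MB) */
        return 0;
    }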

> More importantly, removing the cap on the allocation size makes the
> problem a lot worse. You might be sad if a bad planner estimate
> causes the planner to allocate 1GB when 64MB would have been enough,
> but on modern systems it is not likely to be an enormous problem. If
> a similar mis-estimation causes the planner to allocate 16GB rather
> than 1GB, the opportunity for you to be sad is magnified pretty
> considerably. Therefore, I don't really see the over-estimation bug
> fix as being separate from this one.

Perhaps. But if you want to absolutely prevent such sadness, then maybe
you should not set work_mem that high?

Anyway, I do see this as a rather orthogonal problem - an independent
improvement, mostly unrelated to the bugfix. Even if we decide to
redesign it like this (and I'm not particularly opposed to that,
assuming someone takes the time to measure how expensive the additional
resize actually is), we'll still have to fix the repalloc().
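For what it's worth, I imagine the redesigned version would look roughly
like this (a hypothetical sketch with made-up names, not the actual patch;
it uses plain realloc/memset in place of repalloc and glosses over error
handling and rehashing):

    #include <stdlib.h>
    #include <string.h>

    typedef void *HashBucket;               /* stand-in for HashJoinTuple */

    typedef struct
    {
        HashBucket *buckets;
        size_t      nbuckets;
    } HashStub;

    /* Grow the bucket array to at least 'needed' slots, doubling each time. */
    void
    hash_ensure_buckets(HashStub *ht, size_t needed)
    {
        size_t      newsize = (ht->nbuckets > 0) ? ht->nbuckets : 1024;

        while (newsize < needed)
            newsize *= 2;

        if (newsize != ht->nbuckets)
        {
            ht->buckets = realloc(ht->buckets, newsize * sizeof(HashBucket));
            memset(ht->buckets + ht->nbuckets, 0,
                   (newsize - ht->nbuckets) * sizeof(HashBucket));
            ht->nbuckets = newsize;
            /* a real implementation would also rehash the existing entries */
        }
    }

The cost of doubling while the table is already populated is exactly the
resize overhead that would need measuring.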

So I still fail to see why we shouldn't apply this fix.
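For reference, a standalone illustration of the sizeof mistake quoted above
(hypothetical local variables, not the actual nodeHash.c code):

    #include <stdio.h>
    #include <string.h>

    int
    main(void)
    {
        typedef void *HashJoinTuple;    /* stand-in: each bucket is a pointer */
        int         nbuckets = 1024;
        HashJoinTuple buckets[1024];

        /* buggy: the outer sizeof sees a size_t value, so this prints
         * sizeof(size_t), typically 8 on 64-bit */
        printf("%zu\n", sizeof(nbuckets * sizeof(HashJoinTuple)));

        /* intended: the full size of the bucket array, 8192 on 64-bit */
        printf("%zu\n", nbuckets * sizeof(HashJoinTuple));

        memset(buckets, 0, nbuckets * sizeof(HashJoinTuple));   /* the v2 fix */
        return 0;
    }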

regards

--
Tomas Vondra http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
