Re: DBT-3 with SF=20 got failed

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc: Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: DBT-3 with SF=20 got failed
Date: 2015-08-21 19:54:47
Message-ID: 55D78207.1080104@2ndquadrant.com
Lists: pgsql-hackers

Hello KaiGai-san,

On 08/21/2015 02:28 AM, Kouhei Kaigai wrote:
...
>>
>> But what is the impact on queries that actually need more than 1GB
>> of buckets? I assume we'd only limit the initial allocation and
>> still allow the resize based on the actual data (i.e. the 9.5
>> improvement), so the queries would start with 1GB and then resize
>> once finding out the optimal size (as done in 9.5). The resize is
>> not very expensive, but it's not free either, and with so many
>> tuples (requiring more than 1GB of buckets, i.e. ~130M tuples) it's
>> probably just noise in the total query runtime. But it'd be nice
>> to see some proof of that ...
>>
> The problem here is that we cannot know the exact size until the Hash
> node has read the entire inner relation. All we can do is rely on the
> planner's estimate; however, it often computes a crazy number of
> rows. I think resizing the hash buckets is a reasonable compromise.

I understand the estimation problem. The question I think we need to
answer is how to balance the behavior for well- and poorly-estimated
cases. It'd be unfortunate if we lower the memory consumption in the
over-estimated case while significantly slowing down the well-estimated
ones.
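
To make the trade-off concrete, here is a rough sketch of the arithmetic,
not the actual executor code. It assumes 8-byte bucket pointers, one tuple
per bucket, and power-of-two bucket counts; the function names and the
cap_bytes parameter are hypothetical and exist only for illustration.

```python
# Rough model only. The constants are assumptions made for this sketch,
# not values taken from the patch under discussion.
POINTER_SIZE = 8        # assumed bytes per bucket (64-bit pointer)
ONE_GB = 1024 ** 3


def buckets_in(nbytes, pointer_size=POINTER_SIZE):
    """Number of bucket pointers that fit into nbytes."""
    return nbytes // pointer_size


def next_power_of_two(n):
    """Smallest power of two that is >= n (for n >= 1)."""
    p = 1
    while p < n:
        p *= 2
    return p


def initial_and_final_buckets(estimated_rows, actual_rows, cap_bytes=ONE_GB):
    """Bucket counts before and after loading, with a capped initial size.

    The initial allocation follows the planner estimate but is limited to
    cap_bytes; the final size grows to fit the rows actually seen.
    """
    initial = min(next_power_of_two(estimated_rows), buckets_in(cap_bytes))
    final = max(initial, next_power_of_two(actual_rows))
    return initial, final


# 1GB of buckets covers 2^27 tuples at one tuple per bucket.
print(buckets_in(ONE_GB))  # 134217728

# Well-estimated large join: starts at the cap, then has to resize.
print(initial_and_final_buckets(500_000_000, 500_000_000))

# Over-estimated join: starts at the cap and never needs to grow.
print(initial_and_final_buckets(500_000_000, 1_000_000))
```

In this model the well-estimated case is the only one that pays for a
resize, which is the cost that would need to be measured.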

I don't think we have a clear answer at this point - maybe it's not a
problem at all and it'll be a win no matter what threshold we choose.
But it's a separate problem from the bugfix.

>> I believe the patch proposed by KaiGai-san is the right one to fix
>> the bug discussed in this thread. My understanding is KaiGai-san
>> withdrew the patch as he wants to extend it to address the
>> over-estimation issue.
>>
>> I don't think we should do that - IMHO that's an unrelated
>> improvement and should be addressed in a separate patch.
>>
> OK, it might not be a problem we should conclude within a few days,
> just before the beta release.

I don't quite see a reason to wait for the over-estimation patch. We
probably should backpatch the bugfix anyway (although it's much less
likely to run into that before 9.5), and we can't really backpatch the
behavior change to the older branches (as they have no hash resize).

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
