Re: DBT-3 with SF=20 got failed

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: DBT-3 with SF=20 got failed
Date: 2015-09-24 16:40:45
Message-ID: 5604278D.4030003@2ndquadrant.com
Lists: pgsql-hackers

On 09/24/2015 05:09 PM, Robert Haas wrote:
> On Thu, Sep 24, 2015 at 9:49 AM, Tomas Vondra
> <tomas(dot)vondra(at)2ndquadrant(dot)com> wrote:
>> So while it does not introduce behavior change in this particular
>> case (because it fails, as you point out), it introduces a behavior
>> change in general - it simply triggers behavior that does not
>> happen below the limit. Would we accept the change if the proposed
>> limit was 256MB, for example?
>
> So, I'm a huge fan of arbitrary limits.
>
> That's probably the single thing I'll say this year that sounds most
> like a troll, but it isn't. I really, honestly believe that.
> Doubling things is very sensible when they are small, but at some
> point it ceases to be sensible. The fact that we can't set a
> black-and-white threshold as to when we've crossed over that line
> doesn't mean that there is no line. We can't imagine that the
> occasional 32GB allocation when 4GB would have been optimal is no
> more problematic than the occasional 32MB allocation when 4MB would
> have been optimal. Where exactly to put the divider is subjective,
> but "what palloc will take" is not an obviously unreasonable
> barometer.

Consider two machines - one with 32GB of RAM and work_mem=2GB, the other
with 256GB of RAM and work_mem=16GB. Both host essentially the same data
set, just scaled accordingly (~8x more data on the large machine).

Let's assume there's a significant over-estimate - the planner expects
about 10x the actual number of tuples, and the hash table is expected to
almost exactly fill work_mem. With the 1:3 buckets-to-entries ratio (as in
the query at the beginning of this thread), the buckets get ~1/4 of
work_mem - that's ~512MB on the small machine and ~4GB on the large one -
and the rest goes to the entries.

Thanks to the 10x over-estimate, ~64MB and ~512MB would actually be enough
for the buckets, so we're wasting ~448MB on the small machine and ~3.5GB
on the large one - in both cases roughly 1.4% of RAM.

How does it make any sense to address the ~1.4% on the large machine
(where the bucket array happens to exceed the proposed limit) and not the
very same ~1.4% on the small one?
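
FWIW here is the back-of-envelope arithmetic as a small standalone
program. The 1:4 bucket share and the power-of-two rounding are
assumptions chosen to reproduce the numbers above, not the actual
planner/executor sizing code:

#include <stdio.h>
#include <stdint.h>

#define MB ((int64_t) 1024 * 1024)
#define GB (1024 * MB)

/* round up to the next power of two */
static int64_t
next_pow2(int64_t n)
{
    int64_t     p = 1;

    while (p < n)
        p <<= 1;
    return p;
}

static void
waste(const char *label, int64_t ram, int64_t work_mem)
{
    int64_t     planned = work_mem / 4;             /* 1:3 buckets:entries */
    int64_t     needed = next_pow2(planned / 10);   /* 10x over-estimate */
    int64_t     wasted = planned - needed;

    printf("%s: buckets %ld MB -> %ld MB, wasted %ld MB (%.1f%% of RAM)\n",
           label, (long) (planned / MB), (long) (needed / MB),
           (long) (wasted / MB), 100.0 * wasted / ram);
}

int
main(void)
{
    waste("small machine", 32 * GB, 2 * GB);
    waste("large machine", 256 * GB, 16 * GB);
    return 0;
}

It prints ~448MB wasted (~1.4% of RAM) for the small machine and ~3584MB
(~1.4% of RAM) for the large one.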

> Of course, if we can postpone sizing the hash table until after the
> input size is known, as you suggest, then that would be better still
> (but not back-patch material).

This dynamic resize is 9.5-only anyway.
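
Just to illustrate what "postpone sizing until the input size is known"
means, here's a rough sketch - the names and structures are made up for
illustration, and it ignores batching, memory contexts and the actual 9.5
resize code:

#include <stdlib.h>
#include <stdint.h>

typedef struct Tuple
{
    uint32_t    hashvalue;
    struct Tuple *next;         /* chain within a bucket */
} Tuple;

/* round up to the next power of two */
static uint64_t
next_pow2(uint64_t n)
{
    uint64_t    p = 1;

    while (p < n)
        p <<= 1;
    return p;
}

/*
 * Build the bucket array only after all ntuples tuples have been loaded,
 * so the allocation matches the actual input rather than the estimate.
 */
Tuple **
build_buckets(Tuple *tuples, uint64_t ntuples, uint64_t *nbuckets)
{
    uint64_t    n = next_pow2(ntuples);     /* aim at ~1 tuple per bucket */
    Tuple     **buckets = calloc(n, sizeof(Tuple *));
    uint64_t    i;

    if (buckets == NULL)
        return NULL;            /* out of memory */

    for (i = 0; i < ntuples; i++)
    {
        uint64_t    b = tuples[i].hashvalue & (n - 1);

        tuples[i].next = buckets[b];
        buckets[b] = &tuples[i];
    }

    *nbuckets = n;
    return buckets;
}

Whether we get there by loading the inner side first (as above) or by
starting small and growing the bucket array, the point is the same - the
final bucket count comes from the actual tuple count, not from the
estimate.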

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
