Re: A better way than tweaking NTUP_PER_BUCKET

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: A better way than tweaking NTUP_PER_BUCKET
Date: 2013-06-22 18:54:53
Message-ID: CA+Tgmoa__iCRN+AS_=6VxZx5AkW3nVc4aVxwyMr+DS01Xykw9A@mail.gmail.com
Lists: pgsql-hackers

On Sat, Jun 22, 2013 at 9:15 AM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> Previous discussions of Hash Joins have noted that the performance
> decreases when the average number of tuples per bucket increases.
> O(N^2) effects are seen.
>
> We've argued about this in many different ways, yet all of those
> discussions have centred around the constant NTUP_PER_BUCKET. I
> believe that was a subtle mistake, and I have a proposal.
>
> The code in ExecChooseHashTableSize() line 460 says
> /*
> * Set nbuckets to achieve an average bucket load of NTUP_PER_BUCKET when
> * memory is filled.
> ...
>
> but the calculation then sets the number of buckets like this
>
> dbuckets = ceil(ntuples / NTUP_PER_BUCKET);
>
> **This doesn't match the comment.** If we underestimate the number
> of tuples and go on to fill the available memory, we end up with an
> average number of tuples per bucket higher than NTUP_PER_BUCKET. That
> notational confusion has been skewing the discussion.
>
> The correct calculation that would match the objective set out in the
> comment would be
>
> dbuckets = (hash_table_bytes / tupsize) / NTUP_PER_BUCKET;
>
> This leads to a much more useful value of dbuckets in the case where
> the estimated ntuples occupy much less space than is available. The
> value is always the same as or higher than before because of the
> if-test that surrounds it.

+1. I think this is very clever.

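For illustration, here is a minimal standalone sketch (not the actual
ExecChooseHashTableSize() code) contrasting the two calculations quoted
above; ntuples, tupsize, and hash_table_bytes are stand-ins for the
corresponding local variables in nodeHash.c, with made-up values:

    #include <math.h>
    #include <stdio.h>

    #define NTUP_PER_BUCKET 10

    int
    main(void)
    {
        double  ntuples = 50000.0;          /* planner's (under)estimate */
        int     tupsize = 64;               /* assumed bytes per stored tuple */
        long    hash_table_bytes = 16L * 1024 * 1024;   /* e.g. work_mem = 16MB */

        /* Existing: buckets sized to the tuple-count estimate */
        double  dbuckets_old = ceil(ntuples / NTUP_PER_BUCKET);

        /* Proposed: buckets sized so the load is NTUP_PER_BUCKET
         * when the available memory is actually filled */
        double  dbuckets_new = (double) (hash_table_bytes / tupsize) / NTUP_PER_BUCKET;

        printf("old: %.0f buckets, new: %.0f buckets\n",
               dbuckets_old, dbuckets_new);
        return 0;
    }

With these numbers the old formula gives 5000 buckets while the new one
gives about 26000, so an underestimate no longer starves the table of
buckets.
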
The other thing that seems to be distorting things here is that this
function doesn't account for the memory consumed by the bucket array.
So as things stand today, changing NTUP_PER_BUCKET can't ever increase
the required number of batches. But that's really nonsensical.
Setting NTUP_PER_BUCKET to a smaller value like 1 or even, in effect,
a value less than 1 would probably improve performance for large hash
tables, but there's no way to decide what value is too expensive
because the memory cost of changing it isn't accounted for in the
first place.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
