Re: bad estimation together with large work_mem generates terrible slow hash joins

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tomas Vondra <tv(at)fuzzy(dot)cz>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: bad estimation together with large work_mem generates terrible slow hash joins
Date: 2014-09-11 13:31:41
Message-ID: CA+TgmoaPmSCBC0FOee37LumyX=aY+8ZOExUziBaxJu_LjuBWgQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Sep 10, 2014 at 5:09 PM, Tomas Vondra <tv(at)fuzzy(dot)cz> wrote:
> OK. So here's v13 of the patch, reflecting this change.

With the exception of ExecChooseHashTableSize() and a lot of stylistic
issues along the lines of what I've already complained about, this
patch seems pretty good to me. It does three things:

(1) It changes NTUP_PER_BUCKET to 1. Although this increases memory
utilization, it also improves hash table performance, and the
additional memory we use is less than what we saved as a result of
commit 45f6240a8fa9d35548eb2ef23dba2c11540aa02a.

(2) It changes ExecChooseHashTableSize() to include the effect of the
memory consumed by the bucket array. If we care about properly
respecting work_mem, this is clearly right for any NTUP_PER_BUCKET
setting, but more important for a small setting like 1 than for the
current value of 10. I'm not sure the logic in this function is as
clean and simple as it can be just yet.

(3) It allows the number of batches to increase on the fly while the
hash join is in process. This case arises when we initially estimate
that we only need a small hash table, and then it turns out that there
are more tuples than we expect. Without this code, the hash table's
load factor gets too high and things start to suck.

I recommend separating this patch in two patches, the first covering
items (1) and (2) and the second covering item (3), which really seems
like a separate improvement.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Arthur Silva 2014-09-11 13:32:24 Re: Memory Alignment in Postgres
Previous Message Andres Freund 2014-09-11 13:29:53 Re: Scaling shared buffer eviction