From: Kevin Grittner <kgrittn(at)ymail(dot)com>
To: Tomas Vondra <tv(at)fuzzy(dot)cz>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: PATCH: hashjoin - gracefully increasing NTUP_PER_BUCKET instead of batching
Date: 2014-12-10 14:02:21
Message-ID: 646916973.4801168.1418220141970.JavaMail.yahoo@jws10090.mail.ne1.yahoo.com
Lists: pgsql-hackers

Tomas Vondra <tv(at)fuzzy(dot)cz> wrote:

> Back when we were discussing the hashjoin patches (now committed),
> Robert proposed that maybe it'd be a good idea to sometimes increase the
> number of tuples per bucket instead of batching.
>
> That is, while initially sizing the hash table - if a hash table with
> enough buckets to satisfy the NTUP_PER_BUCKET load factor does not fit
> into work_mem, try a slightly higher load factor before starting to batch.
>
> The attached patch is an initial attempt to implement this - it's a bit
> rough around the edges, but hopefully enough to judge the benefit of this.
>
> The default load factor is 1. The patch tries to degrade this to 2, 4 or
> 8 in an attempt to fit the hash table into work_mem. If it doesn't work, it
> starts batching with the default load factor. If the batching is
> required while the hashjoin is running, the load factor is switched back
> to the default one (if we're batching, there's no point in keeping the
> slower hash table).
>
> The patch also modifies explain output, to show the load factor.
>
> The testing I've done so far is rather disappointing, though.
>
> create table a as select i from generate_series(1,1000000) s(i);
> create table b as select i from generate_series(1,1000000) s(i);
>
> analyze a;
> analyze b;
>
> select a.i, b.i from a join b on (a.i = b.i);
>
> work_mem   batches   tuples per bucket   duration
> -------------------------------------------------
> 64 MB      1         1                    585 ms
> 46 MB      1         2                    639 ms
> 43 MB      1         4                    794 ms
> 40 MB      1         8                   1075 ms
> 39 MB      2         1                    623 ms
>
> So, even increasing the load factor to 2 is slower than batching.

Right, I saw the same thing.

I tried pretty hard to create a case where increasing the load
factor from 1 to 2 was faster than going to a second batch, and was
unable to do so. I hypothesized that the second batch was cached
by the OS and flipping the data in and out of the OS cache was
faster than chasing through the links. I expect that if you have a
hash table large enough to barely exceed your machine's ability to
cache it, you *might* see a benefit in the hash table access by
increasing the load factor. I think it would be incredibly hard to
accurately identify those cases, and every time a hash table was
used there would be a cost to trying to figure it out. I just
can't see this being a win.
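
To put the "chasing through the links" part in concrete terms: doubling
NTUP_PER_BUCKET roughly doubles the length of the bucket chain that
every probe has to walk. In spirit (a simplified, stand-alone sketch
with made-up names, not the actual ExecScanHashBucket code), the probe
loop is just a linked-list scan:

    /* Simplified illustration only; names are invented, not PostgreSQL's. */
    #include <stddef.h>
    #include <stdint.h>

    typedef struct HashTuple
    {
        struct HashTuple *next;       /* next tuple in the same bucket */
        uint32_t          hashvalue;  /* full hash of the join key */
        /* ... tuple data follows ... */
    } HashTuple;

    /* Return the first tuple in the bucket with a matching hash value;
     * the caller still re-checks the join qual.  With load factor 2 the
     * chain is ~2x longer, so ~2x the pointer chasing per probe. */
    static HashTuple *
    scan_bucket(HashTuple **buckets, uint32_t nbuckets, uint32_t hashvalue)
    {
        for (HashTuple *tup = buckets[hashvalue % nbuckets];
             tup != NULL;
             tup = tup->next)
        {
            if (tup->hashvalue == hashvalue)
                return tup;
        }
        return NULL;
    }

Every extra tuple per bucket is another pointer to follow, and usually
another cache line to touch, on every probe of the inner side.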

> I'm not sure what's the best way forward - clearly, for cases that fit
> into RAM (i.e. the temp files stay in the page cache), batching is faster. For tables
> large enough to cause a lot of I/O, it may make a difference - but I'm
> not sure how to identify these cases.
>
> So ISTM implementing this requires a reliable way to identify which case
> we're dealing with - if the outer table is large enough (prefer
> increasing load factor) or not (prefer batching). Until then keeping the
> current simple/predictable approach is probably better.
>
> Of course, this also depends on additional variables (e.g. is the system
> memory-stressed?).

All that is on-target, but my take-away is that increasing the
load factor to avoid a second batch was an interesting idea that
turns out not to be a good one. If someone can craft a
reproducible test case that demonstrates a win for increasing the
load factor that doesn't have significant cost for cases where it
isn't a win, I might change my opinion; but count me as a skeptic.
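
For anyone who does want to experiment, the fallback described upthread
boils down to something like the sketch below (invented names, not the
code from the patch): allow more tuples per bucket so the bucket array
shrinks, and give up and batch if even a load factor of 8 doesn't fit.

    /* Rough sketch only -- not the actual patch code. */
    #include <stdbool.h>
    #include <stddef.h>

    #define NTUP_PER_BUCKET 1    /* default load factor */

    /* Pick the smallest load factor (1, 2, 4, 8) whose hash table fits
     * in work_mem; if none fits, fall back to batching and keep the
     * default (fast) load factor. */
    static int
    choose_load_factor(size_t ntuples, size_t tuple_width,
                       size_t work_mem_bytes, bool *must_batch)
    {
        static const int factors[] = {NTUP_PER_BUCKET, 2, 4, 8};

        for (int i = 0; i < (int) (sizeof(factors) / sizeof(factors[0])); i++)
        {
            size_t nbuckets = (ntuples + factors[i] - 1) / factors[i];
            size_t space = ntuples * tuple_width         /* the tuples */
                         + nbuckets * sizeof(void *);    /* bucket array */

            if (space <= work_mem_bytes)
            {
                *must_batch = false;
                return factors[i];
            }
        }

        *must_batch = true;
        return NTUP_PER_BUCKET;
    }

Note that raising the load factor only shrinks the bucket array; the
tuples themselves take the same space either way, which limits how much
memory this can save in the first place.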

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
