Re: Proposed Patch to Improve Performance of Multi-Batch Hash Join for Skewed Data Sets

From: "Bryce Cutt" <pandasuit(at)gmail(dot)com>
To: "Robert Haas" <robertmhaas(at)gmail(dot)com>
Cc: "Lawrence, Ramon" <ramon(dot)lawrence(at)ubc(dot)ca>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Proposed Patch to Improve Performance of Multi-Batch Hash Join for Skewed Data Sets
Date: 2008-12-20 10:58:37
Message-ID: 1924d1180812200258v2cf4dcdfn5aeee21bb4e3ce0d@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Robert,

I thoroughly appreciate the constructive criticism.

The compile errors are due to my development process being convoluted.
I will endeavor to not waste your time in the future with errors
caused by my development process.

I have updated the code to follow the conventions and suggestions
given. I am now working on adding the requested documentation. I
will not submit the next patch until that is done. The functionality
has not changed so you can performance test with the patch you have.

As for that particularly ugly piece of code. I figured that out while
digging through the selfuncs code. Basically I needed a way to get
the stats tuple for the outer relation join column of the join but to
do that I needed to figure out how to get the actual relation id and
attribute number that was being joined.

I have not yet figured out a better way to do this but I am sure there
is someone on the mailing list with far more knowledge of this than I
have.

I could possibly be more vigorous in testing to make sure the things I
am casting are exactly what I expect. My tests have always been
consistent so far.

I am essentially doing what is done in selfuncs. I believe I could
use the examine_variable() function in selfuncs.c except I would first
need a PlannerInfo and I don't think I can get that from inside the
join initialization code.

- Bryce Cutt

On Mon, Dec 15, 2008 at 8:51 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> I have to admit that I haven't fully grokked what this patch is about
> just yet, so what follows is mostly a coding style review at this
> point. It would help a lot if you could add some comments to the new
> functions that are being added to explain the purpose of each at a
> very high level. There's clearly been a lot of thought put into some
> parts of this logic, so it would be worth explaining the reasoning
> behind that logic.
>
> This patch applies clearly against CVS HEAD, but does not compile
> (please fix the warning, too).
>
> nodeHash.c:88: warning: no previous prototype for 'freezeNextMCVPartiton'
> nodeHash.c: In function 'freezeNextMCVPartiton':
> nodeHash.c:148: error: 'struct HashJoinTableData' has no member named 'inTupIOs'
>
> I commented out the offending line. It errored out again here:
>
> nodeHashjoin.c: In function 'getMostCommonValues':
> nodeHashjoin.c:136: error: wrong type argument to unary plus
>
> After removing the stray + sign, it compiled, but failed the
> "rangefuncs" regression test. If you're going to keep the
> enable_hashjoin_usestatmvcs() GUC around, you need to patch
> rangefuncs.out so that the regression tests pass. I think, however,
> that there was some discussion of removing that before the patch is
> committed; if so, please do that instead. Keeping the GUC would also
> require patching the documentation, which the current patch does not
> do.
>
> getMostCommonValues() isn't a good name for a non-static function
> because there's nothing to tip the reader off to the fact that it has
> something to do with hash joins; compare with the other function names
> defined in the same header file. On the flip side, that function has
> only one call site, so it should probably be made static and not
> declared in the header file at all. Some of the other new functions
> need similar treatment. I am also a little suspicious of this bit of
> code:
>
> relid = getrelid(((SeqScan *) ((SeqScanState *)
> outerPlanState(hjstate))->ps.plan)->scanrelid,
> estate->es_range_table);
> clause = (FuncExprState *) lfirst(list_head(hjstate->hashclauses));
> argstate = (ExprState *) lfirst(list_head(clause->args));
> variable = (Var *) argstate->expr;
>
> I'm not very familiar with the hash join code, but it seems like there
> are a lot of assumptions being made there about what things are
> pointing to what other things. Is this this actually safe? And if it
> is, perhaps a comment explaining why?
>
> getMostCommonValues() also appears to be creating and maintaining a
> counter called collisionsWhileHashing, but nothing is ever done with
> the counter. On a similar note, the variables relattnum, atttype, and
> atttypmod don't appear to be necessary; 2 out of 3 of them are only
> used once, so maybe inlining the reference and dropping the variable
> would make more sense. Also, the if (HeapTupleIsValid(statsTuple))
> block encompasses the whole rest of the function, maybe if
> (!HeapTupleIsValid(statsTuple)) return?
>
> I don't understand why
> hashtable->mostCommonTuplePartition[bucket].tuples and .frozen need to
> be initialized to 0. It looks to me like those are in a zero-filled
> array that was just allocated, so it shouldn't be necessary to re-zero
> them, unless I'm missing something.
>
> freezeNextMCVPartiton is mis-spelled consistently throughout (the last
> three letters should be "ion"). I also don't think it makes sense to
> enclose everything but the first two lines of that function in an
> else-block.
>
> There is some initialization code in ExecHashJoin() that looks like it
> belongs in ExecHashTableCreate.
>
> It appears to me that the interface to isAMostCommonValue() could be
> simplified by just making it return the partition number. It could
> perhaps be renamed something like ExecHashGetMCVPartition().
>
> Does ExecHashTableDestroy() need to explicitly pfree
> hashtable->mostCommonTuplePartition and
> hashtable->flushOrderedMostCommonTuplePartition? I would think those
> would be allocated out of hashCxt - if they aren't, they probably
> should be.
>
> Department of minor nitpicks: (1) C++-style comments are not
> permitted, (2) function names need to be capitalized like_this() or
> LikeThis() but not likeThis(), (3) when defining a function, the
> return type should be placed on the line preceding the actual function
> name, so that the function name is at the beginning of the line, (4)
> curly braces should be avoided around a block containing only one
> statement, (5) excessive blank lines should be avoided (for example,
> the one in costsize.c is clearly unnecessary, and there's at least one
> place where you add two consecutive blank lines), and (6) I believe
> the accepted way to write an empty loop is an indented semi-colon on
> the next line, rather than {} on the same line as the while.
>
> I will try to do some more substantive testing of this as well.
>
> ...Robert
>

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alvaro Herrera 2008-12-20 13:41:02 Re: [COMMITTERS] pgsql: SQL/MED catalog manipulation facilities This doesn't do any
Previous Message Heikki Linnakangas 2008-12-20 09:44:25 Re: [COMMITTERS] pgsql: SQL/MED catalog manipulation facilities This doesn't do any