From: | "Dann Corbit" <DCorbit(at)connx(dot)com> |
---|---|
To: | "Manoel Henrique" <mhenriquesgbd(at)gmail(dot)com>, <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Research/Implementation of Nested Loop Join optimization |
Date: | 2008-07-23 20:23:25 |
Message-ID: | D425483C2C5C9F49B5B7A41F8944154701000FA1@postal.corporate.connx.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
When you install the source tree (e.g. in folder \postgresql-8.3.x) you
will want to examine nodeMergejoin.c typically found in a path similar
to this:
\postgresql-8.3.x\src\backend\executor\nodeMergejoin.c
Here are the comments from the version on my machine:
/*
* INTERFACE ROUTINES
* ExecMergeJoin
mergejoin outer and inner relations.
* ExecInitMergeJoin
creates and initializes run time states
* ExecEndMergeJoin
cleans up the node.
*
* NOTES
*
* Merge-join is done by joining the inner
and outer tuples satisfying
* join clauses of the form ((= outerKey
innerKey) ...).
* The join clause list is provided by the
query planner and may contain
* more than one (= outerKey innerKey) clause
(for composite sort key).
*
* However, the query executor needs to know
whether an outer
* tuple is "greater/smaller" than an inner
tuple so that it can
* "synchronize" the two relations. For
example, consider the following
* relations:
*
* outer: (0
^1 1 2 5 5 5 6 6 7) current tuple: 1
* inner: (1
^3 5 5 5 5 6) current tuple: 3
*
* To continue the merge-join, the executor
needs to scan both inner
* and outer relations till the matching
tuples 5. It needs to know
* that currently inner tuple 3 is "greater"
than outer tuple 1 and
* therefore it should scan the outer
relation first to find a
* matching tuple and so on.
*
* Therefore, rather than directly executing
the merge join clauses,
* we evaluate the left and right key
expressions separately and then
* compare the columns one at a time (see
MJCompare). The planner
* passes us enough information about the
sort ordering of the inputs
* to allow us to determine how to make the
comparison. We may use the
* appropriate btree comparison function,
since Postgres' only notion
* of ordering is specified by btree
opfamilies.
*
*
* Consider the above relations and suppose
that the executor has
* just joined the first outer "5" with the
last inner "5". The
* next step is of course to join the second
outer "5" with all
* the inner "5's". This requires
repositioning the inner "cursor"
* to point at the first inner "5". This is
done by "marking" the
* first inner 5 so we can restore the
"cursor" to it before joining
* with the second outer 5. The access method
interface provides
* routines to mark and restore to a tuple.
*
*
* Essential operation of the merge join
algorithm is as follows:
*
* Join {
* get initial outer and
inner tuples
INITIALIZE
* do forever {
* while
(outer != inner) {
SKIP_TEST
*
if (outer < inner)
*
advance outer
SKIPOUTER_ADVANCE
*
else
*
advance inner
SKIPINNER_ADVANCE
* }
* mark inner
position
SKIP_TEST
* do forever
{
*
while (outer == inner) {
*
join tuples
JOINTUPLES
*
advance inner position
NEXTINNER
*
}
*
advance outer position
NEXTOUTER
*
if (outer == mark)
TESTOUTER
*
restore inner position to mark TESTOUTER
*
else
*
break // return to top of outer loop
* }
* }
* }
*
* The merge join operation is coded in the
fashion
* of a state machine. At each state, we do
something and then
* proceed to another state. This state is
stored in the node's
* execution state information and is
preserved across calls to
* ExecMergeJoin. -cim 10/31/89
*/
From: pgsql-hackers-owner(at)postgresql(dot)org
[mailto:pgsql-hackers-owner(at)postgresql(dot)org] On Behalf Of Manoel Henrique
Sent: Wednesday, July 23, 2008 1:17 PM
To: pgsql-hackers(at)postgresql(dot)org
Subject: [HACKERS] Research/Implementation of Nested Loop Join
optimization
Hi! I`m a researcher from PUC-Rio (Brazil) and we`re studying about an
Joins, and we`d like to implement an optimization on the Nested Loop
Join, this optimization consists on while scanning the inner table, the
iteration would go from up-down then backwards(down-up) to take
advantage of the buffer pages in memory.
We`d work with MaterialScan and only NestedLoop (we`re dropping all
indexes and keys to make it this way). The research objective is to show
some students how a DBMS works.
Does PostgreSQL already works this way?
Is it possible to implement such thing? Is it easy? how hard?
Thank you in advance,
Manoel Henrique Souza.
From | Date | Subject | |
---|---|---|---|
Next Message | Markus Wanner | 2008-07-23 20:26:52 | Re: Postgres-R: internal messaging |
Previous Message | Tom Lane | 2008-07-23 20:20:08 | Re: pltcl_*mod commands are broken on Solaris 10 |