Re: [HACKERS] RE: What database i can use? (fwd)

From: Don Baccus <dhogaza(at)pacifier(dot)com>
To: "Marc G(dot) Fournier" <scrappy(at)hub(dot)org>, pgsql-hackers(at)postgreSQL(dot)org
Cc: berend(at)pobox(dot)com
Subject: Re: [HACKERS] RE: What database i can use? (fwd)
Date: 1999-12-10 11:08:52
Message-ID: 3.0.1.32.19991210030852.00ed1ed4@mail.pacifier.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

At 09:14 PM 12/27/99 -0500, Marc G. Fournier wrote:
>
>For those working on INNER/OUTER Joins...any comments? :)

I'm not working on them (or on Postgres at all, other than steadily
plowing through the code to familiarize myself with it) but I'm
always willing to comment...

>
>> JOIN statement? I take it that this is different then:
>>
>> SELECT a.field1, b.field2 from table1 a, table2 b where a.key = b.key
>
>ANSI92 supports the far better readable JOIN statement:
>
>
>select a.field1, b.field2
> from table1 a
> join table2 b on
> a.key = b.key

He's right that they are different, but they give the same result.

Wearing my compiler-writer's hat, something like:

select a.field1, b.field2 from table1 a, table2 b where a.key=b.key

says "cross join table1 and table2, then return only those rows
where a.key=b.key"

in other words, it's not (strictly speaking) an inner join.

However...the rows returned by this are the same as the rows
returned by an inner join. One could look at the traditional
implementation as an inner join as being an OPTIMIZATION of
this query. It qualifies as an optimization in the sense that
it's certainly far faster for the vast majority of such queries!

>From my reading of the standard (or Date's review of it), this
is really how the standard defines things, i.e. an inner join
are explicitly given in the "from" clause.

>
>
>Left outer joins are now easy to:
>
>select a.field1, b.field2
> from table1 a
> left outer join table2 b on
> a.key = b.key
>
>
>It generally parses and optimizes faster too. For MS SQL Server I've seen
>improvements of up to 75% percent: execution time was the same, but the plan
>was calculated much faster.

This is a bit surprising to me. One source might be the fact that outer
joins aren't associative (SQL for smarties gives examples), so outer joins
appearing in the "from" clause may simply force left-to-right execution
which reduces the number of cases a plan optimizer (whatever Sybase/SQL server
uses) must consider.

Or it may be that SQL server just executes ALL joins, inner or outer,
explicitly listed in the "from" clause in left-to-right order under
the assumption that the programmer knows best. I kinda doubt that,
though. If true, it would certainly simplify plan optimization, there
wouldn't be any other than deciding what kind of join and which indices
to use for each one (as opposed to figuring out that plus which order
of execution).

>From my reading of the work done on joins thus far for Postgres, the
plan optimizer will be fed essentially the same information whether
an inner join is listed in the "from" clause or derived from the
"where" clause, so I wouldn't expect to see such speed ups. The
non-associativity of outer joins might impose an ordering on
inner joins mixed in, though (I haven't thought through the cases,
again I'm just reading Postgres code and Date's book on the standard,
I wrote my first SQL query less than a year ago and am still very
much a novice at all this).

- Don Baccus, Portland OR <dhogaza(at)pacifier(dot)com>
Nature photos, on-line guides, Pacific Northwest
Rare Bird Alert Service and other goodies at
http://donb.photo.net.

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Don Baccus 1999-12-10 11:17:05 Re: [HACKERS] memory dilemma
Previous Message Theo Kramer 1999-12-10 10:57:17 insert using select with limit