Re: query optimization scenarios 17,701 times faster!!!

From: "scott(dot)marlowe" <scott(dot)marlowe(at)ihs(dot)com>
To: Robert Dyas <rdyas(at)adelphia(dot)net>
Cc: <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: query optimization scenarios 17,701 times faster!!!
Date: 2003-04-24 21:18:42
Message-ID: Pine.LNX.4.33.0304241509500.14627-100000@css120.ihs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, 24 Apr 2003, Robert Dyas wrote:

> clause restriction on a column). But the concept is still the same - don't
> spend time joining rows when you are going to later throw them away becuase
> of some where clause restriction. Stated another way, "restrict then join"
> is far more efficient than "join then restrict".

True, but "join then restrict" is guaranteed to get the right answer
through sheer force of will, while restrict then join requires that you
not only try to optimize the query, but you have to make sure you are not
throwing away the wrong ones. I.e. accidentally leaving in extra rows to
throw away costs you CPU time, accidentally tossing the wrong rows gives
bogus results.

The real answer here is that SQL is the answer. It allows you to restrict
the data sets you're playing with before the join, not after. Subselects,
unions, etc... allow you to build a more efficient query that is handling,
in theory, smaller data sets, and should therefore be faster.

I'd love for the planner to be able to optimize everything, but let's face
it, since all databases live in the real world where optimzation can never
be perfect, we should all strive to create SQL queries that hit the fewest
rows needed to do the job, and THEN let the planner take it from there.

We all benefit from faster sorting algorhythms, better indexing methods,
etc. Only people who write very inneficient SQL benefit from the types
of optimizations you're talking about. So, if someone has to put in time
programming, I'd rather it be on things we can all use and benefit from
the most.

When we have async/sync multi-master multi-slave replication, with bit
mapped indexes, and super fast hashing, along with maybe a ccNUMA friendly
caching method that can efficiently handle tens of gigabytes of free RAM,
sure, maybe someone should get around to optimizing the corner cases. But
unconstrained joins or even just poorly thought out ones, are NOT a corner
case, they're a SQL mistake.

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Stephan Szabo 2003-04-24 21:45:49 Re: query optimization scenarios 17,701 times faster!!!
Previous Message Robert Dyas 2003-04-24 21:08:58 Re: query optimization scenarios 17,701 times faster!!!