Re: SQL Query Optimization

From: Dav Coleman <dav(at)danger-island(dot)com>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: pgsql-sql(at)postgresql(dot)org
Subject: Re: SQL Query Optimization
Date: 2002-04-18 16:35:27
Message-ID: 20020418093527.I24932@danger-island.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

I should be more clear, the problem is that the application user can
basically construct the SQL query dynamically, so I have no control on
how the original SQL query will be formed or what it will consist of.
It can be any possible query in practice. Because of this, it is not just
a matter of analyzing any specific queries, and i don't want to start
creating every possible index (although i might, if i have to).

But I can see where I was heading in the wrong direction already. I was
thinking that what I needed was to find theories/algorithms on how to
rewrite the SQL before submitting it to postgresql, and I maybe still
need to do that, but I guess I also need to EXPLAIN and analyze the
bad vs good forms of the queries so I'll know what makes a 'good' vs
'bad' query (so I'll get a sense on how to rewrite queries). Perhaps
with that understanding, an algorithm for rewriting the queries will
be apparent.

I just figured I couldn't be the first person to run into this problem,
but I can't find it mentioned anywhere.

-Dav

btw, I'm running postgresql-7.1.2 (compilied from source) on rh7.0

Josh Berkus [josh(at)agliodbs(dot)com] wrote:
> Dav,
>
> > I am using postgresql to house chemical informatics data which
> > consists of
> > several interlinked tables with tens of thousands (maximum) of rows.
> > When
> > doing search queries against these tables (which always requires
> > multiple
> > joins) I have noticed that the semantically equivalent SQL queries
> > can differ
> > vastly in speed performance depending on the order of clauses ANDed
> > together ( "WHERE cond1 AND cond2" takes forever, but "WHERE cond2
> > AND cond1" comes right back).
>
> In most cases, the above kind of optimization difference is due to how
> you indexed the table. If, for example, you have an index on (field2,
> field1), and you do a "WHERE field1 = y and field2 = x" then the query
> parser probably won't use the index because the field order is
> different.
>
> Fortunately, in Postgres 7.2, you now get index usage statistics.
> Hopefully another user will follow-up this e-mail by explaining how to
> access them.
>
> The idea is that, if you find that certain views and queries are very
> slow, then check what tables they all have in common. Then check the
> indexes and statistics for each table. If you see a large table with
> only 3 indexes, none of which are getting much use, then they are
> pobpably the wrong indexes or you need to change the structure of your
> WHERE clause. Also, EXPLAIN can be a big help here.
>
> See http://techdocs.postgresql.org for more stuff about optimization.
>
> I can understand that you'd like a tool to make all this easier for
> you, but I haven't seen any such thing, unless it ships with the
> Enterprise version of Oracle.
>
> -Josh Berkus

--
Dav Coleman
http://www.danger-island.com/dav/

In response to

Responses

Browse pgsql-sql by date

  From Date Subject
Next Message Jan Wieck 2002-04-18 16:43:30 Re: call the same pl/pgsql procedure twice in the same connection
Previous Message Tom Lane 2002-04-18 16:15:43 Re: SQL Query Optimization