Re: PLEASE GOD HELP US!

From: "Uwe C(dot) Schroeder" <uwe(at)oss4u(dot)com>
To: "PgSQL ADMIN" <pgsql-admin(at)postgresql(dot)org>
Subject: Re: PLEASE GOD HELP US!
Date: 2004-10-01 16:48:52
Message-ID: 200410010948.52351.uwe@oss4u.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-admin

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Friday 01 October 2004 09:04 am, you wrote:

> SQL: SELECT * FROM thread_listing AS t ORDER BY t.status=5
> DESC,t.lastreply desc LIMIT 25 OFFSET 0
> Num Rows: 25
> Affected Rows: 0
> Exec Time: 9.1602659225464

EXPLAIN is your friend here.
Well, the above query is nice for MySQL - being a fast reader. I bet it
doesn't use an index at all. You're selecting ALL rows from the table, then
sort ALL rows and only retrieve 25. This means if you have 1 million rows in
the table, postgres has to sort 1 million rows just to return the latest 25.
Maybe you want to rethink the approach to this. Databases are designed to pull
out specific records from a large set of records and that's where most
databases are good at. Design the query in a way that allows to isolate the
records you want in a "where clause".
In the above query you could already put an index on thread_listing.status and
write:
select * from thread_listing where status=5 order by lastreply desc limit 25
offset 0
Although not yet good, it will already only take the rows with status=5 into
consideration. If you only have 20% of records with status=5 it will not sort
1 million records anymore, but only 200000.

Your problem is not the speed of the DB, it's bad query design (and possibly
bad database design not using keys and indexes properly). Using indexes makes
write operations slower, but in your case that doesn't matter because nobody
will post 1000 threads in a second, however many may want to read them, which
is where the index will increase speed exponentially
Example: I have a data-warehouse on postgres. One particular operation would
store 32 records to one table and afterwards read the records just stored.
Without indexes the read took about 20 seconds, because a lot of joins to
other tables were involved. After I properly indexed the keys used to join to
the other tables the execution time went down to about a second - and the
database isn't even optimized for my application, it's a vanilla standard out
of the box configuration in postgresql.conf (except for increased shared
memory to allow more simultaneous connections). I bet I could get the
execution-time to less than 1/2 second if I'd care about this.

UC

- --
Open Source Solutions 4U, LLC 2570 Fleetwood Drive
Phone: +1 650 872 2425 San Bruno, CA 94066
Cell: +1 650 302 2405 United States
Fax: +1 650 872 2417
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.3 (GNU/Linux)

iD8DBQFBXYp0jqGXBvRToM4RAgJkAJ4tu7tPyzieAjQ1+pwFr+6sa1CffgCgxCYb
K6szMn9iVelBmL5tqrE0YS4=
=tYYW
-----END PGP SIGNATURE-----

In response to

Browse pgsql-admin by date

  From Date Subject
Next Message Bruno Wolff III 2004-10-01 18:30:59 Re: query on multiple tables in schema
Previous Message Goulet, Dick 2004-10-01 16:46:06 Re: PLEASE GOD HELP US!