Re: Hardware upgrade for a high-traffic database

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Jason Coene" <jcoene(at)gotfrag(dot)com>
Cc: "'Merlin Moncure'" <merlin(dot)moncure(at)rcsonline(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: Hardware upgrade for a high-traffic database
Date: 2004-08-11 23:20:04
Message-ID: 21682.1092266404@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

"Jason Coene" <jcoene(at)gotfrag(dot)com> writes:
> A good example, a comments table where users submit TEXT data. A common
> query is to find the last 5 comments a user has submitted. The scan, while
> using an index, takes a considerable amount of time (> 0.5 sec is about as
> good as it gets). Again, it's using an index on the single WHERE clause
> (userid = int). The field that's used to ORDER BY (timestamp) is also
> indexed.

You mean you are doing
SELECT ... WHERE userid = 42 ORDER BY timestamp DESC LIMIT 5;
and hoping that separate indexes on userid and timestamp will get the
job done? They won't. There are only two possible plans for this,
neither very good: select all of user 42's posts and sort them, or
scan timewise backwards through *all* posts looking for the last 5 from
user 42.

If you do this enough to justify a specialized index, I would suggest a
two-column index on (userid, timestamp). You will also need to tweak
the query, because the planner is not quite smart enough to deduce that
such an index is applicable to the given sort order:
SELECT ... WHERE userid = 42 ORDER BY userid DESC, timestamp DESC LIMIT 5;
This should generate an index-scan-backwards plan that will execute nigh
instantaneously, because it will only fetch the rows you really want.

You might or might not be able to drop the separate indexes on userid
and timestamp, depending on what other queries you might have that need
them. But you should be paying attention to what plans you are really
getting (see EXPLAIN) rather than just assuming that some indexes chosen
at random will do what you need.

regards, tom lane

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Russell Smith 2004-08-11 23:44:56 Re: Storing binary data.
Previous Message Jason Coene 2004-08-11 22:53:52 Re: Hardware upgrade for a high-traffic database