Re: Index Selection: ORDER BY vs. PRIMARY KEY

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Thomas F(dot) O'Connell" <tfo(at)sitening(dot)com>
Cc: PgSQL Performance <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Index Selection: ORDER BY vs. PRIMARY KEY
Date: 2005-09-20 03:05:41
Message-ID: 1648.1127185541@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

"Thomas F. O'Connell" <tfo(at)sitening(dot)com> writes:
> Clearly, if the index on the timestamp field is there, postgres wants
> to use it for the ORDER BY, even though the performance is worse. How
> is this preference made internally? If both indexes exist, will
> postgres always prefer the index on an ordered column? If I need the
> index on the timestamp field for other queries, is my best bet just
> to increase sort_mem for this query?

If you suppose that Postgres has a "preference" for one index over
another, you're already fatally off track. It's all about estimated
costs. In this case, the plan with h_action_ts_idx is preferred because
it has a lower estimated cost (2196.30) than the other plan (17041.66).
The way to think about this is not that Postgres "prefers" one index
over another, but that the estimated costs aren't in line with reality.

It looks from the plans that there are a number of estimation errors
giving you trouble, but the one that seems most easily fixable is
here:

-> Index Scan using h_action_id_idx on history h (cost=0.00..13260.87 rows=3820 width=480) (actual time=0.184..0.195 rows=3 loops=1)
Index Cond: (action_id = $constant_data::bigint)

Estimating 3820 rows matching $constant_data when there are really only
3 is a pretty serious estimation error :-( ... certainly more than
enough to explain a factor-of-100 error in the total estimated costs.

How recently did you last ANALYZE the history file? If the ANALYZE
stats are up-to-date and it's still blowing the rowcount estimate by
a factor of 1000, maybe you need to increase the statistics target for
this column.

regards, tom lane

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Tom Lane 2005-09-20 03:21:23 Re: How many tables is too many tables?
Previous Message Stephan Szabo 2005-09-19 23:07:44 Re: How can this be?