Re: annoying query/planner choice

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andrew Rawnsley <ronz(at)ravensfield(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: annoying query/planner choice
Date: 2004-01-12 05:40:13
Message-ID: 21520.1073886013@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Andrew Rawnsley <ronz(at)ravensfield(dot)com> writes:
> I have a situation that is giving me small fits, and would like to see
> if anyone can shed any light on it.

In general, pulling 10% of a table *should* be faster as a seqscan than
an indexscan, except under the most extreme assumptions about clustering
(is the table clustered on site_id, by any chance?). What I suspect is
that the table is a bit larger than your available RAM, so that a
seqscan ends up flushing all of the kernel's cache and forcing a lot of
I/O, whereas an indexscan avoids the cache flush by not touching (quite)
all of the table. The trouble with this is that the index only looks
that good under test conditions, ie, when you repeat it just after an
identical query that pulled all of the needed pages into RAM. Under
realistic load conditions where different site_ids are being hit, the
indexscan is not going to be as good as you think, because it will incur
substantial I/O.

You should try setting up a realistic test load hitting different random
site_ids, and see whether it's really a win to force seqscan off for
this query or not.

regards, tom lane

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Richard Huxton 2004-01-12 10:16:37 Re: COUNT & Pagination
Previous Message Andrew Rawnsley 2004-01-12 04:05:10 Re: annoying query/planner choice