Re: Seqscan rather than Index

From: Greg Stark <gsstark(at)mit(dot)edu>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Greg Stark <gsstark(at)mit(dot)edu>, Richard Huxton <dev(at)archonet(dot)com>, David Brown <time(at)bigpond(dot)net(dot)au>, pgsql-performance(at)postgresql(dot)org
Subject: Re: Seqscan rather than Index
Date: 2004-12-17 18:24:33
Message-ID: 87zn0cixzy.fsf@stark.xeocode.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:

> Greg Stark <gsstark(at)mit(dot)edu> writes:
> > Postgres is also more pessimistic about the efficiency of index scans. It's
> > willing to use a sequential scan down to well below 5% selectivity when other
> > databases use the more traditional rule of thumb of 10%.
>
> However, other databases are probably basing their analysis on a
> different execution model. Since we have to visit both heap and index
> in all cases, we do indeed have a larger penalty for index use.

It's only in special cases that other databases do not have to look at the
heap. For simple queries like "select * from x where foo > ?" they still have
to look at the heap. I never looked into how much of a bonus Oracle gives for
the index-only case, I'm not sure it even takes it into account.

> I've looked pretty closely at the cost model for index access, believe me.
> It's not pessimistic; if anything it is undercharging for index access.

I think there's another effect here beyond the physical arithmetic. There's a
kind of teleological reasoning that goes something like "If the user created
the index chances are it's because he wanted it to be used".

I guess that argues more for more aggressive selectivity estimates than for
biased index costing though. If I'm doing "where foo > ?" then if there's an
index on foo I probably put it there for a reason and want it to be used even
if postgres doesn't really have a clue how selective the query will be.

> I think the one effect that's not being modeled is amortization of index
> fetches across successive queries.

And across multiple fetches in a single query, such as with a nested loop.

It seems like the effective_cache_size parameter should be having some
influence here.

--
greg

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Pailloncy Jean-Gerard 2004-12-17 18:25:29 Re: Error in VACUUM FULL VERBOSE ANALYZE (not enough memory)
Previous Message Josh Berkus 2004-12-17 18:15:54 Re: Error in VACUUM FULL VERBOSE ANALYZE (not enough memory)