Re: Seqscan rather than Index

From: Greg Stark <gsstark(at)mit(dot)edu>
To: Richard Huxton <dev(at)archonet(dot)com>
Cc: David Brown <time(at)bigpond(dot)net(dot)au>, pgsql-performance(at)postgresql(dot)org
Subject: Re: Seqscan rather than Index
Date: 2004-12-17 15:47:57
Message-ID: 87acsdkjte.fsf@stark.xeocode.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Richard Huxton <dev(at)archonet(dot)com> writes:

> Not going to do anything in this case. The planner is roughly right about how
> many rows will be returned, it's just not expecting everything to be in RAM.

That doesn't make sense or else it would switch to the index at
random_page_cost = 1.0. If it was still using a sequential scan at
random_page_cost < 1 then perhaps he had some problem with his query like
mismatched data types that forced it to use a full scan.

> > That's the standard advice around here and the only thing I've found
> > useful. Half the threads in this forum are about under-utilized
> > indexes. It would be great if someone could admit the planner is
> > broken and talk about actually fixing it!
>
> Not sure I agree here - when the stats are accurate, you can get the planner to
> make near-optimal choices most of the time. Is there any particular pattern
> you've seen?

The most common cause I've seen here is that Postgres makes very pessimistic
assumptions about selectivity when it doesn't know better. Every other
database I've tested assumes 'col > ?' is about 5% selectivity . Postgres
assumes 33%.

Postgres is also more pessimistic about the efficiency of index scans. It's
willing to use a sequential scan down to well below 5% selectivity when other
databases use the more traditional rule of thumb of 10%.

In combination these effects do seem to cause an _awful_ lot of complaints.

> > The issue hits PostgreSQL harder than others because of its awful
> > sequential scan speed, which is two to five times slower than other
> > DBMS. The archives show there has been talk for years about this, but
> > it seems, no solution. The obvious thing to consider is the block
> > size, but people have tried increasing this in the past with only
> > marginal success.
>
> Must admit this puzzles me. Are you saying you can't saturate your disk I/O? Or
> are you saying other DBMS store records in 0.5 to 0.2 times less space than PG?

I don't know what he's talking about either. Perhaps he's thinking of people
who haven't been running vacuum enough?

--
greg

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Tom Lane 2004-12-17 17:44:41 Re: Seqscan rather than Index
Previous Message Richard Huxton 2004-12-17 13:03:50 Re: Seqscan rather than Index