Re: Index Scans become Seq Scans after VACUUM ANALYSE

From: Lincoln Yeoh <lyeoh(at)pop(dot)jaring(dot)my>
To: mlw <markw(at)mohawksoft(dot)com>, Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andrew Sullivan <andrew(at)libertyrms(dot)info>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Index Scans become Seq Scans after VACUUM ANALYSE
Date: 2002-04-20 04:10:23
Message-ID: 5.1.0.14.1.20020420113553.02ff6740@192.228.128.13
Lists: pgsql-hackers

At 10:48 AM 4/18/02 -0400, mlw wrote:
>Bruce Momjian wrote:
> >
> > Have you tried reducing 'random_page_cost' in postgresql.conf? That
> > should solve most of your problems if you would like more index scans.
>
>My random page cost is 1 :-)

What happens when you set random_page_cost to 1? Between an index scan of
50% of a table and a full table scan, which would the optimizer pick? With
it at 1, at what selectivity percentage is the switchover point?
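
For what it's worth, here is how I'd probe for the flip point (a sketch
only; the table t, its indexed id column, and the constants are
illustrative names, not from anyone's actual schema):

    -- Assumed table "t" with an index on "id"; widen the range
    -- until the plan flips from Index Scan to Seq Scan.
    SET random_page_cost = 1;
    EXPLAIN SELECT * FROM t WHERE id < 1000;    -- small fraction: index scan?
    EXPLAIN SELECT * FROM t WHERE id < 500000;  -- roughly half: seq scan?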

I ask because I'm thinking that for _repeated_ queries, when there is
caching, the effective random page cost for "small" selections may be very
low after the first, very costly select (which may not even be that costly
on smart SCSI drives). So randomly selecting 10% of a table may be cheap
after the first select, whereas for a sequential scan 100% of the table
must fit in the cache. If the cache is big enough, whichever plan reads
less data should be faster (noting that sequential RAM reads are typically
faster than random RAM reads). If the cache is not big enough, reading
less may still win, up to the point where the total amount repeatedly
selected no longer fits in the cache, at which point sequential scans
should be better. This is of course for queries run serially, not in
parallel. How would one take these issues into account in an optimizer?
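
As a rough back-of-envelope (my own simplified numbers; this ignores CPU
costs and the planner's correlation adjustments, and assumes the usual
convention that one sequential page fetch costs 1.0):

    seq scan of 10,000 pages:           10,000 * 1.0 = 10,000
    index scan touching 2,500 pages:
      at random_page_cost = 4:           2,500 * 4.0 = 10,000  (break-even)
      at random_page_cost = 1:           2,500 * 1.0 =  2,500  (index wins)

So setting random_page_cost to 1 effectively tells the planner to assume
the fully cached case above, where random and sequential fetches cost the
same.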

Mark's problems with the optimizer seem to be something else, though: the
statistics are off.

>I had a database where I had to have "enable_seqscan=false" in the config
>file. The nature of the data always makes the statistics bogus, and it
>always refused to use the index.
>My one most important experience (I've had more than one) with this whole
>topic is DMN's music database: when PostgreSQL uses the index, the query
>executes in a fraction of a second. When "enable_seqscan=true" PostgreSQL
>refuses to use the index, and the query takes about a minute. No matter
>how much I analyze, I have to disable sequential scan for the system to
>work correctly.

I'm just wondering: why not just use enable_seqscan=false around those
problematic queries as a "hint"? Unless your query needs some seq scans
as well?
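
Something like this, for instance (illustrative only; the songs table and
artist_id column are made-up names, not DMN's actual schema):

    SET enable_seqscan = off;                      -- steer the planner to the index
    SELECT title FROM songs WHERE artist_id = 42;  -- should now use the index
    SET enable_seqscan = on;                       -- restore normal planning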

By the way, are updates treated the same as selects by the optimizer?

Regards,
Link.
