Re: SourceForge & Postgres

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: mlw <markw(at)mohawksoft(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: SourceForge & Postgres
Date: 2000-12-12 21:27:15
Message-ID: 3490.976656435@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

mlw <markw(at)mohawksoft(dot)com> writes:
> btw anyone trying this query should use: "attdispersion"

Sorry about that --- I just copied-and-pasted the query from some notes
that are obsolete as of 7.1...

> cdinfo=# explain select * from ztitles where artistid = 100000220 ;
> NOTICE: QUERY PLAN:

> Seq Scan on ztitles (cost=0.00..4740.75 rows=3163 width=296)

> And this is with "-o -fs"

> Index Scan using ztitles_artistid_ndx on ztitles (cost=0.00..5915.01
> rows=3163 width=296)

> attname | attdispersion | starelid | staattnum | staop | stanullfrac | stacommonfrac | stacommonval | staloval | stahival
> artistid | 0.0477198 | 19274 | 2 | 97 | 0 | 0.149362 | 100050450 | 100000000 | 100055325

The reason why the thing is going for a sequential scan is that
astonishingly high stacommonfrac statistic. Does artistid 100050450
really account for 14.9% of all the rows in your table? (Who is that
anyway? ;-)) If so, a search for artistid 100050450 definitely *should*
use a sequential scan. The problem at hand is estimating the frequency
of entries for some other artistid, given that we only have this much
statistical info available. Obviously the stats are insufficient, and
I hope to do something about that in a release or two, but it ain't
gonna happen for 7.1. In the meantime, if you've got huge outliers
like that, you could try reducing the value of NOT_MOST_COMMON_RATIO
in src/backend/utils/adt/selfuncs.c.

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Nathan Myers 2000-12-12 21:36:53 Re: Re: COPY BINARY file format proposal
Previous Message Nathan Myers 2000-12-12 21:25:03 Re: RFC C++ Interface