Re: [HACKERS] Costs: Index vs Non-Index

From: The Hermit Hacker <scrappy(at)hub(dot)org>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: [HACKERS] Costs: Index vs Non-Index
Date: 2000-01-08 18:58:54
Message-ID: Pine.BSF.4.21.0001081457420.18498-100000@thelab.hub.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers


Okay, I had remembered to VACUUM, but I always forget to VACUUM ANALYZE :(
results come out much better now:

udmsearch=> explain select next_index_time from url where next_index_time <= 947317073;
NOTICE: QUERY PLAN:

Seq Scan on url (cost=3368.58 rows=12623 width=4)

EXPLAIN
udmsearch=> select (next_index_time) from url where next_index_time <= 947317073;
next_index_time
---------------
(0 rows)

udmsearch=> create index url_next_index_time on url using btree ( next_index_time);
CREATE
udmsearch=> explain select next_index_time from url where next_index_time <= 947317073;
NOTICE: QUERY PLAN:

Index Scan using url_next_index_time on url (cost=1364.10 rows=12623 width=4)

EXPLAIN

On Sat, 8 Jan 2000, Tom Lane wrote:

> The Hermit Hacker <scrappy(at)hub(dot)org> writes:
> > Hrmmm...if I'm reading this right, its more costly to create an index then
> > to leave it as a sequential scan, but it returns more rows? Yet, it
> > returns, if I do the query with a count() around the return value, 288
> > rows, not 334 or 1154...
>
> This doesn't have anything to do with index vs sequential scan, but it
> does have to do with whether you've done a VACUUM ANALYZE lately.
> You haven't ;-)
>
> > udmsearch=> explain select next_index_time from url where next_index_time <= 947317073;
> > NOTICE: QUERY PLAN:
> > Seq Scan on url (cost=43.00 rows=334 width=4)
>
> IIRC, rows=334 is the default estimate of result rows you will get for
> this query in the absence of any information whatever. (Default table
> size guess is 1000 rows, and default selectivity guess for <= is 1/3,
> so...) If you have not vacuumed, it's sheer coincidence that this is
> even within hailing distance of the correct figure of 288.
>
> > udmsearch=> create index url_next_index_time on url using btree ( next_index_time);
> > CREATE
> > udmsearch=> explain select next_index_time from url where next_index_time <= 947317073;
> > NOTICE: QUERY PLAN:
> > Index Scan using url_next_index_time on url (cost=271.68 rows=1154 width=4)
>
> I believe that a side-effect of CREATE INDEX is to update the
> number-of-pages-and-rows statistics in pg_class for the target table.
> So after you do that, the optimizer has a correct idea of the table's
> size, but still no more info about the selectivity of the WHERE clause.
> (I infer that your table has size 1154*3 rows.) If you now drop the
> index and repeat EXPLAIN, it'll go back to a seq scan, but it will now
> say 1154 rows --- and the cost estimate will be higher, too.
>
> If you do VACUUM ANALYZE, then the optimizer will also know the min and
> max values of next_index_time, and will have some shot at making a
> correct estimate of the output row count. I'd be interested to know
> what it predicts then...
>
> regards, tom lane
>

Marc G. Fournier ICQ#7615664 IRC Nick: Scrappy
Systems Administrator @ hub.org
primary: scrappy(at)hub(dot)org secondary: scrappy(at){freebsd|postgresql}.org

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message The Hermit Hacker 2000-01-08 19:05:12 Re: [HACKERS] Another index "buglet"?
Previous Message Ed Loehr 2000-01-08 18:29:40 Re: [HACKERS] Re: ERROR: out of free buffers: time to abort !