Re: About b-tree usage

From: Jeff Davis <jdavis-pgsql(at)empires(dot)org>
To: Ioannis Theoharis <theohari(at)ics(dot)forth(dot)gr>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: About b-tree usage
Date: 2005-03-07 19:04:26
Message-ID: 1110222266.4089.423.camel@jeff
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers


In that case, sequential scan is faster, but perhaps the planner doesn't
know that ahead of time. Try turning on more statistics if you haven't
already, and then run ANALYZE again. If the planner sees a range,
perhaps it assumes that it is a highly selective range, when in fact, it
consists of all of the tuples. Also, make sure enable_seqscan is true
(in case you turned it off for testing or something and forgot).

A seqscan is usually faster when a large proportion of the tuples are
returned because:
(1) It uses sequential I/O; whereas an index might access tuples in a
random order.
(2) It doesn't have to read the index's disk pages at all.

I suspect you don't need to return all the tuples in the table. If you
include the details of a real scenario perhaps the people on the list
could be more helpful.

Regards,
Jeff Davis

On Mon, 2005-03-07 at 20:43 +0200, Ioannis Theoharis wrote:
> > If there are many identical values in att0, are you sure a sequential
> > scan isn't more efficient? Also, are you sure the index isn't working
> > well? It seems to me since you have the table clustered, it might be
> > fairly efficient as-is (it would get a huge benefit from the spatial
> > locality of the tuples in the table). Index size alone shouldn't destroy
> > your performance, since the idea of an index lookup is that it only has
> > to read O(log n) pages from the disk per lookup.
>
> In the next example, have in mind that:
> select relname, relpages, reltuples from pg_class;
>
> relname | relpages | reltuples
> --------------------------------+----------+-------------
> ...
> tc2000000000 | 142858 | 1.00001e+06
> inst_id_idx | 2745 | 1e+06
> ...
>
> and that i run postgresql, on a UltraSPARC[tm] III 600MHz, ram: 512MB
> OS : sol 9
>
> att0: varchar(1000)
> att1: int4
> and that 0<=att1>=900000000 for every tuple of tabe and index.
>
> query:
> select att0 from tc2000000000 where att1=900000000 AND att1>=0
>
> plan:
> Index Scan using inst_id_idx on tc2000000000 (cost=0.00..161603.06
> rows=1000006 width=1004) (actual time=41.21..101917.36 rows=1000000 loops=1)
> Index Cond: ((att1 <= 900000000) AND (att1 >= 0))
> Total runtime: 103135.03 msec
>
>
> query:
> select att0 from tc2000000000
>
> plan:
> Seq Scan on tc2000000000 (cost=100000000.00..100152858.06 rows=1000006
> width=1004) (actual time=0.21..42584.87 rows=1000000 loops=1)
> Total runtime: 43770.73 msec
>
> Can you explain me this big difference? Perhaps postgresql caches in
> memory a big part (or the whole) of index?
>
> And by the way why postgresql doesn't select sequential scan? (I have
> done vacuum analyze).
>
> ---------------------------(end of broadcast)---------------------------
> TIP 7: don't forget to increase your free space map settings

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2005-03-07 19:09:11 Re: About b-tree usage
Previous Message Jim Buttafuoco 2005-03-07 19:04:20 Re: Recording vacuum/analyze/dump times