Re: Indexes not always used after inserts/updates/vacuum

From: Reinhard Max <max(at)suse(dot)de>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Michael G(dot) Martin" <michael(at)vpmonline(dot)com>, <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: Indexes not always used after inserts/updates/vacuum
Date: 2002-02-28 16:43:39
Message-ID: Pine.LNX.4.44.0202281634290.17044-100000@Wotan.suse.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

On Thu, 28 Feb 2002 at 10:15, Tom Lane wrote:

> Okay. It looks like foo.id has a pretty strong but not perfect
> descending order (the correlation statistic is -0.563276). The
> planner is evidently not rating that effect strongly enough.

Yes, that seems to be the reason. When I try

SELECT * into foo2 from foo order by id;
CREATE index foo2_id on foo2(id);
VACUUM ANALYZE foo2;

and repeat the join with foo2 instead of foo, index scans are used
even when seqscans are not forbidden.

> [...]
> It might be interesting to replace csquared with just
> fabs(indexCorrelation) to see if the results are better. Also, if you
> cared to step through the code with a debugger or add some printout
> statements, we could learn what the min and max costs are that it's
> interpolating between; that'd be interesting to know as well.

OK, this is what I've changed:

- csquared = indexCorrelation * indexCorrelation;
+ elog(NOTICE, "min_IO_cost = %f, max_IO_cost = %f, indexCorrelation = %f",
+ min_IO_cost, max_IO_cost, indexCorrelation);
+ csquared = fabs (indexCorrelation);

Are these the addtional values you wanted to see?

These are the results:

max=# EXPLAIN analyze SELECT count(foo.id)
FROM foo, bar WHERE foo.id = bar.ref2foo;

NOTICE: min_IO_cost = 299.000000, max_IO_cost = 1196.000000
indexCorrelation = -1.000000
NOTICE: min_IO_cost = 1.000000, max_IO_cost = 3.993322
indexCorrelation = -1.000000
NOTICE: min_IO_cost = 5880.000000, max_IO_cost = 1169154.985307
indexCorrelation = -0.532557
NOTICE: min_IO_cost = 1.000000, max_IO_cost = 3.999660
indexCorrelation = -0.532557
NOTICE: QUERY PLAN:

Aggregate (cost=18709.65..18709.65 rows=1 width=8)
(actual time=7229.15..7229.15 rows=1 loops=1)
-> Hash Join (cost=911.39..18613.58 rows=38431 width=8)
(actual time=208.23..7184.68 rows=38431 loops=1)
-> Seq Scan on foo
(cost=0.00..9400.72 rows=352072 width=4)
(actual time=0.02..810.92 rows=352072 loops=1)
-> Hash (cost=683.31..683.31 rows=38431 width=4)
(actual time=149.87..149.87 rows=0 loops=1)
-> Seq Scan on bar
(cost=0.00..683.31 rows=38431 width=4)
(actual time=0.02..83.32 rows=38431 loops=1)
Total runtime: 7229.29 msec

EXPLAIN
max=# EXPLAIN analyze SELECT count(foo2.id)
FROM foo2, bar WHERE foo2.id = bar.ref2foo;

NOTICE: min_IO_cost = 299.000000, max_IO_cost = 1196.000000
indexCorrelation = -1.000000
NOTICE: min_IO_cost = 1.000000, max_IO_cost = 3.993322
indexCorrelation = -1.000000
NOTICE: min_IO_cost = 5741.000000, max_IO_cost = 1163366.000920
indexCorrelation = 1.000000
NOTICE: min_IO_cost = 1.000000, max_IO_cost = 3.999652
indexCorrelation = 1.000000
NOTICE: QUERY PLAN:

Aggregate (cost=12748.26..12748.26 rows=1 width=8)
(actual time=687.08..687.08 rows=1 loops=1)
-> Merge Join (cost=0.00..12652.18 rows=38431 width=8)
(actual time=0.44..633.53 rows=38431 loops=1)
-> Index Scan using foo2_pkey on foo2
(cost=0.00..10387.79 rows=352072 width=4)
(actual time=0.26..174.32 rows=38432 loops=1)
-> Index Scan using idx_bar_ref2foo on bar
(cost=0.00..807.74 rows=38431 width=4)
(actual time=0.17..180.34 rows=38431 loops=1)
Total runtime: 687.31 msec

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Steve Wranovsky 2002-02-28 18:52:47 Incorrect PostgreSQL 7.2 ODBC RPM for RedHat 6.2
Previous Message free 2002-02-28 15:55:48 RSERV AND POSTGRESQL 7.2