Re: Expression index ignores column statistics target

From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Michael Fuhr <mike(at)fuhr(dot)org>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Expression index ignores column statistics target
Date: 2005-10-01 03:59:26
Message-ID: 200510010359.j913xQs15655@candle.pha.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers


This is expected. The main TODO items is:

* Allow accurate statistics to be collected on indexes with more than
one column or expression indexes, perhaps using per-index statistics

Basically, we don't have multi-column or expression statistics. ANALYZE
just analyzes columns, even if an expression index exists.

---------------------------------------------------------------------------

Michael Fuhr wrote:
> I've noticed that row count estimates for expression indexes appear
> to rely on default_statistics_target rather than on a column's
> actual statistics target. That is, if I use ALTER TABLE SET
> STATISTICS to increase a column's statistics target and then run
> ANALYZE, then estimates for non-expression-index queries improve
> as expected. However, queries that use an expression index remain
> accurate for only around the N most common values, where N is the
> default_statistics_target that was in effect when ANALYZE ran. I'm
> still rummaging through the archives looking for past discussion;
> is this behavior a known limitation or just an oversight?
>
> CREATE TABLE foo (x integer);
>
> CREATE INDEX foo_x_idx ON foo (x);
> CREATE INDEX foo_abs_x_idx ON foo (abs(x));
>
> INSERT INTO foo (x)
> SELECT r1 % r2
> FROM generate_series(1, 100) AS g1(r1),
> generate_series(1, 100) AS g2(r2);
>
> SET default_statistics_target TO 15;
> ALTER TABLE foo ALTER COLUMN x SET STATISTICS 20;
> ANALYZE foo;
>
> SELECT most_common_vals FROM pg_stats WHERE attname = 'x';
> most_common_vals
> -----------------------------------------------------
> {0,1,2,3,4,5,6,7,8,10,9,11,12,14,13,15,16,19,17,18}
> (1 row)
>
> EXPLAIN ANALYZE SELECT * FROM foo WHERE x = 13;
> QUERY PLAN
> ----------------------------------------------------------------------------------------------------------------------
> Bitmap Heap Scan on foo (cost=2.72..50.28 rows=205 width=4) (actual time=0.370..1.766 rows=220 loops=1)
> Recheck Cond: (x = 13)
> -> Bitmap Index Scan on foo_x_idx (cost=0.00..2.72 rows=205 width=0) (actual time=0.314..0.314 rows=220 loops=1)
> Index Cond: (x = 13)
> Total runtime: 2.905 ms
> (5 rows)
>
> EXPLAIN ANALYZE SELECT * FROM foo WHERE abs(x) = 13;
> QUERY PLAN
> --------------------------------------------------------------------------------------------------------------------------
> Bitmap Heap Scan on foo (cost=2.72..50.80 rows=205 width=4) (actual time=0.358..1.720 rows=220 loops=1)
> Recheck Cond: (abs(x) = 13)
> -> Bitmap Index Scan on foo_abs_x_idx (cost=0.00..2.72 rows=205 width=0) (actual time=0.305..0.305 rows=220 loops=1)
> Index Cond: (abs(x) = 13)
> Total runtime: 2.875 ms
> (5 rows)
>
> EXPLAIN ANALYZE SELECT * FROM foo WHERE x = 18;
> QUERY PLAN
> ----------------------------------------------------------------------------------------------------------------------
> Bitmap Heap Scan on foo (cost=2.60..49.75 rows=172 width=4) (actual time=0.312..1.442 rows=180 loops=1)
> Recheck Cond: (x = 18)
> -> Bitmap Index Scan on foo_x_idx (cost=0.00..2.60 rows=172 width=0) (actual time=0.262..0.262 rows=180 loops=1)
> Index Cond: (x = 18)
> Total runtime: 2.393 ms
> (5 rows)
>
> EXPLAIN ANALYZE SELECT * FROM foo WHERE abs(x) = 18;
> QUERY PLAN
> -------------------------------------------------------------------------------------------------------------------------
> Bitmap Heap Scan on foo (cost=2.22..43.65 rows=63 width=4) (actual time=0.313..1.436 rows=180 loops=1)
> Recheck Cond: (abs(x) = 18)
> -> Bitmap Index Scan on foo_abs_x_idx (cost=0.00..2.22 rows=63 width=0) (actual time=0.263..0.263 rows=180 loops=1)
> Index Cond: (abs(x) = 18)
> Total runtime: 2.418 ms
> (5 rows)
>
> --
> Michael Fuhr
>
> ---------------------------(end of broadcast)---------------------------
> TIP 2: Don't 'kill -9' the postmaster
>

--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2005-10-01 04:53:03 Re: Expression index ignores column statistics target
Previous Message Michael Fuhr 2005-10-01 02:15:46 Expression index ignores column statistics target