Skip site navigation (1) Skip section navigation (2)

Postgresql selecting strange index for simple query

From: Maxim Boguk <mboguk(at)masterhost(dot)ru>
To: pgsql-general(at)postgresql(dot)org
Subject: Postgresql selecting strange index for simple query
Date: 2009-02-26 15:40:28
Message-ID: 49A6B7EC.8030406@masterhost.ru (view raw or flat)
Thread:
Lists: pgsql-general
First some details about server:

hh=# select version();
                                            version
----------------------------------------------------------------------------------------------
  PostgreSQL 8.3.5 on x86_64-pc-linux-gnu, compiled by GCC gcc-4.3.real (Debian 4.3.2-1) 4.3.2
(1 row)

hh=# SHOW effective_cache_size;
  effective_cache_size
----------------------
  32GB
(1 row)

hh=# SHOW  random_page_cost;
  random_page_cost
------------------
  1
(1 row)
(random_page_cost set to 1 because whole DB filling in RAM).

Now troublesome query: select count(*) from resume where resume.last_change_time > '2008-09-01 00:00:00'

Table resume have like 70 columns (structure i can post if required) and have 2 index which can be used for completing this query:
"resume_last_change_time_idx" btree (last_change_time) WITH (fillfactor=90)
"resume_user_id_disabled_is_finished_last_change_time_idx" btree (user_id, disabled, is_finished, last_change_time) WITH (fillfactor=90)

And have rows:
hh=# SELECT count(*) from resume;
   count
---------
  5843784
(1 row)



And postgres doing very strange thing selecting second index for this query:

hh=# EXPLAIN ANALYZE select count(*) from resume where resume.last_change_time > '2008-09-01 00:00:00';
                                                                                         QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  Aggregate  (cost=612329.41..612329.42 rows=1 width=0) (actual time=1806.860..1806.861 rows=1 loops=1)
    ->  Index Scan using resume_user_id_disabled_is_finished_last_change_time_idx on resume  (cost=0.00..608254.00 rows=1630162 width=0) (actual 
time=0.020..1648.265 rows=1627437 loops=1)
          Index Cond: (last_change_time > '2008-09-01 00:00:00'::timestamp without time zone)
  Total runtime: 1806.915 ms
(4 rows)

Ofcourse after i drop index resume_user_id_disabled_is_finished_last_change_time_idx postgres start use right index and query become faster:

hh=# drop INDEX resume_user_id_disabled_is_finished_last_change_time_idx;
DROP INDEX
hh=# EXPLAIN ANALYZE select count(*) from resume where resume.last_change_time > '2008-09-01 00:00:00';
                                                                           QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------
  Aggregate  (cost=695294.37..695294.38 rows=1 width=0) (actual time=1659.655..1659.655 rows=1 loops=1)
    ->  Index Scan using resume_last_change_time_idx on resume  (cost=0.00..691218.96 rows=1630162 width=0) (actual time=0.021..1500.817 rows=1627437 
loops=1)
          Index Cond: (last_change_time > '2008-09-01 00:00:00'::timestamp without time zone)
  Total runtime: 1659.715 ms
(4 rows)

(all quieris done after analyze and count(*) on resume and done 2-3 time until runtime become stable).


timing real query show same results (eg query with wrong selected index really slower):

hh=# \timing
Timing is on.
hh=# select count(*) from resume where resume.last_change_time > '2008-09-01 00:00:00';
   count
---------
  1627437
(1 row)
Time: 1677.731 ms
hh=# drop INDEX resume_user_id_disabled_is_finished_last_change_time_idx;
DROP INDEX
Time: 0.452 ms
hh=# select count(*) from resume where resume.last_change_time > '2008-09-01 00:00:00';
   count
---------
  1627437
(1 row)
Time: 1530.906 ms

Somehow postgres think index scan on singlecolumn index slower comparing to scan on 4th field of 4column index.


PS: very strange but clean room test with only 4 column part of table resume show postgres selecting right plan:
(and full rectreated table (with all 70 columns) have same wrong plan)

hh=# SELECT user_id, disabled, is_finished, last_change_time into test_table from resume;
SELECT
hh=# ANALYZE test_table ;
ANALYZE
hh=# CREATE INDEX test_table_last_change_time on test_table(last_change_time) with (fillfactor=90);
CREATE INDEX
hh=# CREATE INDEX test_table_complex_idx on test_table(user_id, disabled, is_finished, last_change_time) with (fillfactor=90);
CREATE INDEX
hh=# SELECT count(*) from test_table;
   count
---------
  5843784
(1 row)
hh=# EXPLAIN ANALYZE SELECT count(*) from test_table where last_change_time > '2008-09-01 00:00:00';
                                                                            QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------
  Aggregate  (cost=72239.92..72239.93 rows=1 width=0) (actual time=1386.038..1386.038 rows=1 loops=1)
    ->  Index Scan using test_table_last_change_time on test_table  (cost=0.00..68193.49 rows=1618568 width=0) (actual time=0.057..1228.817 
rows=1627437 loops=1)
          Index Cond: (last_change_time > '2008-09-01 00:00:00'::timestamp without time zone)
  Total runtime: 1386.074 ms
(4 rows)

Thanks for help.

-- 
SY, Maxim Boguk

Responses

pgsql-general by date

Next:From: Pavel StehuleDate: 2009-02-26 15:41:06
Subject: Re: Format string for ISO-8601 date and time
Previous:From: Daniel VeriteDate: 2009-02-26 15:33:09
Subject: Re: Format string for ISO-8601 date and time

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group