Postgres not willing to use an index?

From: Mario Splivalo <mario(dot)splivalo(at)megafon(dot)hr>
To: pgsql-performance(at)postgresql(dot)org
Subject: Postgres not willing to use an index?
Date: 2009-02-06 15:43:52
Message-ID: 498C5AB8.7060603@megafon.hr
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

I have a table, like this:

CREATE TABLE transactions
(
transaction_id integer NOT NULL DEFAULT
nextval('transactions_seq'::regclass),
transaction_type integer NOT NULL,
transaction_client_id integer NOT NULL,
transaction_destination_id integer NOT NULL,
transaction_operator_id integer NOT NULL,
transaction_application_id integer NOT NULL,
transaction_application_service character varying NOT NULL,
transaction_quantity integer NOT NULL,
transaction_time_commit timestamp with time zone NOT NULL,
transaction_time_received timestamp with time zone NOT NULL,
transaction_gateway_id character(36) NOT NULL,
transaction_payment_amount integer NOT NULL DEFAULT 0,
CONSTRAINT transactions_pk PRIMARY KEY (transaction_id),
CONSTRAINT transactions_uq__gateway_id UNIQUE (transaction_gateway_id)
)
WITH (OIDS=FALSE);

Now, all the _type, client_id, destination_id, operator_id, and
application_id are foreigen-keyed to coresponding tables. There are no
indices on those columns.

Besides PK and uq-constraint indices I have this index:

CREATE INDEX transactions_idx__client_data ON transactions
USING btree (transaction_client_id, transaction_destination_id,
transaction_operator_id, transaction_application_id,
transaction_time_commit)

The table_count is like this:

jura=# select count(*) from transactions;
count
----------
13751457
(1 row)

There are roughly 500.000 - 600.000 transactions for each month. There
are also transactions from past two years in the table.

I often SELECT data from the table for specified time period - usualy
from begining to the end of the month, like this:

SELECT <some-columns> FROM transactions WHERE transaction_time_commit
BETWEEN '2009-01-01' AND '2009-01-31 23:59:59';

The problem is that postgres is never using an index:

jura=# explain analyze select * from transactions where
transaction_time_commit between '2009-01-01' and '2009-01-31 23:59:59';

QUERY PLAN

---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Seq Scan on transactions (cost=0.00..416865.85 rows=593713 width=91)
(actual time=4.067..3918.629 rows=525051 loops=1)
Filter: ((transaction_time_commit >= '2009-01-01
00:00:00+01'::timestamp with time zone) AND (transaction_time_commit <=
'2009-01-31 23:59:59+01'::timestamp with time zone))
Total runtime: 4026.404 ms
(3 rows)

Time: 4068.521 ms

If I force it not to use sequential scans, it is using index, with
benefits of shorter execution time:
jura=# set enable_seqscan to false;
SET
Time: 0.103 ms
jura=# explain analyze select * from transactions where
transaction_time_commit between '2009-01-01' and '2009-01-31 23:59:59';

QUERY PLAN

-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on transactions (cost=410369.98..629869.67
rows=593713 width=91) (actual time=1060.569..1280.500 rows=525051 loops=1)
Recheck Cond: ((transaction_time_commit >= '2009-01-01
00:00:00+01'::timestamp with time zone) AND (transaction_time_commit <=
'2009-01-31 23:59:59+01'::timestamp with time zone))
-> Bitmap Index Scan on transactions_idx__client_data
(cost=0.00..410221.55 rows=593713 width=0) (actual
time=1058.992..1058.992 rows=525051 loops=1)
Index Cond: ((transaction_time_commit >= '2009-01-01
00:00:00+01'::timestamp with time zone) AND (transaction_time_commit <=
'2009-01-31 23:59:59+01'::timestamp with time zone))
Total runtime: 1388.882 ms
(5 rows)

Time: 1396.737 ms

Now, I found interesting is that if I create index just on
transaction_time_commit column (and I leave
transactions_idx__client_data index), then postgres is using that new index.

Also, if I change idx__client_data index like this (first I drop it, and
then I create new one):

CREATE INDEX transactions_idx__client_data ON transactions
USING btree (transaction_client_id, transaction_destination_id,
transaction_time_commit);

then postgres is using that index:

jura=# explain analyze select * from transactions where
transaction_time_commit between '2009-01-01' and '2009-01-31 23:59:59';

QUERY PLAN

-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on transactions (cost=349473.37..568973.06
rows=593713 width=91) (actual time=949.224..1128.848 rows=525051 loops=1)
Recheck Cond: ((transaction_time_commit >= '2009-01-01
00:00:00+01'::timestamp with time zone) AND (transaction_time_commit <=
'2009-01-31 23:59:59+01'::timestamp with time zone))
-> Bitmap Index Scan on transactions_idx__client_data
(cost=0.00..349324.94 rows=593713 width=0) (actual time=947.678..947.678
rows=525051 loops=1)
Index Cond: ((transaction_time_commit >= '2009-01-01
00:00:00+01'::timestamp with time zone) AND (transaction_time_commit <=
'2009-01-31 23:59:59+01'::timestamp with time zone))
Total runtime: 1234.989 ms
(5 rows)

Time: 1235.727 ms

Now, I have many 'selects' on the transactions table (still, not as many
as inserts), mostly filtered on transaction_time, client_id,
destination_id and application_id, but there is fair amount of 'selects'
filtered only on transaction_time.

Now, shall I keep the original index and add another one on just
transaction_time (there is, I guess, overhead of maintaining two
indices), or shall I remove transaction_time from original index, and
create another one?

And, is it normal for postgres to 'ignore' the transaction_time column
in original index?

This is the postgres version I'm using:
jura=# select version();
version

----------------------------------------------------------------------------------------------------------------
PostgreSQL 8.3.5 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 4.1.2
20061115 (prerelease) (Debian 4.1.1-21)
(1 row)

Mike

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Grzegorz Jaśkiewicz 2009-02-06 15:55:03 Re: Postgres not willing to use an index?
Previous Message Bruce Momjian 2009-02-06 15:27:13 Re: suggestions for postgresql setup on Dell 2950 , PERC6i controller