Use of BETWEEN with identical values

From: André Volpato <andre(dot)volpato(at)ecomtecnologia(dot)com(dot)br>
To: pgsql-performance(at)postgresql(dot)org
Subject: Use of BETWEEN with identical values
Date: 2009-09-17 14:53:43
Message-ID: 4AB24D77.30908@ecomtecnologia.com.br
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Hi,

Is there any downsides of using BETWEEN two identical values ?

(Postgres 8.3.6, Debian Linux 2.6.18-6-amd64)

The problem is in this index:
CREATE INDEX ibds_contratacao_fatura1
ON bds_contratacao_fatura USING btree (fat_referencia);

"fat_referencia" is a field that tells year and month, and the
conditions are generated by an external application.
This field is not PK, and repeats only 29 times:
# select count(distinct(fat_referencia)) from bds_contratacao_fatura;
count
-------
29

Below is the result of explians, and the index conditions:

Condition 1:
# select fat_referencia from bds_contratacao_fatura where fat_referencia
BETWEEN 200908 AND 200908;
Index Scan using ibds_contratacao_fatura1 on bds_contratacao_fatura
(cost=0.00..5.64 rows=1 width=4) (actual time=0.023..79.952 rows=163689
loops=1)
Index Cond: ((fat_referencia >= 200908) AND (fat_referencia <= 200908))
Total runtime: 110.470 ms

Condition 2:
# select fat_referencia from bds_contratacao_fatura where fat_referencia
BETWEEN 200906 AND 200908;
Index Scan using ibds_contratacao_fatura1 on bds_contratacao_fatura
(cost=0.00..14773.88 rows=414113 width=4) (actual time=8.450..653.882
rows=496723 loops=1)
Index Cond: ((fat_referencia >= 200906) AND (fat_referencia <= 200908))
Total runtime: 748.314 ms

Condition 3:
# select fat_referencia from bds_contratacao_fatura where fat_referencia
= 200908;
Index Scan using ibds_contratacao_fatura1 on bds_contratacao_fatura
(cost=0.00..4745.07 rows=142940 width=4) (actual time=0.022..77.818
rows=163689 loops=1)
Index Cond: (fat_referencia = 200908)
Total runtime: 108.292 ms

I expect Postgres would give me the same plan in conditions 1 and 3.
In condition 2, the plan seems ok and well estimated.
The solution per now is change the application to use "BETWEEN" olny
when year and month are not the same.

How can condition 1 be so badly estimated?

--

[]´s,

André Volpato

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Hrishikesh (हृषीकेश मेहेंदळे) 2009-09-17 14:54:39 Re: optimizing for temporal data behind a view
Previous Message Andy Colson 2009-09-17 13:45:43 Re: Possible causes of sometimes slow single-row UPDATE with trivial indexed condition?