Skip site navigation (1) Skip section navigation (2)

Use of BETWEEN with identical values

From: André Volpato <andre(dot)volpato(at)ecomtecnologia(dot)com(dot)br>
To: pgsql-performance(at)postgresql(dot)org
Subject: Use of BETWEEN with identical values
Date: 2009-09-17 14:53:43
Message-ID: 4AB24D77.30908@ecomtecnologia.com.br (view raw or flat)
Thread:
Lists: pgsql-performance
Hi,

Is there any downsides of using BETWEEN two identical values ?

(Postgres 8.3.6, Debian Linux 2.6.18-6-amd64)


The problem is in this index:
CREATE INDEX ibds_contratacao_fatura1
  ON bds_contratacao_fatura  USING btree  (fat_referencia);

"fat_referencia" is a field that tells year and month, and the 
conditions are generated by an external application.
This field is not PK, and repeats only 29 times:
# select count(distinct(fat_referencia)) from bds_contratacao_fatura;
 count
-------
    29

Below is the result of explians, and the index conditions:

Condition 1:
# select fat_referencia from bds_contratacao_fatura where fat_referencia 
BETWEEN 200908 AND 200908;
 Index Scan using ibds_contratacao_fatura1 on bds_contratacao_fatura  
(cost=0.00..5.64 rows=1 width=4) (actual time=0.023..79.952 rows=163689 
loops=1)
   Index Cond: ((fat_referencia >= 200908) AND (fat_referencia <= 200908))
 Total runtime: 110.470 ms

Condition 2:
# select fat_referencia from bds_contratacao_fatura where fat_referencia 
BETWEEN 200906 AND 200908;
Index Scan using ibds_contratacao_fatura1 on bds_contratacao_fatura  
(cost=0.00..14773.88 rows=414113 width=4) (actual time=8.450..653.882 
rows=496723 loops=1)
   Index Cond: ((fat_referencia >= 200906) AND (fat_referencia <= 200908))
 Total runtime: 748.314 ms

Condition 3:
# select fat_referencia from bds_contratacao_fatura where fat_referencia 
= 200908;
Index Scan using ibds_contratacao_fatura1 on bds_contratacao_fatura  
(cost=0.00..4745.07 rows=142940 width=4) (actual time=0.022..77.818 
rows=163689 loops=1)
   Index Cond: (fat_referencia = 200908)
 Total runtime: 108.292 ms



I expect Postgres would give me the same plan in conditions 1 and 3.
In condition 2, the plan seems ok and well estimated.
The solution per now is change the application to use "BETWEEN" olny 
when year and month are not the same.

How can condition 1 be so badly estimated?

-- 

[]´s,

André Volpato




Responses

pgsql-performance by date

Next:From: हृषीकेश मेहेंदळ <hashinclude@gmail.com>Date: 2009-09-17 14:54:39
Subject: Re: optimizing for temporal data behind a view
Previous:From: Andy ColsonDate: 2009-09-17 13:45:43
Subject: Re: Possible causes of sometimes slow single-row UPDATE with trivial indexed condition?

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group