Skip site navigation (1) Skip section navigation (2)

select ... where <column>='' does a seqscan [auf Viren geprueft]

From: Silvio Matthes <silvio(dot)matthes(at)xcom(dot)de>
To: pgsql-general(at)postgresql(dot)org
Subject: select ... where <column>='' does a seqscan [auf Viren geprueft]
Date: 2004-08-24 09:59:45
Message-ID: OFDEF180CE.2A889972-ONC1256EFA.00346639-C1256EFA.0036E916@xcom.de (view raw or flat)
Thread:
Lists: pgsql-general
Hello,

I'm new to the list and did not find a suitable answer to my question so 
here it is:

I try to select the rows of a table where the content of a varchar-column 
is empty ('') and PostgresQL is doing a seqscan.

I've tried this on a PostgresQL-Serverversion 7.3.4 and 8.0 beta1.

To understand my example better, here's the layout of the content of this 
table:

I have a table document. For each document there are a couple of 
parameters in table document_param (name-value-pairs).
The table now holds 7 million rows, so a seqscan is quite expensive.
Now some values are empty (as in this example is the param_value of 
READ_DATE).

     document_param_id     |        document_id        | param_name | 
param_value    |
---------------------------+---------------------------+------------+------------------+
 1010110101000000007482877 | 1010110101000000001090647 | KONTO_NR   | 
1000000000       |
 1010110101000000007482878 | 1010110101000000001090647 | KZ_READ    | N    
      |
 1010110101000000007482879 | 1010110101000000001090647 | READ_DATE  |    |
 1010110101000000007482880 | 1010110101000000001090647 | ENTAX_NR   | 
2000000000000000 |
 1010110101000000007482881 | 1010110101000000001090647 | DOC_SOURCE | 400  
           |
 1010110101000000007482882 | 1010110101000000001090647 | KUNDEN_NR  | 
1000000          |


I want to get all rows where for example the 'KONTO_NR' is empty.
I tried this with: 

explain select * from document_params where param_name='KONTO_NR' and 
param_value='';

this produced:

                                        QUERY PLAN
------------------------------------------------------------------------------------------
 Seq Scan on document_params  (cost=0.00..241600.27 rows=152073 width=95)
   Filter: (((param_name)::text = 'KONTO_NR'::text) AND 
((param_value)::text = ''::text))
(2 rows)

There's a multicolumn index (param_value,param_name) on document_params. 
To be sure the index works I changed the select to 

explain select * from document_params where param_name='KONTO_NR' and 
param_value=' ';

(looking for a space in param_value), and viola, the index is used. But 
not if I'm looking for ''.

Following a reply to an similar question where NULL-values where wanted, I 
made a partial index:

create index idx_empty on document_params(param_name) where 
param_value='';

But PostgresQL does not use it.
When I disable seqscan (set enable_seqscan=false), an explain returns 
this:

                                         QUERY PLAN
--------------------------------------------------------------------------------------------
 Index Scan using idx_empty on document_params  (cost=0.00..591783.84 
rows=152073 width=95)
   Index Cond: ((param_name)::text = 'KONTO_NR'::text)
   Filter: ((param_value)::text = ''::text)

So using the index does need more time than a sequential scan?


How can I get the rows with empty values from the table without doing a 
seqscan?

Any help would be highly appreciated.

Ciao,

Silvio Matthes

Responses

pgsql-general by date

Next:From: Richard HuxtonDate: 2004-08-24 10:04:48
Subject: Re: pg_dump in stand alone backend
Previous:From: David Suela FernándezDate: 2004-08-24 09:34:30
Subject: problems with pg_dump

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group