Skip site navigation (1) Skip section navigation (2)

Re: select ... where <column>='' does a seqscan [auf Viren

From: Silvio Matthes <silvio(dot)matthes(at)xcom(dot)de>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: select ... where <column>='' does a seqscan [auf Viren
Date: 2004-08-24 16:34:03
Message-ID: OF2F51FA2E.13E90045-ONC1256EFA.0056AEE1-C1256EFA.005B0241@xcom.de (view raw or flat)
Thread:
Lists: pgsql-general
Hello Tom,
>> I try to select the rows of a table where the content of a 
varchar-column 
>> is empty ('') and PostgresQL is doing a seqscan.

>If there are a very large number of rows with param_value='', it's
>entirely possible that using an index to find them is counterproductive.

That's right. I did some research on my database, that's what I found.


select count(*) from document_params;

  count
---------
 7302418

select param_name,count(param_name) from document_params where 
param_value='' group by param_name;

 param_name |  count
------------+---------
 READ_DATE  | 1064944
 ENTAX_NR   |   85853
 KONTO_NR   |    6672
 KUNDEN_NR  |       7


So it's ok not using an index for 

select count(*) from document_params where param_value='';


But 

explain select count(*) from document_params where param_name='KUNDEN_NR' 
and param_value='';

also did a seqscan.

And 

explain select count(*) from document_params where param_name='KONTO_NR' 
and param_value='test';

did an indexscan!


So at the moment it seems to me, that the multicolumn index is not working 
in the expected way.
The ''-value is not the problem. The problem is that there are a million 
rows with ''-value.

But in my opinion with the multicolumn index in mind the server should do 
a index scan, because there are only 7 rows with param_name='KUNDEN_NR' 
and param_value=''?!?

Disabling seqscan gives:

explain select count(*) from document_params where param_name='KUNDEN_NR' 
and param_value='';
                                             QUERY PLAN
-----------------------------------------------------------------------------------------------------
 Aggregate  (cost=639336.57..639336.57 rows=1 width=0)
   ->  Index Scan using dp_idx_6 on document_params  (cost=0.00..638934.84 
rows=160694 width=0)
         Index Cond: (((param_value)::text = ''::text) AND 
((param_name)::text = 'KUNDEN_NR'::text))

Postmaster is using the multicolumn index (param_value,param_name), but 
not in the expected time...


as always, any help would be highly appreciated.


Ciao,

Silvio Matthes








Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> 
Gesendet von: pgsql-general-owner(at)postgresql(dot)org
24.08.2004 17:35

An
Silvio Matthes <silvio(dot)matthes(at)xcom(dot)de>
Kopie
pgsql-general(at)postgresql(dot)org
Thema
Re: [GENERAL] select ... where <column>='' does a seqscan [auf Viren 
geprueft]






Silvio Matthes <silvio(dot)matthes(at)xcom(dot)de> writes:
> I try to select the rows of a table where the content of a 
varchar-column 
> is empty ('') and PostgresQL is doing a seqscan.

If there are a very large number of rows with param_value='', it's
entirely possible that using an index to find them is counterproductive.

                                                 regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 5: Have you checked our extensive FAQ?

               http://www.postgresql.org/docs/faqs/FAQ.html

In response to

Responses

pgsql-general by date

Next:From: Eduardo S. FontanettiDate: 2004-08-24 16:36:18
Subject: Re: [GENERAL] Dump and Restore
Previous:From: Silvio MatthesDate: 2004-08-24 16:33:50
Subject: WG: Re: Re: select ... where <column>='' does a seqscan [auf Viren geprueft]

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group