WG: Re: Re: select ... where <column>='' does a seqscan [auf Viren geprueft]

From: Silvio Matthes <silvio(dot)matthes(at)xcom(dot)de>
To: pgsql-general(at)postgresql(dot)org
Subject: WG: Re: Re: select ... where <column>='' does a seqscan [auf Viren geprueft]
Date: 2004-08-24 16:33:50
Message-ID: OF44F87C3C.9D34FDA4-ONC1256EFA.005AF6EE-C1256EFA.005AFD20@xcom.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

>Am Dienstag, 24. August 2004 11:59 schrieb Silvio Matthes:
>> So using the index does need more time than a sequential scan?

>It's possible. If you want to prove the opposite, please post the output
of
>EXPLAIN ANALYZE in both cases.

On my system, with PostgresQL 8.0Beta1, I could prove the opposite, but
that's not the point for me.
Both cases are too slow:

set enable_seqscan=true
explain analyze select count(*) from document_params where
param_name='KONTO_NR' and param_value='';

QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=241980.45..241980.45 rows=1 width=0) (actual
time=47766.000..47766.000 rows=1 loops=1)
-> Seq Scan on document_params (cost=0.00..241600.27 rows=152073
width=0) (actual time=0.000..47719.000 rows=6672 loops=1)
Filter: (((param_name)::text = 'KONTO_NR'::text) AND
((param_value)::text = ''::text))
Total runtime: 47938.000 ms
(4 rows)

set enable_seqscan=false
explain analyze select count(*) from document_params where
param_name='KONTO_NR' and param_value='';

QUERY
PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=592164.03..592164.03 rows=1 width=0) (actual
time=31828.000..31828.000 rows=1 loops=1)
-> Index Scan using idx_empty on document_params (cost=0.00..591783.84
rows=152073 width=0) (actual time=140.000..31782.000 rows=6672 loops=1)
Index Cond: ((param_name)::text = 'KONTO_NR'::text)
Filter: ((param_value)::text = ''::text)
Total runtime: 31828.000 ms
(5 rows)

In other tests the server cached the responses so I cannot compare the
results, in other cases the server did need some 20s for an index-scan,
probably due to too low configured memory.

This brings me to the following questions:

1.) What can I do to avoid seqscans if I want rows with empty param_value?

2.) Would it be wiser to change all the empty values to some placeholder
(e.g. 'empty')?
From the tests made up to now, this would be the fastest solution, but
wouldn't be very nice in concern of readability...

a bit offtopic, but related for me:
3.) how can I prevent the server from caching or empty the cache under
win32 (stopping and re-starting the service doesn't help) for comparing
similar statements?
4.) how can I change the memory settings for the postmaster (v8.0beta) in
win32?

As always, any help would be highly appreciated,

Ciao,

Silvio Matthes

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Silvio Matthes 2004-08-24 16:34:03 Re: select ... where <column>='' does a seqscan [auf Viren
Previous Message Tom Lane 2004-08-24 16:30:57 Re: Invalid input for integer on VIEW