query very slow when enable_seqscan=on

From: Tomasz Ostrowski <tometzky(at)batory(dot)org(dot)pl>
To: pgsql-bugs(at)postgresql(dot)org
Subject: query very slow when enable_seqscan=on
Date: 2006-07-03 20:31:07
Message-ID: 20060703203105.GA2801@batory.org.pl
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs pgsql-performance

I have a very slow query when enable_seqscan=on and very fast when
enable_seqscan=off. My schema looks like this (relevant columns
only):

create table organizations (
organization_id serial primary key,
organization varchar(200) not null,
organization_location varchar(55) not null
-- and several irrelevant columns
); -- about 9000 records
create table persons (
person_id serial primary key,
surname varchar(50) not null,
forename varchar(35) not null,
organization_id int references organizations,
-- and several irrelevant columns
); -- about 6500 records
create index persons_surname_forename_person_id on persons (
surname,
forename,
lpad(person_id,10,'0')
); -- I was hoping this would speed up array comparisions

The query looking for a position of a person of given person_id in a
list sorted by surname, forename and person_id and filtered by some
criteria. In this example person_id=1, forename~*'to' (about 400
people) and organization_location~*'warszawa' (about 2000
organizations):

select count(*) as position from (select
person_id, surname, forename
from persons
natural left join organizations
where forename~*'to' and organization_location~*'warszawa'
) as person_filter
where array[surname, forename, lpad(person_id,10,'0')]
<
(select array[surname, forename, lpad(person_id,10,'0')]
from persons where person_id=1);

This query take about 30 seconds when enable_seqscan=on and 65
milliseconds when off.

When enable_seqscan=on:
Aggregate (cost=785.72..785.73 rows=1 width=0) (actual time=27948.955..27948.956 rows=1 loops=1)
InitPlan
-> Index Scan using persons_pkey on persons (cost=0.00..3.11 rows=1 width=26) (actual time=0.019..0.019 rows=0 loops=1)
Index Cond: (person_id = 1)
-> Nested Loop (cost=0.00..782.60 rows=1 width=0) (actual time=27948.939..27948.939 rows=0 loops=1)
Join Filter: ("inner".organization_id = "outer".organization_id)
-> Seq Scan on organization (cost=0.00..480.95 rows=1 width=4) (actual time=0.071..69.702 rows=1892 loops=1)
Filter: ((organization_location)::text ~* 'warszawa'::text)
-> Seq Scan on persons (cost=0.00..296.10 rows=444 width=4) (actual time=14.720..14.720 rows=0 loops=1892)
Filter: (((forename)::text ~* 'to'::text) AND (ARRAY[surname, forename, (lpad((person_id)::text, 10, '0'::text))::character varying] < $0))
Total runtime: 27949.106 ms

When enable_seqscan=off:
Aggregate (cost=100001710.26..100001710.27 rows=1 width=0) (actual time=66.788..66.789 rows=1 loops=1)
InitPlan
-> Index Scan using persons_pkey on persons (cost=0.00..3.11 rows=1 width=26) (actual time=0.019..0.019 rows=0 loops=1)
Index Cond: (person_id = 1)
-> Hash Join (cost=100001408.81..100001707.14 rows=1 width=0) (actual time=66.756..66.756 rows=0 loops=1)
Hash Cond: ("outer".organization_id = "inner".organization_id)
-> Seq Scan on persons (cost=100000000.00..100000296.10 rows=444 width=4) (actual time=14.972..14.972 rows=0 loops=1)
Filter: (((forename)::text ~* 'to'::text) AND (ARRAY[surname, forename, (lpad((person_id)::text, 10, '0'::text))::character varying] < $0))
-> Hash (cost=1408.81..1408.81 rows=1 width=4) (actual time=51.763..51.763 rows=1892 loops=1)
-> Index Scan using organizations_pkey on organizations (cost=0.00..1408.81 rows=1 width=4) (actual time=0.049..48.233 rows=1892 loops=1)
Filter: ((organization_location)::text ~* 'warszawa'::text)
Total runtime: 66.933 ms

Database is properly analyzed. postgresql-8.1.4 on Fedora Core 4.

Regards
Tometzky

PS. Actual table and column names are different (they're in Polish)
but I've translated them for better readability for english-speaking.

PS. I wonder if it makes sense to "enable_seqscan=off" for every client
if a database is small enough to fit in OS cache.
--
...although Eating Honey was a very good thing to do, there was a
moment just before you began to eat it which was better than when you
were...
Winnie the Pooh

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Tom Lane 2006-07-03 22:52:59 Re: BUG #2510: ERROR: out of memory DETAIL: Failed on request of size 825242672.
Previous Message Simon Riggs 2006-07-03 20:21:48 Re: BUG #2511: violation of primary key on update with 2

Browse pgsql-performance by date

  From Date Subject
Next Message Simon Riggs 2006-07-03 23:05:42 Re: query very slow when enable_seqscan=on
Previous Message Craig A. James 2006-07-03 17:35:55 Re: Is postgresql ca do the job for software deployed in