PG trouble with index-usage in sort

From: Andreas Joseph Krogh <andreak(at)officenet(dot)no>
To: pgsql-sql(at)postgresql(dot)org
Subject: PG trouble with index-usage in sort
Date: 2007-11-26 21:08:01
Message-ID: 200711262208.02000.andreak@officenet.no
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

Hi all.

I'm seeing something fishy when trying to self-join two large tables and then
order by one column.

I have the following schema:

CREATE TABLE origo_person(
id SERIAL PRIMARY KEY,
firstname varchar,
lastname varchar,
created timestamp not null default now(),
created_by integer REFERENCES onp_user(id),
onp_user_id integer references onp_user(id)
);

create index origo_person_created_idx on origo_person(created);
create index origo_person_createdby_idx on origo_person(created_by);
create index origo_person_onp_user_id_idx on origo_person(onp_user_id);
create index origo_person_firstname_idx on origo_person(firstname);

insert into onp_user(id) values (1);

copy origo_person (firstname, lastname, created)
from '/home/andreak/simpleperson.sql';

update origo_person set created_by = 1;

update origo_person set onp_user_id = 1 where id = 1;

simpleperson.sql has more than 200K entries in COPY-format:
$ tail -5 /home/andreak/simpleperson.sql
INGOLF KALLEBERG 2007-08-21 22:23:43.571421
SIGRUNN BRUVIK 2007-08-21 22:23:43.571421
ELFRID FROGNER 2007-08-21 22:23:43.571421
GUNNAR KRISTOFFER DOVLAND 2007-08-21 22:23:43.571421
JAN ARNE HAARR 2007-08-21 22:23:43.571421

Now, the two first queries are *fast*, but the 3rd query is slow:
1. Fast:
EXPLAIN ANALYZE SELECT p.firstname, p.lastname FROM origo_person p order by
p.firstname ASC limit 5;
QUERY
PLAN
-----------------------------------------------------------------------------
Limit (cost=0.00..0.55 rows=5 width=17) (actual time=0.031..0.070 rows=5
loops=1)
-> Index Scan using origo_person_firstname_idx on origo_person p
(cost=0.00..22277.13 rows=200827 width=17) (actual time=0.025..0.046 rows=5
loops=1)
Total runtime: 0.128 ms
(3 rows)

2. Fast:
EXPLAIN ANALYZE SELECT p.firstname, p.lastname FROM origo_person p,
origo_person pcb WHERE pcb.onp_user_id = p.created_by limit 5;

QUERY PLAN
-----------------------------------------------------------------------------
Limit (cost=0.00..31526.55 rows=1 width=17) (actual time=0.096..0.170 rows=5
loops=1)
-> Merge Join (cost=0.00..31526.55 rows=1 width=17) (actual
time=0.091..0.145 rows=5 loops=1)
Merge Cond: (p.created_by = pcb.onp_user_id)
-> Index Scan using origo_person_createdby_idx on origo_person p
(cost=0.00..10697.70 rows=200827 width=21) (actual time=0.045..0.057 rows=5
loops=1)
-> Index Scan using origo_person_onp_user_id_idx on origo_person pcb
(cost=0.00..19824.70 rows=200827 width=4) (actual time=0.032..0.044 rows=5
loops=1)
Total runtime: 0.264 ms
(6 rows)

3. Slow:
EXPLAIN ANALYZE SELECT p.firstname, p.lastname FROM origo_person p,
origo_person pcb WHERE pcb.onp_user_id = p.created_by order by p.firstname
ASC limit 5;

QUERY PLAN
-----------------------------------------------------------------------------
Limit (cost=31526.56..31526.56 rows=1 width=17) (actual
time=2573.993..2574.015 rows=5 loops=1)
-> Sort (cost=31526.56..31526.56 rows=1 width=17) (actual
time=2573.987..2573.994 rows=5 loops=1)
Sort Key: p.firstname
Sort Method: top-N heapsort Memory: 17kB
-> Merge Join (cost=0.00..31526.55 rows=1 width=17) (actual
time=0.098..2047.726 rows=200827 loops=1)
Merge Cond: (p.created_by = pcb.onp_user_id)
-> Index Scan using origo_person_createdby_idx on origo_person
p (cost=0.00..10697.70 rows=200827 width=21) (actual time=0.052..428.445
rows=200827 loops=1)
-> Index Scan using origo_person_onp_user_id_idx on
origo_person pcb (cost=0.00..19824.70 rows=200827 width=4) (actual
time=0.031..424.250 rows=200828 loops=1)
Total runtime: 2574.113 ms
(9 rows)

Can anybody point out to me why PG doesn't perform better on the last query?

--
Andreas Joseph Krogh <andreak(at)officenet(dot)no>
Senior Software Developer / Manager
------------------------+---------------------------------------------+
OfficeNet AS | The most difficult thing in the world is to |
Karenslyst Allé 11 | know how to do a thing and to watch |
PO. Box 529 Skøyen | somebody else doing it wrong, without |
0214 Oslo | comment. |
NORWAY | |
Tlf: +47 24 15 38 90 | |
Fax: +47 24 15 38 91 | |
Mobile: +47 909 56 963 | |
------------------------+---------------------------------------------+

Browse pgsql-sql by date

  From Date Subject
Next Message Ehab Galal 2007-11-26 21:17:14 materialize
Previous Message Bart Degryse 2007-11-26 08:13:22 Re: dynmic column names inside trigger?