Index Scan Backward vs. Sort/Sequential Scan when using ORDER BY

From: Keith Bussey <kbussey(at)wisol(dot)com>
To: pgsql-sql(at)postgresql(dot)org
Subject: Index Scan Backward vs. Sort/Sequential Scan when using ORDER BY
Date: 2001-08-30 16:38:48
Message-ID: 01083012294300.06825@cronus.wisol.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

Greetings,

I have stumbled upon a confusing aspect of PostgreSQL queries involving ORDER
BY.

In trying to figure out just why my ORDER BY queries were so slow, I came
across something interesting.

First, let me give you 2 very similar queries:

1) SELECT p.uid
FROM client_profiles p
INNER JOIN client_profiles_2 c
USING(uid)
WHERE (p.profiles_gender='M')
AND (p.profiles_orientation[2] = 'F' OR p.profiles_orientation[1]='M')
ORDER BY c.profiles_2_last_update DESC
LIMIT 5;

2) SELECT p.uid
FROM client_profiles p
INNER JOIN client_profiles_2 c
USING(uid)
WHERE (p.profiles_gender='F')
AND (p.profiles_orientation[2] = 'F' OR p.profiles_orientation[1]='M')
ORDER BY c.profiles_2_last_update DESC
LIMIT 5;

The only difference is in #1, p.profiles_gender='M' while in #2
p.profiles_gender='F'.

SELECT count(uid)
FROM client_profiles
WHERE profiles_gender='M';
----------------------
408526

SELECT count(uid)
FROM client_profiles
WHERE profiles_gender='F';
----------------------
54713

Here are the EXPLAINs:

1) EXPLAIN #1:
Limit (cost=0.00..1763.83 rows=5 width=24)
-> Nested Loop (cost=0.00..2203068.58 rows=6245 width=24)
-> Index Scan Backward using index_client_profiles_2_last_up on
client_profiles_2 c (cost=0.00..239553.52 rows=394263 width=16)
-> Index Scan using client_profiles_pkey on client_profiles p
(cost=0.00..4.97 rows=1 width=8)

2) EXPLAIN #2:
Limit (cost=36046.44..36046.44 rows=5 width=24)
-> Sort (cost=36046.44..36046.44 rows=160 width=24)
-> Nested Loop (cost=0.00..36040.58 rows=160 width=24)
-> Index Scan using index_client_profiles_gender on
client_profiles p (cost=0.00..35064.98 rows=198 width=8)
-> Index Scan using client_profiles_2_pkey on
client_profiles_2 c (cost=0.00..4.91 rows=1 width=16)

Now the only reason I can see to explain this is because there are many more
p.profiles_gender='M' than p.profiles_gender='F', Postgres knows its faster
to do a Index Scan Backward than a Sort/Sequential Scan (and trust me, it
is!). However it thinks the opposite when I am searching for
p.profiles_gender='F' and form my results, that just isn't true.

Does anyone have any insight as to how I can 'force' postgres to use Index
Backward Scan for #2???

Or, perhaps another method of making my ORDER BY faster ??

Your help would be greatly appreciated, Thanks

--
Keith Bussey
kbussey(at)wisol(dot)com
Programmer - WISOL.com
(514) 398-9994 ext. 225

Responses

Browse pgsql-sql by date

  From Date Subject
Next Message Josh Berkus 2001-08-30 16:56:43 Re: Problem with sequences
Previous Message Giorgio Volpe 2001-08-30 13:35:51 Apache authentication with debian linux