Re: Sort and index

From: "Jim C(dot) Nasby" <decibel(at)decibel(dot)org>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Dave Held <dave(dot)held(at)arrayservicesgrp(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: Sort and index
Date: 2005-04-20 03:40:41
Message-ID: 20050420034041.GC58835@decibel.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Tue, Apr 19, 2005 at 11:01:26PM -0400, Tom Lane wrote:
> "Jim C. Nasby" <decibel(at)decibel(dot)org> writes:
> > Actually, the planner (at least in 7.4) isn't smart enough to consider
> > if the sort would fit in memory or not.
>
> Really? Have you read cost_sort()?
>
> It's certainly possible that the calculation is all wet, but to claim
> that the issue is not considered is just wrong.

To be fair, no, I haven't looked at the code. This is based strictly on
anecdotal evidence on a 120M row table. I'm currently running a test to
see how an index scan compares to a seqscan. I also got the same results
when I added a where clause that would restrict it to about 7% of the
table.

Actually, after running some tests (below), the plan cost does change
when I change sort_mem (it was originally 50000).

stats=# \d email_contrib
Table "public.email_contrib"
Column | Type | Modifiers
------------+---------+-----------
project_id | integer | not null
id | integer | not null
date | date | not null
team_id | integer |
work_units | bigint | not null
Indexes:
"email_contrib_pkey" primary key, btree (project_id, id, date)
"email_contrib__pk24" btree (id, date) WHERE (project_id = 24)
"email_contrib__pk25" btree (id, date) WHERE (project_id = 25)
"email_contrib__pk8" btree (id, date) WHERE (project_id = 8)
"email_contrib__project_date" btree (project_id, date)
Foreign-key constraints:
"fk_email_contrib__id" FOREIGN KEY (id) REFERENCES stats_participant(id) ON UPDATE CASCADE
"fk_email_contrib__team_id" FOREIGN KEY (team_id) REFERENCES stats_team(team) ON UPDATE CASCADE

stats=# explain select * from email_contrib where project_id=8 order by project_id, id, date;
QUERY PLAN
--------------------------------------------------------------------------------
Sort (cost=3613476.05..3635631.71 rows=8862263 width=24)
Sort Key: project_id, id, date
-> Seq Scan on email_contrib (cost=0.00..2471377.50 rows=8862263 width=24)
Filter: (project_id = 8)
(4 rows)

stats=# explain select * from email_contrib order by project_id, id, date;
QUERY PLAN
----------------------------------------------------------------------------------
Sort (cost=25046060.83..25373484.33 rows=130969400 width=24)
Sort Key: project_id, id, date
-> Seq Scan on email_contrib (cost=0.00..2143954.00 rows=130969400 width=24)
(3 rows)

stats=# select 8862263::float/130969400;
?column?
--------------------
0.0676666687027657
(1 row)

stats=# explain select * from email_contrib where project_id=8 order by project_id, id, date;
QUERY PLAN
-----------------------------------------------------------------------------------------------------
Index Scan using email_contrib_pkey on email_contrib (cost=0.00..6832005.57 rows=8862263 width=24)
Index Cond: (project_id = 8)
(2 rows)

stats=# explain select * from email_contrib order by project_id, id, date;
QUERY PLAN
---------------------------------------------------------------------------------------------------------
Index Scan using email_contrib_pkey on email_contrib (cost=0.00..100055905.62 rows=130969400 width=24)
(1 row)

stats=# set enable_seqscan=on;
SET
stats=# set sort_mem=1000;
SET
stats=# explain select * from email_contrib order by project_id, id, date;
QUERY PLAN
----------------------------------------------------------------------------------
Sort (cost=28542316.63..28869740.13 rows=130969400 width=24)
Sort Key: project_id, id, date
-> Seq Scan on email_contrib (cost=0.00..2143954.00 rows=130969400 width=24)
(3 rows)

stats=#

--
Jim C. Nasby, Database Consultant decibel(at)decibel(dot)org
Give your computer some brain candy! www.distributed.net Team #1828

Windows: "Where do you want to go today?"
Linux: "Where do you want to go tomorrow?"
FreeBSD: "Are you guys coming, or what?"

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Greg Stark 2005-04-20 04:34:27 Re: Slow copy with little CPU/disk usage
Previous Message Josh Berkus 2005-04-20 03:07:33 Re: What to do with 6 disks?