Re: Sorting Improvements for 8.4

From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: Mark Mielke <mark(at)mark(dot)mielke(dot)cc>
Cc: Michał Zaborowski <michal(dot)zaborowski(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Ron Mayer <rm_pg(at)cheapcomplexdevices(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Sorting Improvements for 8.4
Date: 2007-12-19 23:03:16
Message-ID: 1198105396.10057.23.camel@dogma.ljc.laika.com
Lists: pgsql-hackers

On Wed, 2007-12-19 at 15:51 -0500, Mark Mielke wrote:
> That sounds possible, but I still feel myself suspecting that disk
> reads will be much slower than localized text comparison. Perhaps I am
> overestimating the performance of the comparison function?

I think this simple test will change your perceptions:

Do an initdb with --locale="en_US.UTF-8" and start postgres.

test=> create table sorter(t text, b bytea, f float);
CREATE TABLE
test=> insert into sorter select r AS rt, r::text::bytea AS rb, r AS rf
from (select random() as r from generate_series(1,1000000)) a;
INSERT 0 1000000
test=> select pg_size_pretty(pg_total_relation_size('sorter'));
 pg_size_pretty
----------------
 70 MB
(1 row)

test=> explain analyze select * from sorter order by t;
test=> explain analyze select * from sorter order by b;
test=> explain analyze select * from sorter order by f;

On my machine this table fits easily in memory (so there aren't any disk
reads at all). Sorting takes 7 seconds for floats, 9 seconds for binary
data, and 20 seconds for localized text. That's much longer than it
would take to read that data from disk, since it's only 70MB (which
takes a fraction of a second on my machine).
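
For anyone without a server handy, here is a rough sketch of the same experiment outside the database. It is not how PostgreSQL sorts; it only assumes that the localized text sort spends its extra time in a strcoll()-style comparison, and it runs all three sorts through a Python comparator so the interpreter overhead is about the same for each. The names set_collation, plain_cmp and time_sorts are made up for this illustration, and the fallback locale is whatever the host provides if en_US.UTF-8 is not installed.

```python
import functools
import locale
import random
import time


def set_collation(preferred="en_US.UTF-8"):
    """Try the locale used in the message; fall back to the host default, then C."""
    for name in (preferred, ""):
        try:
            return locale.setlocale(locale.LC_COLLATE, name)
        except locale.Error:
            continue
    return locale.setlocale(locale.LC_COLLATE, "C")


def plain_cmp(a, b):
    """Three-way comparison using the type's native ordering (floats, bytes)."""
    return (a > b) - (a < b)


def time_sorts(n=200_000, seed=0):
    """Sort the same n random values as floats, bytes, and collated text.

    Returns a dict of wall-clock seconds per representation. The absolute
    numbers include Python overhead; only the relative gap is of interest.
    """
    rng = random.Random(seed)
    floats = [rng.random() for _ in range(n)]
    texts = [repr(f) for f in floats]            # same values, as text
    blobs = [t.encode("utf-8") for t in texts]   # same values, as raw bytes

    candidates = {
        "float": (floats, plain_cmp),
        "bytes": (blobs, plain_cmp),
        "text": (texts, locale.strcoll),         # locale-aware comparison
    }
    timings = {}
    for label, (values, cmp) in candidates.items():
        start = time.perf_counter()
        sorted(values, key=functools.cmp_to_key(cmp))
        timings[label] = time.perf_counter() - start
    return timings


if __name__ == "__main__":
    print("collation:", set_collation())
    for label, seconds in time_sorts().items():
        print(f"{label:>5}: {seconds:.2f}s")
```

The data never leaves memory here either, so any difference between the three timings comes from the comparison, not from I/O. The size of the gap will depend on the locale and the C library.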

I think this disproves your hypothesis that sorting happens at disk
speed.

> Yep - I started to read up on it. It still sounds like it's a hard-ish
> problem (to achieve near N times speedup for N CPU cores without
> degrading performance for existing loads), but that doesn't mean
> impossible. :-)
>

You don't even need multiple cores to achieve a speedup, according to
Ron's reference.

Regards,
Jeff Davis
