Re: Why is sorting on two columns so slower thansortingon one column?

From: Jie Li <jay23jack(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Kenneth Marshall <ktm(at)rice(dot)edu>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Why is sorting on two columns so slower thansortingon one column?
Date: 2010-12-24 05:27:31
Message-ID: AANLkTik6tPNnOkRO2kzLinLMe=UY3m8nmWZmpqb1envG@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Dec 23, 2010 at 10:26 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> Kenneth Marshall <ktm(at)rice(dot)edu> writes:
> > On Thu, Dec 23, 2010 at 10:42:26PM +0800, Li Jie wrote:
> >> But in the last query that sorts on "id", since the query selects all
> the columns for output, the actual sorted size is the same, and the only
> difference is the comparison cost. The query sorting on two columns needs to
> do twice the comparison. Am I right?
>
> > I think you are right. Sorry for the confusion.
>
> I doubt the cost of comparing two integers is the issue here; rather
> it's more likely one of how many merge passes were needed. You could
> find out instead of just speculating by turning on trace_sort and
> comparing the log outputs.
>
> regards, tom lane
>

I follow your advice and set the trace_sort, here is the two queries result,
the log outputs looks similar and they seem to use the same number of
tapes. But the time is different. Could you have a look:

postgres=# explain analyze select * from big_wf order by id;
QUERY PLAN

-------------------------------------------------------------------------------------------------------------------------
Sort (cost=565525.45..575775.45 rows=4100000 width=8) (actual
time=28102.985..39351.309 rows=4100000 loops=1)
Sort Key: id
Sort Method: external merge Disk: 72048kB
-> Seq Scan on big_wf (cost=0.00..59142.00 rows=4100000 width=8) (actual
time=9.190..7262.789 rows=4100000 loops=1)
Total runtime: 42953.855 ms
(5 rows)
STATEMENT: explain analyze select * from big_wf order by id;
LOG: begin tuple sort: nkeys = 1, workMem = 20480, randomAccess = f
STATEMENT: explain analyze select * from big_wf order by id;
LOG: switching to external sort with 74 tapes: CPU 0.29s/0.28u sec elapsed
0.71 sec
STATEMENT: explain analyze select * from big_wf order by id;
LOG: finished writing run 1 to tape 0: CPU 0.68s/2.12u sec elapsed 3.41 sec
STATEMENT: explain analyze select * from big_wf order by id;
LOG: finished writing run 2 to tape 1: CPU 1.22s/4.24u sec elapsed 6.53 sec
STATEMENT: explain analyze select * from big_wf order by id;
LOG: finished writing run 3 to tape 2: CPU 1.67s/6.38u sec elapsed 9.67 sec
STATEMENT: explain analyze select * from big_wf order by id;
LOG: finished writing run 4 to tape 3: CPU 2.22s/8.61u sec elapsed 12.93
sec
STATEMENT: explain analyze select * from big_wf order by id;
LOG: finished writing run 5 to tape 4: CPU 2.72s/10.80u sec elapsed 16.09
sec
STATEMENT: explain analyze select * from big_wf order by id;
LOG: finished writing run 6 to tape 5: CPU 3.23s/12.86u sec elapsed 19.12
sec
STATEMENT: explain analyze select * from big_wf order by id;
LOG: finished writing run 7 to tape 6: CPU 3.81s/15.02u sec elapsed 23.84
sec
STATEMENT: explain analyze select * from big_wf order by id;
LOG: finished writing run 8 to tape 7: CPU 4.36s/17.13u sec elapsed 27.05
sec
STATEMENT: explain analyze select * from big_wf order by id;
LOG: performsort starting: CPU 4.38s/17.20u sec elapsed 27.15 sec
STATEMENT: explain analyze select * from big_wf order by id;
LOG: finished writing run 9 to tape 8: CPU 4.39s/17.88u sec elapsed 27.97
sec
STATEMENT: explain analyze select * from big_wf order by id;
LOG: finished writing final run 10 to tape 9: CPU 4.39s/17.88u sec elapsed
27.97 sec
STATEMENT: explain analyze select * from big_wf order by id;
LOG: performsort done (except 10-way final merge): CPU 4.39s/17.98u sec
elapsed 28.10 sec
STATEMENT: explain analyze select * from big_wf order by id;
LOG: external sort ended, 9006 disk blocks used: CPU 8.01s/27.02u sec
elapsed 42.92 sec
STATEMENT: explain analyze select * from big_wf order by id;

postgres=# explain analyze select * from big_wf order by age,id;
Sort (cost=565525.45..575775.45 rows=4100000 width=8) (actual
time=43709.851..57048.645 rows=4100000 loops=1)
Sort Key: age, id
Sort Method: external merge Disk: 72048kB
-> Seq Scan on big_wf (cost=0.00..59142.00 rows=4100000 width=8) (actual
time=10.090..7075.208 rows=4100000 loops=1)
Total runtime: 60721.824 ms

STATEMENT: explain analyze select * from big_wf order by age,id;
LOG: begin tuple sort: nkeys = 2, workMem = 20480, randomAccess = f
STATEMENT: explain analyze select * from big_wf order by age,id;
LOG: switching to external sort with 74 tapes: CPU 0.28s/0.30u sec elapsed
0.67 sec
STATEMENT: explain analyze select * from big_wf order by age,id;
LOG: finished writing run 1 to tape 0: CPU 0.71s/3.63u sec elapsed 4.90 sec
STATEMENT: explain analyze select * from big_wf order by age,id;
LOG: finished writing run 2 to tape 1: CPU 1.30s/7.53u sec elapsed 10.30
sec
STATEMENT: explain analyze select * from big_wf order by age,id;
LOG: finished writing run 3 to tape 2: CPU 1.88s/11.36u sec elapsed 15.39
sec
STATEMENT: explain analyze select * from big_wf order by age,id;
LOG: finished writing run 4 to tape 3: CPU 2.43s/15.20u sec elapsed 20.51
sec
STATEMENT: explain analyze select * from big_wf order by age,id;
LOG: finished writing run 5 to tape 4: CPU 3.05s/18.96u sec elapsed 25.44
sec
STATEMENT: explain analyze select * from big_wf order by age,id;
LOG: finished writing run 6 to tape 5: CPU 3.68s/22.74u sec elapsed 30.47
sec
STATEMENT: explain analyze select * from big_wf order by age,id;
LOG: finished writing run 7 to tape 6: CPU 4.24s/26.63u sec elapsed 36.61
sec
STATEMENT: explain analyze select * from big_wf order by age,id;
LOG: finished writing run 8 to tape 7: CPU 4.78s/30.41u sec elapsed 41.56
sec
STATEMENT: explain analyze select * from big_wf order by age,id;
LOG: performsort starting: CPU 4.81s/30.56u sec elapsed 41.75 sec
STATEMENT: explain analyze select * from big_wf order by age,id;
LOG: finished writing run 9 to tape 8: CPU 4.84s/32.07u sec elapsed 43.56
sec
STATEMENT: explain analyze select * from big_wf order by age,id;
LOG: finished writing final run 10 to tape 9: CPU 4.84s/32.07u sec elapsed
43.56 sec
STATEMENT: explain analyze select * from big_wf order by age,id;
LOG: performsort done (except 10-way final merge): CPU 4.85s/32.16u sec
elapsed 43.70 sec
STATEMENT: explain analyze select * from big_wf order by age,id;
LOG: external sort ended, 9006 disk blocks used: CPU 8.60s/41.93u sec
elapsed 60.73 sec
STATEMENT: explain analyze select * from big_wf order by age,id;

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message KaiGai Kohei 2010-12-24 07:24:42 Re: sepgsql contrib module
Previous Message Stephen Frost 2010-12-24 04:46:30 Re: Streaming replication as a separate permissions