Re: qsort, once again

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Dann Corbit" <DCorbit(at)connx(dot)com>
Cc: "Jonah H(dot) Harris" <jonah(dot)harris(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org, "Jerry Sievers" <jerry(at)jerrysievers(dot)com>
Subject: Re: qsort, once again
Date: 2006-03-17 05:27:05
Message-ID: 25294.1142573225@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

"Dann Corbit" <DCorbit(at)connx(dot)com> writes:
>> So my feeling is we should just remove the swap_cnt code and return
>> to the original B&M algorithm.

> Even if "hunks" of the input are sorted, the test is a very good idea.

Yah know, guys, Bentley and McIlroy are each smarter than any five of
us, and I'm quite certain it occurred to them to try prechecking for
sorted input. If that case is not in their code then it's probably
because it's a net loss. Unless you have reason to think that sorted
input is *more* common than other cases for the Postgres environment,
which is certainly a fact not in evidence.

(Bentley was my thesis adviser for awhile before he went to Bell Labs,
so my respect for him is based on direct personal experience. McIlroy
I only know by reputation, but he's sure got a ton of that.)

> Imagine also a table that was clustered but for which we have not
> updated statistics. Perhaps it is 98% sorted. Checking for order in
> our partitions is probably a good idea.

If we are using the sort code rather than the recently-clustered index
for such a case, then we have problems elsewhere. This scenario is not
a good argument that the sort code needs to be specialized to handle
this case at the expense of other cases; the place to be fixing it is
the planner or the statistics-management code.

regards, tom lane

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Dann Corbit 2006-03-17 05:28:52 Re: qsort, once again
Previous Message Qingqing Zhou 2006-03-17 01:51:21 Re: Separate BLCKSZ for data and logging