Re: Remove hacks for old bad qsort() implementations?

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Remove hacks for old bad qsort() implementations?
Date: 2008-06-23 20:21:01
Message-ID: 200806232021.m5NKL1q11089@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers


Added to TODO:

* Consider whether duplicate keys should be sorted by block/offset

http://archives.postgresql.org/pgsql-hackers/2008-03/msg00558.php

---------------------------------------------------------------------------

Tom Lane wrote:
> There are several places in tuplesort.c (and perhaps elsewhere) where
> we explicitly work around limitations of various platforms' qsort()
> functions. Notably, there's this bit in comparetup_index_btree
>
> /*
> * If key values are equal, we sort on ItemPointer. This does not affect
> * validity of the finished index, but it offers cheap insurance against
> * performance problems with bad qsort implementations that have trouble
> * with large numbers of equal keys.
> */
>
> which I unquestioningly copied into comparetup_index_hash yesterday.
> However, oprofile is telling me that doing this is costing
> *significantly* more than just returning zero would do:
>
> 9081 0.3050 : tuple1 = (IndexTuple) a->tuple;
> 3759 0.1263 : tuple2 = (IndexTuple) b->tuple;
> :
> : {
> 130409 4.3800 : BlockNumber blk1 = ItemPointerGetBlockNumber(&tuple1->t_tid);
> 34539 1.1601 : BlockNumber blk2 = ItemPointerGetBlockNumber(&tuple2->t_tid);
> :
> 3281 0.1102 : if (blk1 != blk2)
> 812 0.0273 : return (blk1 < blk2) ? -1 : 1;
> : }
> : {
> 28 9.4e-04 : OffsetNumber pos1 = ItemPointerGetOffsetNumber(&tuple1->t_tid);
> 1 3.4e-05 : OffsetNumber pos2 = ItemPointerGetOffsetNumber(&tuple2->t_tid);
> :
> 1 3.4e-05 : if (pos1 != pos2)
> 48757 1.6376 : return (pos1 < pos2) ? -1 : 1;
> : }
> :
> : return 0;
> 56705 1.9045 :}
>
> Looks to me like we're eating more than seven percent of the total
> runtime to do this :-(
>
> Now as far as I can see, the original motivation for this (as stated in
> the comment) is entirely dead anymore, since we always use our own qsort
> implementation in preference to whatever bogus version a given libc
> might supply. What do people think of removing this bit of code in
> favor of just returning 0?
>
> I can see a couple of possible objections:
>
> 1. Someday we might go back to using platform qsort. (But surely we
> could insist on qsort behaving sanely for equal keys.)
>
> 2. If you've got lots of equal keys, it's conceivable that having the
> index entries sorted by TID offers some advantage in indexscan speed.
> I'm dubious that that's useful, mainly because the planner should prefer
> a bitmap scan in such a case; and anyway the ordering is unlikely to
> be preserved for long. But it's something to think about.
>
> Comments?
>
> regards, tom lane
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Bruce Momjian 2008-06-23 21:35:43 Re: Database owner installable modules patch
Previous Message Bruce Momjian 2008-06-23 20:14:39 Re: Dept of ugly hacks: eliminating padding space in system indexes