From: | Benjamin Coutu <ben(dot)coutu(at)zeyos(dot)com> |
---|---|
To: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Insertion Sort Improvements |
Date: | 2022-08-25 10:55:46 |
Message-ID: | ddc4e498740a8e411c59@zeyos.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hello,
Inspired by the recent discussions[1][2] around sort improvements, I took a look around the code and noticed the use of a somewhat naive version of insertion sort within the broader quicksort code.
The current implementation (see sort_template.h) is practically the textbook version of insertion sort:
for (pm = a + ST_POINTER_STEP; pm < a + n * ST_POINTER_STEP; pm += ST_POINTER_STEP)
for (pl = pm; pl > a && DO_COMPARE(pl - ST_POINTER_STEP, pl) > 0; pl -= ST_POINTER_STEP)
DO_SWAP(pl, pl - ST_POINTER_STEP);
I propose to rather use the slightly more efficient variant of insertion sort where only a single assignment instead of a fully-fledged swap is performed in the inner loop:
for (pm = a + ST_POINTER_STEP; pm < a + n * ST_POINTER_STEP; pm += ST_POINTER_STEP) {
DO_COPY(pm_temp, pm); /* pm_temp <- copy of pm */
pl = pm - ST_POINTER_STEP;
for (; pl >= a && DO_COMPARE(pl, pm_temp) > 0; pl -= ST_POINTER_STEP)
DO_ASSIGN(pl + ST_POINTER_STEP, pl); /* pl + 1 <- pl */
DO_COPY(pl + ST_POINTER_STEP, pm_temp); /* pl + 1 <- copy of pm_temp */
}
DO_ASSIGN and DO_COPY macros would have to be declared analogue to DO_SWAP via the template.
There is obviously a trade-off involved here as O(1) extra memory is required to hold the temporary variable and DO_COPY might be expensive if the sort element is large. In case of single datum sort with trivial data types this would not be a big issue. For large tuples on the other hand it could mean a significant overhead that might not be made up for by the improved inner loop. One might want to limit this algorithm to a certain maximum tuple size and use the original insertion sort version for larger elements (this could be decided at compile-time via sort_template.h).
Anyways, there might be some low hanging fruit here. If it turns out to be significantly faster one might even consider increasing the insertion sort threshold from < 7 to < 10 sort elements. But that is a whole other discussion for another day.
Has anyone tested such an approach before? Please let me know your thoughts.
Cheers,
Benjamin
[1] https://www.postgresql.org/message-id/flat/CAFBsxsHanJTsX9DNJppXJxwg3bU%2BYQ6pnmSfPM0uvYUaFdwZdQ%40mail.gmail.com
[2] https://www.postgresql.org/message-id/flat/CAApHDvoTTtoQYfp3d0kTPF6y1pjexgLwquzKmjzvjC9NCw4RGw%40mail.gmail.com
--
Benjamin Coutu
http://www.zeyos.com
From | Date | Subject | |
---|---|---|---|
Next Message | Michael Paquier | 2022-08-25 11:04:15 | Re: pg_receivewal and SIGTERM |
Previous Message | Richard Guo | 2022-08-25 10:27:38 | Re: Making Vars outer-join aware |