Re: More work on SortSupport for text - strcoll() and strxfrm() caching

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Peter Geoghegan <pg(at)heroku(dot)com>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: More work on SortSupport for text - strcoll() and strxfrm() caching
Date: 2015-08-04 19:41:34
Message-ID: CA+Tgmoa=PmK=myKYvF_UgWq3hiWgm94jRKAxTGVy1chjBqBi1w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Jul 3, 2015 at 8:33 PM, Peter Geoghegan <pg(at)heroku(dot)com> wrote:
> Since apparently we're back to development work, I thought it was time
> to share a patch implementing a few additional simple tricks to make
> sorting text under a non-C locale even faster than in 9.5. These
> techniques are mostly effective when values are physically clustered
> together. This might be because there is a physical/logical
> correlation, but cases involving any kind of clustering of values are
> helped significantly.

Interesting work.

Some comments:

1. My biggest gripe with this patch is that the comments are not easy
to understand. For example:

+ * Attempt to re-use buffers across calls. Also, avoid work in the event
+ * of clustered together identical items by exploiting temporal locality.
+ * This works well with divide-and-conquer, comparison-based sorts like
+ * quicksort and mergesort.
+ *
+ * With quicksort, there is, in general, a pretty strong chance that the
+ * same buffer contents can be used repeatedly for pivot items -- early
+ * pivot items will account for large number of total comparisons, since
+ * they must be compared against many (possibly all other) items.

Well, what I would have written is something like: "We're likely to be
asked to compare the same strings repeatedly, and memcmp() is so much
cheaper than memcpy() that it pays to attempt a memcmp() in the hopes
of avoiding a memcpy(). This doesn't seem to slow things down
measurably even if it doesn't work out very often."

+ * While it is worth going to the trouble of trying to re-use buffer
+ * contents across calls, ideally that will lead to entirely avoiding a
+ * strcoll() call by using a cached return value.
+ *
+ * This optimization can work well again and again for the same set of
+ * clustered together identical attributes; as they're relocated to new
+ * subpartitions, only one strcoll() is required for each pivot (in respect
+ * of that clump of identical values, at least). Similarly, the final
+ * N-way merge of a mergesort can be effectively accelerated if each run
+ * has its own locally clustered values.

And here I would have written something like: "If we're comparing the
same two strings that we compared last time, we can return the same
answer without calling strcoll() again. This is more likely than it
seems, because quicksort compares the same pivot against many values,
and some of those values might be duplicates."

Of course everybody may prefer something different here; I'm just
telling you what I think.

2. I believe the change to bttextcmp_abbrev() should be pulled out
into a separate patch and committed separately. That part seems like
a slam dunk.

3. What is the worst case for the strxfrm()-reuse stuff? I suppose
it's the case where we have many strings that are long, all
equal-length, and all different, but only in the last few characters.
Then the memcmp() is as expensive as possible but never works out.
How does the patch hold up in that case?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2015-08-04 19:45:44 Re: Raising our compiler requirements for 9.6
Previous Message Jeff Janes 2015-08-04 19:38:13 Re: FSM versus GIN pending list bloat