Re: Draft release notes for next week's releases

From: Peter Geoghegan <pg(at)heroku(dot)com>
To: Teodor Sigaev <teodor(at)sigaev(dot)ru>
Cc: Oleg Bartunov <obartunov(at)gmail(dot)com>, Thomas Kellerer <spam_eater(at)gmx(dot)net>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Draft release notes for next week's releases
Date: 2016-04-14 23:42:15
Message-ID: CAM3SWZSSA_2dW=6-ebRB0dfL0qaqJO2SYHE3sNoHMxqzqXZQyg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Mar 29, 2016 at 5:18 AM, Teodor Sigaev <teodor(at)sigaev(dot)ru> wrote:
> It's based on https://people.freebsd.org/~girgen/postgresql-icu/readme.html
> work, and it was migrated to 9.5 with abbrevation keys support.
> Patch in current state is not ready to commit, of course.

Cool.

Some quick observations on this:

* We need to have a strxfrm_l_icu(), not just a strxfrm_icu(). That seems easy.

* We should look into using the ucol_nextSortKeyPart() API:

http://userguide.icu-project.org/collation/architecture#TOC-Partial-sort-keys

I think that this could be a lot faster, because we only need a part
of the collation tables in CPU cache during the generation of
abbreviated keys. There is an optimization described at a low level
here:

https://github.com/icu-project/icu4c/blob/bbd17a792336de5873550794f8304a4b548b0663/source/i18n/collationkeys.cpp#L337

I think this could make our special strxfrm() (which only actually
needs 8 bytes for abbreviated keys) a lot faster. I'd be interested to
see how your Russian text example does with this extra optimization.
We should not be surprised that this kind of support exists within
ICU, because abbreviated keys are actually quite an old idea.

--
Peter Geoghegan

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Geoghegan 2016-04-15 00:02:49 Re: Draft release notes for next week's releases
Previous Message Michael Paquier 2016-04-14 23:13:08 Re: Why doesn't src/backend/port/win32/socket.c implement bind()?