Re: SCRAM authentication, take three

From: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: SCRAM authentication, take three
Date: 2017-02-15 10:58:22
Message-ID: a67d873a-fa0b-ba97-b2f8-89efb0ecdf65@iki.fi
Lists: pgsql-hackers

On 02/09/2017 09:33 AM, Michael Paquier wrote:
> On Tue, Feb 7, 2017 at 11:28 AM, Michael Paquier
> <michael(dot)paquier(at)gmail(dot)com> wrote:
>> Yes, I am actively working on this one now. I am trying to come up
>> with something in the shape of an extension first, and then get a
>> patch out of it; that will be simpler for testing. For now, the work
>> that really remains in the patches attached to this thread is to get
>> the internal work done; all the UTF8-related routines to work on the
>> strings are already present in scram-common.c.
>
> It took me a couple of days... Attached is the prototype
> implementing SASLprep(), or NFKC if you prefer, for UTF-8 strings.

Cool!

> Now, using this module, I have arrived at the following conclusions
> on how to keep the conversion tables as small as possible without
> much impact on lookup performance:
> - Out of roughly 30k characters in total, about 24k have a combining
> class of 0 and no decomposition; those need to be dropped from the
> conversion table.
> - Most characters decompose into one or two characters, and one has a
> decomposition of 18 characters. So we need to create two sets of
> conversion tables:
> -- A base table, with the character number (4 bytes), the combining
> class (1 byte) and the size of the decomposition (1 byte).
> -- A set of decomposition tables, classified by decomposition size.
> So when decomposing a character, we first check the size of its
> decomposition, and then fetch the decomposed characters from the
> table for that size.

Sounds good.
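
For illustration only, here is a minimal C sketch of the table layout
described in the quoted mail; the struct, array and function names are
hypothetical and not taken from the actual patch:

    #include <stddef.h>
    #include <stdint.h>

    /*
     * One entry of the base table: the codepoint (4 bytes), its canonical
     * combining class (1 byte) and the size of its decomposition (1 byte).
     * Characters with combining class 0 and no decomposition are simply
     * not listed, which is what keeps the table small.
     */
    typedef struct
    {
        uint32_t    codepoint;
        uint8_t     comb_class;
        uint8_t     decomp_size;
    } DecompEntry;

    /* Base table, sorted by codepoint so it can be binary-searched. */
    static const DecompEntry base_table[] = {
        /* ... generated from the Unicode data files ... */
        {0x00C0, 0, 2},     /* e.g. U+00C0 decomposes into U+0041 U+0300 */
    };

    /*
     * Decomposition tables, one per decomposition size.  The decomposed
     * codepoints of a size-2 character occupy two consecutive slots in
     * decomp_size2, and so on, up to the single 18-codepoint decomposition.
     */
    static const uint32_t decomp_size2[] = {
        0x0041, 0x0300,     /* decomposition of U+00C0 */
        /* ... */
    };

    /*
     * Look up a codepoint in the base table.  A miss means the character
     * has combining class 0 and no decomposition, so it is kept as-is.
     */
    static const DecompEntry *
    get_decomp_entry(uint32_t cp)
    {
        size_t  low = 0;
        size_t  high = sizeof(base_table) / sizeof(base_table[0]);

        while (low < high)
        {
            size_t  mid = low + (high - low) / 2;

            if (base_table[mid].codepoint == cp)
                return &base_table[mid];
            if (base_table[mid].codepoint < cp)
                low = mid + 1;
            else
                high = mid;
        }
        return NULL;
    }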

> Now, regarding the shape of the implementation for SCRAM, we need one
> thing: a set of routines in src/common/ to build the decomposition of
> a given UTF-8 string, covering the conversion between a UTF-8 string
> and a pg_wchar array, the decomposition itself, and the reordering.
> The attached extension roughly implements that. We could also have a
> module in contrib/ that does NFK[C|D] using the base APIs in
> src/common/. By manipulating the characters as arrays of pg_wchar
> (integers), we can validate the results with a set of regression
> tests that do *not* have to print non-ASCII characters.

A contrib module or built-in extra functions to deal with Unicode
characters might be handy for a lot of things. But I'd leave that out
for now, to keep this patch minimal.
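
As a rough sketch of the src/common/ routines discussed above (leaving
the contrib module aside), the interface could look something like the
following; every name below is hypothetical and only meant to show the
shape of the API, not the actual patch:

    #include <stdint.h>

    typedef uint32_t pg_wchar;  /* codepoints, as in PostgreSQL's pg_wchar.h */

    /*
     * Convert a NUL-terminated UTF-8 string into a freshly allocated array
     * of codepoints; returns the number of codepoints, or -1 on invalid
     * UTF-8.
     */
    extern int  utf8_to_codepoints(const char *utf8, pg_wchar **out);

    /* Convert an array of codepoints back into a NUL-terminated UTF-8
     * string. */
    extern char *codepoints_to_utf8(const pg_wchar *cp, int ncp);

    /*
     * Recursively apply canonical decomposition to every codepoint, using
     * conversion tables like the ones sketched earlier.
     */
    extern int  decompose_codepoints(const pg_wchar *in, int nin,
                                     pg_wchar **out);

    /* Reorder sequences of combining marks by canonical combining class. */
    extern void reorder_combining_marks(pg_wchar *cp, int ncp);

    /*
     * SASLprep entry point built on top of the above, returning a freshly
     * allocated normalized UTF-8 string, or NULL if the input is not valid
     * UTF-8.  This is what the SCRAM code would call on passwords.
     */
    extern char *pg_saslprep(const char *password);

Working on pg_wchar arrays rather than raw bytes is what allows the
regression tests to compare plain integer arrays instead of having to
print non-ASCII characters.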

- Heikki
