Mapping output from a SEQUENCE into something non-repeating/colliding but random-looking?

From: Craig Ringer <craig(at)postnewspapers(dot)com(dot)au>
To: pgsql-general(at)postgresql(dot)org
Subject: Mapping output from a SEQUENCE into something non-repeating/colliding but random-looking?
Date: 2009-05-01 04:32:46
Message-ID: 49FA7B6E.9040005@postnewspapers.com.au
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

[reposting as the original seemed to get lost in a mail system issue on
my end. Sorry if it did make it.]

Hi

This must be a fairly common requirement, but either I don't know how to
ask Google about it or there's not as much out there as I would've
expected. At a high level, what I need is a way to "scramble" sequences
so it's not obvious that, say, customer 1332 is in fact the
one-thousand-and-thirty-second customer of the company, because the next
customer signed up might well get customer number 12, or 99442312.

I'm hoping to do this with a mapping of values from a normal sequence,
so that I can preserve the concurrency-friendly properties of normal
sequences and not have to deal with the nightmare of
gapless-sequence-like code.

So: I'm looking for a way to map the output from a monotonically
increasing sequence (not necessarily gapless - ie a normal Pg SEQUENCE)
within a particular range into a random-looking different value in the
same range with a 1:1 input->output relationship. In other words, for
the 32 bit integer input of (say) "27" the output will always be the
same (say 32 bit) number, and no other input will produce that output. A
reverse mapping is not necessary; I don't care about being able to find
out what input produced the output (say) 41231.

Note that I'm *NOT* looking for a PRNG that takes the previous output as
its input. That'd force me to use the same techniques as for a gapless
sequence in Pg, with all the associated horror with locking and
deadlocks, the performance issues, etc. That, or use a C extension
module (which I'd rather avoid for portability and future proofing) to
provide sequence-like properties.

Does anyone here know of a good algorithm to do this that doesn't just
iterate `n' times through a PRNG with the same seed, but instead does a
true non-colliding space mapping?

If I find something good and there aren't any existing Pl/PgSQL
implementations I'll post one for others' use, since I'm pretty sure it
must come up a lot. You don't want your database to send out "invoice
#1" or "customer #1" after all.

(I'm also going to be looking for efficient ways to calculate effective
check digits for arbitrary numbers within a certain range, too, and will
post something for that, but that comes later).

--
Craig Ringer

Browse pgsql-general by date

  From Date Subject
Next Message Carlo Stonebanks 2009-05-01 05:25:48 Any way to execute ad-hoc pl/pgsql?
Previous Message Craig Ringer 2009-05-01 03:31:47 Re: How to begin to debug FATAL: invalid frontend message type 77 error messages?