Re: Re: Mapping output from a SEQUENCE into something non-repeating/colliding but random-looking?

From: Craig Ringer <craig(at)postnewspapers(dot)com(dot)au>
To: Bill Moran <wmoran(at)potentialtech(dot)com>
Cc: Jasen Betts <jasen(at)xnet(dot)co(dot)nz>, pgsql-general(at)postgresql(dot)org
Subject: Re: Re: Mapping output from a SEQUENCE into something non-repeating/colliding but random-looking?
Date: 2009-05-01 13:06:14
Message-ID: 49FAF3C6.5000707@postnewspapers.com.au
Lists: pgsql-general

Bill Moran wrote:

> Sounds like you're reinventing message digests ...

Message digests do not guarantee non-colliding output for a given input.

They provide a result UNLIKELY to collide for any reasonable set of
inputs. They need to produce quite large output values to do this
effectively. If you have to truncate the digest to fit the desired data
type, collisions only become more likely.
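
To illustrate the truncation point, here is a minimal sketch. It assumes SHA-256 as the digest and a 16-bit target type, both chosen only so that the search finishes quickly. The function names are hypothetical.

```python
import hashlib


def truncated_digest(n: int, bits: int = 16) -> int:
    """Hash the decimal text of n with SHA-256 and keep only the top `bits` bits."""
    digest = hashlib.sha256(str(n).encode("ascii")).digest()
    return int.from_bytes(digest, "big") >> (256 - bits)


def first_collision(bits: int = 16):
    """Return (earlier_input, later_input, shared_output) for the first repeat."""
    seen = {}
    # 2**bits + 1 inputs cannot all land on distinct values among only
    # 2**bits possible outputs (pigeonhole), so this loop always returns.
    for n in range(2**bits + 1):
        out = truncated_digest(n, bits)
        if out in seen:
            return seen[out], n, out
        seen[out] = n
    raise AssertionError("unreachable: pigeonhole guarantees a repeat")


a, b, out = first_collision()
print(f"inputs {a} and {b} both map to {out}")
```

The same reasoning applies to a 32-bit truncation; the first repeat just takes more inputs to show up.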

A message digest is intended for use where the inputs are arbitrarily
large and must be condensed down to a generally fixed-length small-ish
value that should be very unlikely to collide for any two given inputs.

What I'm looking for is a function that, given an input within a
constrained range (say, a 32 bit integer) produces a different output
within the same range. For any given input, the output should be the
same each time, and for any given output there should only be one input
that results in that output.
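
One function with that property, shown only as a sketch and not something proposed in this thread, is multiplication by an odd constant modulo 2**32. Every odd number has a modular inverse modulo a power of two, so the mapping can be undone and no two inputs share an output. The constant and the names `scramble` and `unscramble` are assumptions made for illustration.

```python
MASK = 0xFFFFFFFF  # keep results inside the 32-bit range

# Hypothetical constant: any odd value works, because odd numbers are
# invertible modulo 2**32.
MULTIPLIER = 0x9E3779B1

# Modular inverse via three-argument pow (needs Python 3.8 or later).
INVERSE = pow(MULTIPLIER, -1, 2**32)


def scramble(n: int) -> int:
    """Map a 32-bit integer to another 32-bit integer, one-to-one."""
    if not 0 <= n <= MASK:
        raise ValueError("input must fit in 32 bits")
    return (n * MULTIPLIER) & MASK


def unscramble(m: int) -> int:
    """Recover the original input from scramble()'s output."""
    if not 0 <= m <= MASK:
        raise ValueError("input must fit in 32 bits")
    return (m * INVERSE) & MASK


for n in (1, 2, 3, 1000, MASK):
    assert unscramble(scramble(n)) == n
```

Note that 0 still maps to 0 and the lowest bit of the output always equals the lowest bit of the input, so this is obfuscation for casual inspection at best.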

So far, picking a suitable value to xor the input with seems like it'll
be better than nothing, and good enough for the casual examination
that's all I'm required to care about.

So long as I don't call it "xor encryption" ... sigh.
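
A minimal sketch of the xor idea, assuming a fixed 32-bit key (the value below is made up, and `obscure` is a hypothetical name). Xor with a constant is its own inverse, so the same function maps in both directions and distinct inputs always give distinct outputs.

```python
MASK = 0xFFFFFFFF  # 32-bit range

# Hypothetical key; any fixed 32-bit value gives a one-to-one mapping.
KEY = 0x5A3C96E1


def obscure(n: int) -> int:
    """Xor a 32-bit integer with KEY. Applying it twice returns the input."""
    if not 0 <= n <= MASK:
        raise ValueError("input must fit in 32 bits")
    return n ^ KEY


for n in (1, 2, 3, 4):
    out = obscure(n)
    assert obscure(out) == n  # self-inverse
    print(n, out)
```

Two inputs that differ only in their low bits produce outputs that differ in exactly the same bit positions, so consecutive sequence values stay numerically close together. That fits the "casual examination" bar and nothing stronger.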

> Most of the systems I've seen like this do one of a few things:
> * Start with an arbitrary # like 1000
> * Prepend the date (pretty common for invoice #s) like 20090501001
> * Just start with #1 ... I mean, what's the big deal?

I'm not the one who cares. Alas, I've been given requirements to
satisfy, and one of the major ones is that customer numbers in
particular must be non-sequential (as close to random-looking as
possible) and allocated across a large range.

--
Craig Ringer
