Re: UUIDs in core WAS: 9.4 Proposal: Initdb creates a single table

From: Christopher Browne <cbbrowne(at)gmail(dot)com>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, David Fetter <david(at)fetter(dot)org>, Hannu Krosing <hannu(at)2ndquadrant(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: UUIDs in core WAS: 9.4 Proposal: Initdb creates a single table
Date: 2014-04-24 19:47:56
Message-ID: CAFNqd5UsHVXY8Q=WbrsFFu+pV6F1yZk+rdHd=oOu2m7giOp55w@mail.gmail.com
Lists: pgsql-hackers

Last year, I built a PL/pgSQL generator of "version 1-ish" UUIDs. It
combined timestamps with local information to emulate the timestamp + MAC
address layout of UUID version 1.

Note that there are several versions of UUIDs:

1. Combines MAC address, timestamp, and a clock sequence (often random)
2. DCE Security (replaces some bits with user's UID/GID and others with
POSIX Domain); I don't think this one is much used...
3. MD5 hash (of a namespace and a name)
4. Purely random
5. SHA-1 hash (of a namespace and a name)

There are merits to each. The "tough one" is #1, as that requires pulling
data that can't generally be accessed portably.

I figured out (and could probably donate some code) how to construct the
bits of #1 using inputs of *my* choice: I made up my own MAC address
surrogate, transformed PostgreSQL timestamp values into the UUID timestamp,
and threw in my own bit of randomness. That provided well-formed UUIDs with
nice enough characteristics.

It wouldn't be "out there" to do a somewhat PostgreSQL-flavoured version of
this that wouldn't actually use MAC addresses, but rather, would use data
we have:

a) A sequence feeding some local uniqueness would fit the "clock seq"
bits (the octets that RFC 4122 calls clock-seq-and-reserved and
clock-seq-low)
b) NOW() provides data for time-low, time-mid, time-high-and-version
c) We'd need 6 octets (48 bits) for "node"; I seem to recall there being
something established by initdb that might be usable.
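
As a rough illustration of how those three inputs map onto the RFC 4122
fields, here is a minimal Python sketch (not the PL/pgSQL generator
mentioned earlier). The name make_v1_uuid and its parameters are made up
for illustration: clock_seq stands in for a value drawn from a sequence,
and node for whatever 48-bit identifier gets chosen.

```python
import uuid
from datetime import datetime, timezone

# 100-ns intervals between 1582-10-15 (UUID epoch) and 1970-01-01 (Unix epoch)
GREGORIAN_OFFSET = 0x01B21DD213814000


def make_v1_uuid(ts, clock_seq, node):
    """Assemble a version 1 UUID from caller-supplied inputs.

    ts must be a timezone-aware datetime; clock_seq and node are plain
    integers (hypothetical stand-ins for a sequence value and a node ID).
    """
    delta = ts - datetime(1970, 1, 1, tzinfo=timezone.utc)
    micros = (delta.days * 86400 + delta.seconds) * 1_000_000 + delta.microseconds
    ticks = micros * 10 + GREGORIAN_OFFSET  # 60-bit count of 100-ns intervals

    time_low = ticks & 0xFFFFFFFF
    time_mid = (ticks >> 32) & 0xFFFF
    time_hi_version = ((ticks >> 48) & 0x0FFF) | 0x1000  # version 1

    clock_seq &= 0x3FFF  # only 14 bits are available
    clock_seq_low = clock_seq & 0xFF
    clock_seq_hi_variant = ((clock_seq >> 8) & 0x3F) | 0x80  # RFC 4122 variant

    return uuid.UUID(fields=(time_low, time_mid, time_hi_version,
                             clock_seq_hi_variant, clock_seq_low,
                             node & 0xFFFFFFFFFFFF))
```

Calling make_v1_uuid(datetime.now(timezone.utc), 42, 0x0123456789AB)
yields a UUID whose version, clock_seq and node attributes read back as
1, 42 and 0x0123456789AB.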

The only piece that's directly troublesome, for UUID Type 1, is the "node"
value. I'll observe that it isn't unusual for UUID implementations to
generate random values for that.
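
For reference, RFC 4122 allows a random node ID provided the multicast
bit (the least significant bit of the first octet) is set, so that it can
never collide with a real network card address. A minimal Python sketch,
with random_node_id being a made-up name:

```python
import secrets


def random_node_id():
    """Return a random 48-bit node ID with the multicast bit set."""
    node = secrets.randbits(48)
    # The first octet is the most significant byte, so its lowest bit is bit 40.
    return node | (1 << 40)
```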

Note that for the other UUID versions, there's NO non-portable data needed.
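
To illustrate, Python's standard uuid module builds versions 3, 4 and 5
from nothing but a namespace, a name, and random bits. The namespace and
the name "postgresql.org" below are arbitrary choices for the example.

```python
import uuid

name = "postgresql.org"  # arbitrary example name

v3 = uuid.uuid3(uuid.NAMESPACE_DNS, name)  # MD5 of namespace + name
v4 = uuid.uuid4()                          # random bits only
v5 = uuid.uuid5(uuid.NAMESPACE_DNS, name)  # SHA-1 of namespace + name

# Versions 3 and 5 are deterministic: the same inputs give the same UUID.
same_v3 = uuid.uuid3(uuid.NAMESPACE_DNS, name) == v3
same_v5 = uuid.uuid5(uuid.NAMESPACE_DNS, name) == v5
```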

It seems to me that a "UUIDserial" type, which combined:
a) A sequence, to be the 'clock';
b) Possibly another sequence to store local node ID, which might get
seeded from DB internals
would provide a "PostgreSQL-flavoured" version of UUID Type 1.
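
One possible reading of that idea, sketched in Python rather than as a
PostgreSQL type: a counter takes the place of the 60-bit timestamp and a
fixed node ID identifies the installation. The class name UUIDSerial, its
methods, and the choice to leave the clock-seq bits at zero are all
assumptions made for this sketch.

```python
import itertools
import uuid


class UUIDSerial:
    """Hand out version 1 shaped UUIDs driven by a counter, not a clock."""

    def __init__(self, node, start=0):
        self._counter = itertools.count(start)  # stands in for a sequence
        self._node = node & 0xFFFFFFFFFFFF      # 48-bit local node ID

    def next_uuid(self):
        ticks = next(self._counter) & 0x0FFFFFFFFFFFFFFF  # keep 60 bits
        time_low = ticks & 0xFFFFFFFF
        time_mid = (ticks >> 32) & 0xFFFF
        time_hi_version = (ticks >> 48) | 0x1000  # version 1
        # clock-seq bits left at zero; 0x80 carries the RFC 4122 variant
        return uuid.UUID(fields=(time_low, time_mid, time_hi_version,
                                 0x80, 0x00, self._node))
```

Two generators with different node IDs never produce the same value, and
one generator never repeats until its counter wraps at 2^60.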
