Re: C based plugins, clocks, locks, and configuration variables

From: Clifford Hammerschmidt <tanglebones(at)gmail(dot)com>
To: Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>
Cc: Craig Ringer <craig(dot)ringer(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: C based plugins, clocks, locks, and configuration variables
Date: 2016-11-08 18:48:40
Message-ID: CANvN6gx=7cosvFJsfurV_br0EkYOUDE4na4sSZPZwOBRQuu8ew@mail.gmail.com
Lists: pgsql-hackers

Looking closer at the bit math, I screwed it up. It should be 64 bits of
time, 6 bits of UUID version, 8 bits of node, 8 bits of seq, and the rest
random, which leaves 42 bits of random. I'll find the code in a bit.
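
As a rough illustration of that layout (not the actual plugin code), here is a
minimal C sketch that packs the five fields into 128 bits. The field order,
the id128 struct, and the pack_id name are all assumptions made for this
example; a standards-conforming UUID would place its version and variant bits
at fixed positions rather than directly after the timestamp.

```c
#include <stdint.h>

/* Field widths as described above: 64 + 6 + 8 + 8 + 42 = 128 bits. */
enum {
    TIME_BITS    = 64,
    VERSION_BITS = 6,
    NODE_BITS    = 8,
    SEQ_BITS     = 8,
    RANDOM_BITS  = 128 - TIME_BITS - VERSION_BITS - NODE_BITS - SEQ_BITS
};

/* Hypothetical 128-bit container: hi holds the timestamp, lo the rest. */
typedef struct {
    uint64_t hi;
    uint64_t lo;
} id128;

/*
 * Pack the fields, most significant first (assumed order):
 *   hi: [ 64-bit time ]
 *   lo: [ 6-bit version | 8-bit node | 8-bit seq | 42-bit random ]
 * Inputs wider than their field are masked down to the field width.
 */
static id128 pack_id(uint64_t time_value, unsigned version, unsigned node,
                     unsigned seq, uint64_t random_bits)
{
    const int seq_shift     = RANDOM_BITS;            /* 42 */
    const int node_shift    = seq_shift + SEQ_BITS;   /* 50 */
    const int version_shift = node_shift + NODE_BITS; /* 58 */
    const uint64_t random_mask = (UINT64_C(1) << RANDOM_BITS) - 1;

    id128 id;
    id.hi = time_value;
    id.lo = ((uint64_t)(version & 0x3Fu) << version_shift)
          | ((uint64_t)(node & 0xFFu) << node_shift)
          | ((uint64_t)(seq & 0xFFu) << seq_shift)
          | (random_bits & random_mask);
    return id;
}
```

Putting the timestamp in the most significant bits is what makes the packed
values sort roughly by generation time when compared as 128-bit integers.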

--
Clifford Hammerschmidt, P.Eng.

On Tue, Nov 8, 2016 at 9:42 AM, Clifford Hammerschmidt <
tanglebones(at)gmail(dot)com> wrote:

> Hi Jim,
>
> The values are still globally unique. The odds of a collision are very
> very low. Two instances with the same node_id generating on the same
> millisecond (in their local view of time) have a 1:2^34 chance of
> collision. node_id only repeats every 256 machines in a cluster (assuming
> you're configured correctly), and the probability of the same millisecond
> being used on both machines is also low (depends on generation rate and
> machine speed). The only real concern is with clock replays (i.e. something
> sets the clock backwards, like an admin or a badly implemented time sync
> system), which does happen in rare instances and is why seq is there to
> extend that space out and reduce the chance of a collision in that
> millisecond. (time replays are a real problem with id systems like
> snowflake.)
>
> Also, the point of the timestamp isn't uniqueness, it's the generally
> monotonically ascending aspect I want. This causes inserts to append to the
> index (much faster than random inserts in large indexes because of cache
> locality), and causes data generated around the same time to occupy nearby
> nodes in the index (again, cache benefits, as related data tends to be
> generated bunched up in time).
>
> Thanks,
> -Cliff.
>
> --
> Clifford Hammerschmidt, P.Eng.
>
> On Tue, Nov 8, 2016 at 6:27 AM, Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>
> wrote:
>
>> On 11/3/16 7:14 PM, Craig Ringer wrote:
>>
>>> 1) getting microseconds (or nanoseconds) from UTC epoch in a plugin
>>>>
>>>
>>> GetCurrentIntegerTimestamp()
>>>
>>
>> Since you're serializing generation anyway you might want to just forgo
>> the timestamp completely. It's not like the values you're generating are
>> globally unique anymore, or hard to guess.
>> --
>> Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
>> Experts in Analytics, Data Architecture and PostgreSQL
>> Data in Trouble? Get it in Treble! http://BlueTreble.com
>> 855-TREBLE2 (855-873-2532) mobile: 512-569-9461
>>
>
>
