Re: C based plugins, clocks, locks, and configuration variables

From: Clifford Hammerschmidt <tanglebones(at)gmail(dot)com>
To: Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>
Cc: Craig Ringer <craig(dot)ringer(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: C based plugins, clocks, locks, and configuration variables
Date: 2016-11-08 17:42:16
Message-ID: CANvN6gzMoD3M3k3fX1fbaG8KJb+caLk1yZOkqRzvumyYO1hz5A@mail.gmail.com
Lists: pgsql-hackers

Hi Jim,

The values are still globally unique for practical purposes: the odds of a
collision are very low. Two instances with the same node_id generating in
the same millisecond (in their local view of time) have a 1 in 2^34 chance
of collision. node_id only repeats every 256 machines in a cluster (assuming
it is configured correctly), and the probability of the same millisecond
being used on both machines is also low (it depends on generation rate and
machine speed). The only real concern is clock replays (i.e. something sets
the clock backwards, like an admin or a badly implemented time sync system),
which do happen in rare instances. That is why seq is there: it extends the
space and reduces the chance of a collision in that millisecond. (Time
replays are a real problem with ID systems like Snowflake.)
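
To put rough numbers on that, here is a minimal sketch in C of the birthday
arithmetic. It assumes the only thing separating two IDs with the same
node_id, millisecond, and seq is 34 uniformly random bits (taken from the
1 in 2^34 figure above). The function name and its parameters are
hypothetical and are not part of the plugin.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Probability that at least two of k IDs collide when they differ only in
 * `random_bits` uniformly random bits (same node_id, same millisecond,
 * same seq). This is the exact birthday product, not an approximation.
 *
 * ASSUMPTION: 34 random bits is inferred from the "1 in 2^34" figure;
 * the real ID layout is not shown in this thread.
 */
double collision_probability(unsigned random_bits, uint32_t k)
{
    /* Keep the space size exactly representable in a double. */
    assert(random_bits > 0 && random_bits < 53);

    const double space = (double)(UINT64_C(1) << random_bits);
    double p_all_unique = 1.0;

    /* The i-th extra ID must avoid the i values already taken. */
    for (uint32_t i = 1; i < k; i++)
        p_all_unique *= 1.0 - (double)i / space;

    return 1.0 - p_all_unique;
}
```

With 34 bits, two IDs collide with probability 2^-34 (about 5.8e-11), and
even 1000 IDs in one millisecond on one node_id collide with probability of
only about 2.9e-5.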

Also, the point of the timestamp isn't uniqueness; it's the generally
monotonically ascending order I want. This causes inserts to append to the
index (much faster than random inserts in large indexes because of cache
locality), and causes data generated around the same time to occupy nearby
nodes in the index (again, cache benefits, as related data tends to be
generated bunched up in time).
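
The ordering property can be sketched in C as follows. The layout here is
purely hypothetical (a 64-bit millisecond timestamp in the high word, then
8 bits of node_id, 22 bits of seq, and 34 random bits in the low word); the
thread does not spell out the real layout. The point is only that when the
timestamp occupies the most significant bits, comparing IDs orders them by
generation time first, so new values land at the right-hand edge of a btree.

```c
#include <assert.h>
#include <stdint.h>

/* HYPOTHETICAL layout, for illustration only. */
typedef struct {
    uint64_t hi;  /* milliseconds since the UTC epoch */
    uint64_t lo;  /* node_id (8 bits) | seq (22 bits) | random (34 bits) */
} sketch_id;

sketch_id sketch_id_make(uint64_t ms, uint8_t node_id,
                         uint32_t seq, uint64_t rnd)
{
    /* Reject values that would spill into a neighbouring field. */
    assert(seq < (UINT32_C(1) << 22));
    assert(rnd < (UINT64_C(1) << 34));

    sketch_id id;
    id.hi = ms;
    id.lo = ((uint64_t)node_id << 56) | ((uint64_t)seq << 34) | rnd;
    return id;
}

/* Three-way comparison: timestamp first, then node_id, seq, random. */
int sketch_id_cmp(sketch_id a, sketch_id b)
{
    if (a.hi != b.hi)
        return a.hi < b.hi ? -1 : 1;
    if (a.lo != b.lo)
        return a.lo < b.lo ? -1 : 1;
    return 0;
}
```

An ID generated one millisecond later compares greater no matter which node
produced it or what the random bits are. Within a single millisecond the
order falls back to node_id, seq, and the random bits, which is why the
sequence is only "generally" ascending.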

Thanks,
-Cliff.

--
Clifford Hammerschmidt, P.Eng.

On Tue, Nov 8, 2016 at 6:27 AM, Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com> wrote:

> On 11/3/16 7:14 PM, Craig Ringer wrote:
>
>>> 1) getting microseconds (or nanoseconds) from UTC epoch in a plugin
>>
>> GetCurrentIntegerTimestamp()
>>
>
> Since you're serializing generation anyway you might want to just forgo
> the timestamp completely. It's not like the values you're generating are
> globally unique anymore, or hard to guess.
> --
> Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
> Experts in Analytics, Data Architecture and PostgreSQL
> Data in Trouble? Get it in Treble! http://BlueTreble.com
> 855-TREBLE2 (855-873-2532) mobile: 512-569-9461
>
