Re: Audit Logs WAS: temporal support patch

From: Christopher Browne <cbbrowne(at)gmail(dot)com>
To: Jim Nasby <jim(at)nasby(dot)net>
Cc: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Vlad Arkhipov <arhipov(at)dc(dot)baikal(dot)ru>, Gavin Flower <GavinFlower(at)archidevsys(dot)co(dot)nz>, Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Audit Logs WAS: temporal support patch
Date: 2012-08-28 21:30:53
Message-ID: CAFNqd5VvgJKgtBP0hU9Tb0sM0qr+z7+FTLdrb2xLHweeErdeKA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Aug 28, 2012 at 5:06 PM, Jim Nasby <jim(at)nasby(dot)net> wrote:
> On 8/28/12 2:51 PM, Pavel Stehule wrote:
>>>
>>> >The thing I don't like about this is it assumes that time is the best
>>> > way to
>>> >refer to when things changed in a system. Not only is that a bad
>>> > assumption,
>>> >it also means that relating things to history becomes messy.
>>
>> On second hand I don't have a problem with some optional counter,
>> although I think so database system time is very useful and other
>> counters for versioning are not necessary - because in one time I can
>> have only one version - it doesn't do versions from rollbacked
>> transactions.
>
> What happens if the system clock runs backwards?
>
> What happens if two transactions start in the same microsecond? (And I know
> for a fact that's possible, because I've seen it).
>
> More importantly, I believe using time to handle recording a versioned
> history of something is flawed to begin with. You might care about what time
> a new version was created; but what's far more important is recording the
> correct ordering of things, and time isn't actually a great way to do that.
>
> Note that no version control systems use time as their primary attribute.

At one point, I designed a configuration system that used the DNS
concept of "serial numbers" to indicate the temporality of the
configuration. NULL was treated as 'infinitely in the future'.

The notion would be that a new piece of configuration would initially
be assigned (latest_serial+1, NULL) as its temporal visibility, and,
upon bumping the serial number, that would become "live"
configuration.

One would mark a piece of configuration as "about to die" by assigning
(X, latest_serial+1) as its temporal visibility, and, again, upon
bumping the serial number, that terminates its visibility.

In that approach to things, it's an "expensive serialization event" to
bump the serial number. Not that the action is super-expensive, it's
just that that's not something you ought to do in a distributed
fashion. ONE process should bump the serial number.

I think you're right that for these sorts of cases, the use of time as
the source of versioning information is pretty dangerous.

In SCM systems, we discovered that it was pretty horrible to try to
assign serial numbers in a 'consistent' fashion; you'd get, in CVS,
that files would get assigned version '1.2.1.5.3.2.7', and things
would get worse from there. It turned out that using a hash code like
SHA-* as a version number was more satisfactory; just attach labels to
those versions that you care to identify.

It's possible that, for a "versioned history," that assigning a
hash/UUID as the version ID is pretty satisfactory, even though it's
not ordered. Using a sequence to assign a version scales a bit less
well, but is probably usually OK.

One of the other things discovered in that "config system design" was
that it was mighty useful to assign transactions at will:
create table config_txns (
tx_id serial primary key,
tx_at timestamptz default now()
);
, associating data with the tx_id value, and then associate additional
information to the "config_txns" table if needful.

For instance, sometimes you need a label (akin to an SCM "branch" or
"tag" label)
create table config_labels (
tx_id integer primary key references config_txns(tx_id) on delete cascade,
label text not null unique
);
--
When confronted by a difficult problem, solve it by reducing it to the
question, "How would the Lone Ranger handle this?"

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Craig Ringer 2012-08-29 01:09:40 Re: "default deny" for roles
Previous Message Tom Lane 2012-08-28 21:27:51 Re: splitting htup.h