Re: temporal support patch

From: "David Johnston" <polobo(at)yahoo(dot)com>
To: "'Robert Haas'" <robertmhaas(at)gmail(dot)com>, "'Jeff Davis'" <pgsql(at)j-davis(dot)com>
Cc: "'Vlad Arkhipov'" <arhipov(at)dc(dot)baikal(dot)ru>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: temporal support patch
Date: 2012-08-20 23:17:59
Message-ID: 023f01cd7f2a$0a2e4190$1e8ac4b0$@yahoo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> -----Original Message-----
> From: pgsql-hackers-owner(at)postgresql(dot)org [mailto:pgsql-hackers-
> owner(at)postgresql(dot)org] On Behalf Of Robert Haas
> Sent: Monday, August 20, 2012 5:04 PM
> To: Jeff Davis
> Cc: Vlad Arkhipov; pgsql-hackers(at)postgresql(dot)org
> Subject: Re: [HACKERS] temporal support patch
>
> On Sun, Aug 19, 2012 at 6:28 PM, Jeff Davis <pgsql(at)j-davis(dot)com> wrote:
> >> The other issue is how to handle multiple changes of the same record
> >> within the transaction. Should they be stored or not?
> >
> > In a typical audit log, I don't see any reason to. The internals of a
> > transaction should be implementation details; invisible to the
> > outside, right?
>
> I'm not convinced.

Ideally the decision of whether to do so could be a client decision. Not
storing intra-transaction changes is easier than storing all changes. At
worse you could stage up all changed then simply fail to store all
intermediate results within a given relation. It that case you gain nothing
in execution performance but safe both storage and interpretative resources.
So the question becomes is it worth doing without the ability to store
intermediate results? If you were to ponder both which setup would the
default be? If the default is the harder one (all statements) to implement
then to avoid upgrade issues the syntax should specify that it is logging
transactions only.

Random, somewhat related, thought: I do all my working on a temporary
staging table and then, as my final action, insert the resultant records
onto a separate live table and drop the temporary table. Further changes to
said record I perform by deleting the original then inserting a new record
(from staging again) with all the values changed. Obviously this has
limitations with respect to foreign keys and such but it is possible. What
happens to the audit log if the PK changes and if it does not change? Any
other implications that need to be address or is it like giving a loaded gun
to someone and trust them to use is responsibily?

>
> >> I'm not sure that the database user is the proper thing to be stored
> >> in the history table. Many applications usually connect to a database
> >> using some virtual user and have their own users/roles tables to
> >> handle with privileges. There should be some way to substitute the
> >> stored user in the history table with the application's one. It's
> >> also helpful to store transaction id that inserted/updated/deleted the
> record.
> >
> > If the system is recording it for audit purposes, then it better be
> > sure that it's true. You can't allow the application to pick and
> > choose what gets stored there.
>
> That position would render this feature useless for every application for
> which I would otherwise have used it. I think it's just nonsense to talk
about
> what we can or can't let the user do. The user is in charge, and our job
is to
> allow him to do what he wants to do more easily, not to dictate what he
must
> do.
>
> --

I see the "user" element as having two components:

"Client" - what device/channel/"user" was used to connect to the database -
PostgreSQL Role
"User" - relative to that "client" which actual "user" performed the action
- Application Specified

A PostgreSQL role would correspond to "client" whereas the application would
be allowed to have full control of what "User" value is stored.

This gets a little complicated with respect to "SET ROLE" but gets close to
the truth. The idea is that you look at the "client" to determine the
"namespace" over which the "user" is defined and identified.

So, a better way to phrase the position is that:

"You cannot allow the application to choose what is stored to identify
itself (client)" - i.e., its credentials identify who it is and those are
stored without consulting the application

At that point you've basically shifted responsibility for the correctness of
the audit log onto that application and away from the database. However,
you do provide a place for the application to store an identifier that it is
able to resolve to a user if necessary.

This is an arbitrary two-layer hierarchy and while conceptually anything
with two layers may want more I am not sure whether the extra complexity
that would entail would be worth the effort. Depending on what kinds of
information you allowed to be stored for "User" it becomes something that
can be modeled when desired and ignored otherwise.

The issue with adding the PostgreSQL role to the database in this way is
that you now can never delete that role or reassign it to another entity. I
guess with temporal you could do so and basically have the identity-role
relationship define over specific periods of time... I can (have) imagine a
whole level of indirection and association to be able to reasonably handle
assigning and storing permanent identities while allowing logon credentials
to remain outside of permanent storage.

David J.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2012-08-20 23:19:03 Re: PATCH: psql boolean display
Previous Message Tatsuo Ishii 2012-08-20 23:09:35 Re: PATCH: psql boolean display