Re: Auditing extension for PostgreSQL (Take 2)

From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Sawada Masahiko <sawada(dot)mshk(at)gmail(dot)com>, David Steele <david(at)pgmasters(dot)net>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Tatsuo Ishii <ishii(at)postgresql(dot)org>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Abhijit Menon-Sen <ams(at)2ndquadrant(dot)com>
Subject: Re: Auditing extension for PostgreSQL (Take 2)
Date: 2015-05-07 19:41:13
Message-ID: 20150507194113.GC30322@tamriel.snowman.net
Lists: pgsql-hackers

Bruce,

* Bruce Momjian (bruce(at)momjian(dot)us) wrote:
> On Thu, May 7, 2015 at 10:26:55AM -0400, Stephen Frost wrote:
> > * Peter Eisentraut (peter_e(at)gmx(dot)net) wrote:
> > > On 5/4/15 8:37 PM, Stephen Frost wrote:
> > > > I don't follow this logic. The concerns raised above are about changing
> > > > our in-core logging. We haven't got in-core auditing and so I don't see
> > > > how they apply to it.
> > >
> > > How is session "auditing" substantially different from statement logging?
> >
> > David had previously outlined the technical differences between the
> > statement logging we have today and what pgAudit does, but I gather
> > you're asking about it definitionally, though it ends up amounting to
> > much the same to me. Auditing is about "what happened" whereas
> > statement logging is "log whatever statement the user sent." pgAudit
> > bears this out by logging internal SQL statements and object
> > information, unlike what we do with statement logging today.
>
> Well, what I was looking for is how auditing is _conceptually_ different
> from logging, e.g. I can clearly explain how authentication (prove who
> you are) and authorization (what you are allowed to do) are different.

I'd say that auditing is one category of logging, just as
statement/session logging is another category of logging. What we
currently have in core is well defined and used extensively- but it's a
specific kind of logging which is statement logging, and that's all we
provide currently. That isn't to say that there isn't commonality,
there certainly is, but pgAudit leverages that to a large extent already
by using the logging infrastructure in the backend for everything which
is logged.

> Your definition above seems to be more behavioral, e.g. what arrived vs.
> what happened. It is not clear to me why reporting such information is
> conceptually different and requires different infrastructure, i.e. we
> could not easily combine authentication and authorization into the same
> infrastructure, but logging and auditing seems similar.

Similar to logging, Authentication and Authorization also fall under a
more general category- access control.

> > > At least no one has disputed that yet. The only argument against has
> > > been that they don't want to touch the logging.
> >
> > I'm afraid we've been talking past each other here- I'm fully on-board
> > with enhancing our in-core logging capabilities and even looking to the
> > future at having object auditing included in core. It's not my intent
> > to dispute that or to argue against it.
> >
> > Perhaps I've misunderstood the thrust of this sub-thread, so let me
> > explain what I thought the discussion was. My understanding was that
> > you were concerned about having session auditing included in pgAudit
> > and, further, that you wanted to see our in-core statement logging be
> > improved. I agree that we want to improve the in-core statement logging
> > and, ideally, have an in-core auditing solution in the future. I was
> > attempting to address the concern about having session logging in
> > pgAudit by pointing out that it's valuable to have even if our in-core
> > statement logging is augmented, and further, having it in pgAudit does
> > not preclude or reduce our ability to improve the in-core statement
> > logging in the future; indeed, it's my hope that we'll get good feedback
> > from users of pgAudit which could guide our in-core implementation. As
>
> What is our history of doing things in contrib because we are not sure
> what we want, then moving it into core? My general recollection is that
> there is usually something in the contrib version we don't want to add
> to core and people are locked into the contrib API, so we are left
> supporting it, e.g. xml2, though you could argue that auditing doesn't
> have application lock-in and xml2 was tied to an external library
> feature.

That's exactly the argument that I'd make there. My recollection is
that we did move pieces of hstore and have moved pieces of other contrib
modules into core; perhaps we've not yet had a case where we've
completely pulled one in, but given the relatively low level of
dependency associated with pgAudit, I'm certainly hopeful that we'll be
able to here. The lack of a precedent that exactly matches what I'm
suggesting doesn't seem like a reason not to move forward, though; the
concept of having a capability initially in contrib and
then bringing it into core has certainly been discussed a number of
times on other threads and generally makes sense, at least to me,
especially when there's little API associated with the extension.

> > for the concern that pgAudit may end up "rotting" in the tree as some
> > other contrib modules have, I can say with confidence that we will have
> > users of it just as soon as they're able to move to a version of PG
> > which includes it and therefore will be supporting it and addressing
> > issues as we discover them, as I suspect the others who have been
>
> Uh, why are they not using the PGXN version of pg_audit, and if it is
> because it isn't shipped with Postgres, then these seem like unmotivated
> users who will complain for some reason if we ever move it out of
> contrib.

Put bluntly, but I believe accurately, there are very few organizations
with auditing requirements that want anything to do with PGXN.
That's an entirely understandable and defensible position, as there's
very little control in that environment (intentionally so, which is what
makes it great for some users but not acceptable for others). I'd
hardly call those users "unmotivated" considering that they've put forth
substantial resources towards pgAudit in the form of the EU grant which
started it over a year ago and the further efforts being made to bring
it to PG.

> I guess the over-arching question is whether we have to put this into
> contrib so we can get feedback and change the API, or whether using from
> PGXN or incrementally adding it to core is the right approach.

I'm surprised to hear this framed as whether we "have to" do X, Y, or Z.
pgAudit brings a fantastic capability to PostgreSQL which users have
been asking to have for many years and is a feature we should be itching
to have included. That we can then take it and incrementally add it to
core, to leverage things which are only available in core (as discussed
last summer, including grammar and relation metadata), looks to me like
a great direction to go in and one which we could use over and over to
bring new features and capabilities to PG.

Auditing is one of the capabilities whose absence users coming from other
large RDBMSs see as preventing them from migrating to PostgreSQL.
Other databases (open and closed source) have it and have had it for
years and it's a serious shortcoming of ours that makes users either
stick with their existing vendor or look to other closed-source or even
open-source solutions.

> > involved in this discussion will be also. Additionally, as discussed
> > last summer, we can provide a migration path (which does not need to be
> > automated or even feature compatible) from pgAudit to an in-core
> > solution and then sunset pgAudit.
>
> Uh, that usually ends badly too.

I'm confused by this, as it was the result of our discussion and your
suggestion from last summer: 20140730192136(dot)GM2791(at)momjian(dot)us

I certainly hope that hasn't substantially changed as that entire
discussion is why we're even able to have this discussion about
including pgAudit now. I was very much on-board with trying to work on
an in-core solution until that thread convinced me that the upgrade
concerns which I was worried about wouldn't be an issue for inclusion of
an extension to provide the capability.

> > Building an in-core solution, in my estimation at least, is going to
> > require at least a couple of release cycles and having the feedback from
> > users of pgAudit will be very valuable to building a good solution, but
> > I don't believe we'll get that feedback without including it.
>
> See above --- is it jump through the user hoops and only then they will
> use it and give us feedback? How motivated can they be if they can't
> use the PGXN version?

Why wouldn't we want to include this capability in PG? I also addressed
the "why not PGXN" above. It is not a lack of motivation but the entire
intent and design of the PGXN system which precludes most large
organizations from using it, particularly for sensitive requirements
such as auditing.

> The bottom line is that for the _years_ we ship pg_audit in /contrib, we
> will have some logging stuff in postgresql.conf and some in
> contrib/pg_audit and that distinction is going to look quite odd. To
> the extent you incrementally add to core, you will have duplicate
> functionality in both places.

That's entirely correct, of course, but I'm not seeing it as an issue.
I'm certainly prepared to support shipping pgAudit in contrib, as are
others based on how this feature has been developed, for the years that
we'll have 9.5, 9.6 (or 10.0, etc) supported- and that's also another
reason why users will use it when they wouldn't use something on PGXN.

Further, I look forward to working incrementally to bring similar
capability into core, but I suspect those increments will largely be in
the infrastructure until we reach the point where we're able to provide
the user-facing bits, which is quite likely to go in all at once and
allow us a clear upgrade path from one to the other. Perhaps that's
optimistic, but we do tend to try and bring things in as whole
capabilities rather than bits and pieces and I don't expect us to need
to do it differently here.

Thanks!

Stephen
