Re: PostgreSQL Audit Extension

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: David Steele <david(at)pgmasters(dot)net>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: PostgreSQL Audit Extension
Date: 2016-02-03 16:36:45
Message-ID: CA+TgmobJVantTEdV3-tvtg1C2FuG22w_C5B2ENYC6xFBmNDYfA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Feb 3, 2016 at 10:37 AM, David Steele <david(at)pgmasters(dot)net> wrote:
> On 2/1/16 11:23 PM, Robert Haas wrote:
>> OK, I'll bite: I'm worried that this patch will be a maintenance
>> burden. It's easy to imagine that changes to core will result in the
>> necessity or at least desirability of changes to pgaudit, but I'm
>> definitely not prepared to insist that future authors try to insist
>> that future patch submitters have to understand this code and update
>> it as things change.
>
> I agree this is a concern. It's similar to deparse or event triggers in
> this regard with the notable exception that pgaudit is not in core.

I don't see event triggers as having the same issues, actually. DDL
deparse - which I assume is what you mean by deparse - does, and I
complained about that too, quite a lot, and some work was done to
address it - I would have liked more. One of the things I argued for
forcefully with regard to DDL deparse is that it needed to be able to
deparse every DDL command we have, not just the ones the authors
thought were most important. I would expect no less from an auditing
facility.

> However, if it becomes popular enough out of core as everyone insists is
> preferable then people will still need to maintain it. Just as PostGIS
> has a close relationship with core, the pgaudit team would need to have
> the same sort of relationship. Patches would be submitted for review
> and (hopefully) committed and core developer time would still be spent
> on pgaudit, ableit indirectly. Core developers would still have to be
> careful not to break pgaudit if it became popular enough.

This just isn't how it works. I have no idea what the PostGIS folks
are doing, and they generally don't need to know what we're doing.
Occasionally we interact with each other, but mostly those two
different pieces of software can be developed by different people, and
that's a good thing. Migrating PostGIS into PostgreSQL's core would
not be good for either project IMHO. It is neither necessary nor
desirable to have multiple software projects all merged together in a
single git repo.

>> The set of things that the patch can audit is pretty arbitrary and not
>> well tied into the core code.
>
> Since the set of what it can audit is every command that can be run by a
> user in Postgres I don't see how that's arbitrary.

That's not what I'm talking about. You audit relation access and
function calls but not, say, creation of event triggers. Yes, you can
log every statement that comes in, but that's not the secret sauce:
log_statement=all will do that much. The secret sauce is figuring out
the set of events that a statement might perform which might cause
that statement to generate audit records. And it does not seem to me
that what you've got there right now is particularly general - you've
got relation access and function calls and a couple of other things,
but it's far from comprehensive.

>> There is a list of string constants in
>> the code that covers each type of relations plus functions, but not
>> any other kind of SQL object. If somebody adds a new relkind, this
>> would probably need to updated - it would not just work. If somebody
>> adds a new type of SQL object, it won't be covered unless the user
>> takes some explicit action, but there's no obvious guiding principle
>> to say whether that would be appropriate in any particular case.
>
> I think a lot of this could be mitigated by some changes in utility.c.
> I'm planning patches that will allow mapping command strings back to
> event tags and a general classifier function that could incidentally be
> used to improve the granularity of log_statement.

So, *this* starts to smell like a reason for core changes. "I can't
really do what I want in my extension, but with these changes I could"
is an excellent reason to change core.

>> In
>> saying that it's arbitrary, I'm not saying it isn't *useful*. I'm
>> saying there could be five extensions like this that make equally
>> arbitrary decisions about what to do and how to do it, and they could
>> all be useful to different people.
>
> There *could* be five extensions but there are not. To my knowledge
> there are two and one is just a more evolved version of the other.

Right now that may be true, although it wouldn't surprise me very much
to find out that other people have written such extensions and they
just didn't get as much press. Also, consider the future. It is
*possible* that your version of pgaudit will turn out to be the be-all
and the end-all, but it's equally possible that somebody will fork
your version in turn and evolve it some more. I don't see how you can
look at the pgaudit facilities you've got here and say that this is
the last word on auditing and all PostgreSQL users should be content
with exactly that facility. I find that ridiculous. Look me in the
eye and tell me that nobody's going to fork your version and evolve it
a bunch more.

> People who are interested in audit are also understandably leery of
> downloading code from an untrusted source. Both PGXN and GitHub are The
> Wild West as far as conservative auditors are concerned.

I hate to be rude here, but that's not my problem. You can put it on
your corporate web site and let people download it from there. I'm
sure that auditors are familiar with the idea of downloading software
from for-profit companies. Do they really not use any software from
Microsoft or Apple, for example? If the problem is that they will
trust the PostgreSQL open source project but not YOUR company, then I
respectfully suggest that you need to establish the necessary
credibility, not try to piggyback on someone else's.

> I'll be the first to admit that the design is not the prettiest. Trying
> to figure out what Postgres is doing internally through a couple of
> hooks is like trying to replicate the script of a play when all you have
> is the program. However, so far it has been performed well and been
> reliable in field tests.

That's good to hear, but again, it's not enough for a core submission.
Code that goes into our main git repository needs to be "the
prettiest". I mean it's not all perfect of course, but it should be
pretty darn good.

Also, understand this: when you get a core submission accepted, the
core project is then responsible for maintaining that code even if you
disappear. It's entirely reasonable for the project to demand that
this isn't going to be too much work. It's entirely reasonable for
the community to want the design to be very good and the code quality
to be high. It's entirely reasonable for the community NOT to want to
privilege one implementation over another. If you don't agree that
those things are reasonable then we disagree pretty fundamentally on
the role of the community. The community is a group of people to whom
I (or you) can give our time and my (or your) code, not a group of
people who owe me (or you) anything.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Konstantin Knizhnik 2016-02-03 16:47:12 Re: Batch update of indexes
Previous Message Ashutosh Bapat 2016-02-03 16:17:06 Re: [POC] FETCH limited by bytes.