Re: Auditing extension for PostgreSQL (Take 2)

From: Stephen Frost <sfrost(at)snowman(dot)net>
To: David Steele <david(at)pgmasters(dot)net>
Cc: pgsql-hackers(at)postgresql(dot)org, Abhijit Menon-Sen <ams(at)2ndquadrant(dot)com>
Subject: Re: Auditing extension for PostgreSQL (Take 2)
Date: 2015-02-17 15:34:03
Message-ID: 20150217153402.GQ6717@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

David,

I've CC'd Abhijit, the original author of pgaudit, as it seems likely
he'd also be interested in this.

* David Steele (david(at)pgmasters(dot)net) wrote:
> I've posted a couple of messages over the last few weeks about the work
> I've been doing on the pg_audit extension. The lack of response could
> be due to either universal acclaim or complete apathy, but in any case I
> think this is a very important topic so I want to give it another try.

Thanks! It's certainly an important topic to a lot of folks; I imagine
the lack of response is simply because people are busy.

> I've extensively reworked the code that was originally submitted by
> 2ndQuandrant. This is not an indictment of their work, but rather an
> attempt to redress concerns that were expressed by members of the
> community. I've used core functions to determine how audit events
> should be classified and simplified and tightened the code wherever
> possible. I've removed deparse and event triggers and opted for methods
> that rely only on existing hooks. In my last message I provided
> numerous examples of configuration, usage, and output which I hoped
> would alleviate concerns of complexity. I've also written a ton of unit
> tests to make sure that the code works as expected.

The configuration approach you posted is definitely in-line with what I
was trying to get at previously- thanks for putting some actual code
behind it! While not a big fan of that other big RDBMS, I do like that
this approach ends up being so similar in syntax.

> I realize this is not an ideal solution. Everybody (including me) wants
> something that is in core with real policies and more options. It's
> something that I am personally really eager to work on. But the reality
> is that's not going to happen for 9.5 and probably not for 9.6 either.
> Meanwhile, I believe the lack of some form of auditing is harming
> adoption of PostgreSQL, especially in the financial and government sectors.

Agreed.

> The patch I've attached satisfies the requirements that I've had from
> customers in the past. I'm confident that if it gets out into the wild
> it will bring all kinds of criticism and comments which will be valuable
> in designing a robust in-core solution.

This is definitely something that makes sense to me, particularly for
such an important piece. I had argued previously that a contrib based
solution would make it difficult to build an in-core solution, but
others convinced me that it'd not only be possible but would probably be
preferrable as we'd gain experience with the contrib module and, as you
say, we'd be able to build a better in-core solution based on that
experience.

> I'm submitting this patch to the Commitfest. I'll do everything I can
> to address the concerns of the community and I'm happy to provide more
> examples as needed. I'm hoping the sgml docs I've provided with the
> patch will cover any questions, but of course feedback is always
> appreciated.

Glad you submitted it to the CommitFest. Just glancing through the
code, it certainly looks a lot closer to being something which we could
move forward with. Using existing functions to work out the categories
(instead of massive switch statements) is certainly much cleaner and
removing those large #ifdef blocks has made the code a lot easier to
follow.

Lastly, I really like all the unit tests..

Additional comments in-line follow.

> diff --git a/contrib/pg_audit/pg_audit.c b/contrib/pg_audit/pg_audit.c
> new file mode 100644
> index 0000000..b3914ac
> --- /dev/null
> +++ b/contrib/pg_audit/pg_audit.c
> @@ -0,0 +1,1099 @@
> +/*------------------------------------------------------------------------------
> + * pg_audit.c
> + *
> + * An auditing extension for PostgreSQL. Improves on standard statement logging
> + * by adding more logging classes, object level logging, and providing
> + * fully-qualified object names for all DML and many DDL statements.

It'd be good to quantify what 'many' means above.

> + * Copyright (c) 2014-2015, PostgreSQL Global Development Group
> + *
> + * IDENTIFICATION
> + * contrib/pg_prewarm/pg_prewarm.c

Pretty sure this isn't pg_prewarm.c :)

> +/*
> + * String contants for audit types - used when logging to distinguish session
> + * vs. object auditing.
> + */

"String constants"

> +/*
> + * String contants for log classes - used when processing tokens in the
> + * pgaudit.log GUC.
> + */

Ditto.

> +/* String contants for logging commands */

Ditto. :)

> +/*
> + * This module collects AuditEvents from various sources (event triggers, and
> + * executor/utility hooks) and passes them to the log_audit_event() function.

This isn't using event triggers any more, right? Doesn't look like it.
I don't think that's a problem and it certainly seems to have simplified
things quite a bit, but the comment should be updated.

> +/*
> + * Returns the oid of the role specified in pgaudit.role.
> + */
> +static Oid
> +audit_role_oid()

Couldn't you use get_role_oid() instead of having your own function..?

> + /*
> + * Check privileges granted indirectly via role memberships. We do this in
> + * a separate pass to minimize expensive indirect membership tests. In
> + * particular, it's worth testing whether a given ACL entry grants any
> + * privileges still of interest before we perform the has_privs_of_role
> + * test.
> + */

I'm a bit on the fence about this.. Can you provide a use-case where
doing this makes sense..? Does this mean I could grant admin_role1 to
audit and then get auditing on everything that user1 has access to?
That seems like it might be useful for environments where such roles
already exist though it might end up covering more than is intended..

> + /* Make a copy of the column set */
> + tmpSet = bms_copy(attributeSet);

The comment should probably say why, eg:

/* bms_first_member is destructive, so make a copy before using it. */

> +/*
> + * Define GUC variables and install hooks upon module load.
> + */
> +void
> +_PG_init(void)
> +{
> + if (IsUnderPostmaster)
> + ereport(ERROR,
> + (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
> + errmsg("pgaudit must be loaded via shared_preload_libraries")));
> +
> + /*
> + * pgaudit.role = "role1"

I'd make that example "audit", since that's what is generally expected,
right?

> +################################################################################
> +# test.pl - pgAudit Unit Tests
> +################################################################################

This is definitiely nice..

> diff --git a/doc/src/sgml/pgaudit.sgml b/doc/src/sgml/pgaudit.sgml
> new file mode 100644
> index 0000000..f3f4ab9
> --- /dev/null
> +++ b/doc/src/sgml/pgaudit.sgml
> @@ -0,0 +1,316 @@
> +<!-- doc/src/sgml/pgaudit.sgml -->
> +
> +<sect1 id="pgaudit" xreflabel="pgaudit">
> + <title>pg_audit</title>

There seems to be a number of places which are 'pgaudit' and a bunch
that are 'pg_audit'. I'm guessing you were thinking 'pg_audit', but
it'd be good to clean up and make them all consistent.

> + <indexterm zone="pgaudit">
> + <primary>pg_audit</primary>
> + </indexterm>
> +
> + <para>
> + The <filename>pg_audit</filename> module provides session and object
> + auditing via the standard logging facility. Session and object auditing are
> + completely independent and can be combined.
> + </para>
> +
> + <sect2>
> + <title>Session Auditing</title>
> +
> + <para>
> + Session auditing allows the logging of all commands that are executed by
> + a user in the backend. Each command is logged with a single entry and
> + includes the audit type (e.g. <literal>SESSION</literal>), command type
> + (e.g. <literal>CREATE TABLE</literal>, <literal>SELECT</literal>) and
> + statement (e.g. <literal>"select * from test"</literal>).
> +
> + Fully-qualified names and object types will be logged for
> + <literal>CREATE</literal>, <literal>UPDATE</literal>, and
> + <literal>DROP</literal> commands on <literal>TABLE</literal>,
> + <literal>MATVIEW</literal>, <literal>VIEW</literal>,
> + <literal>INDEX</literal>, <literal>FOREIGN TABLE</literal>,
> + <literal>COMPOSITE TYPE</literal>, <literal>INDEX</literal>, and
> + <literal>SEQUENCE</literal> objects as well as function calls.
> + </para>

Ah, you do have a list of what objects you get fully qualified names
for, nice. Are there obvious omissions from that list..? If so, we
might be able to change what happens with objectAccess to include them..

> + <para>
> + Enable session logging for all commands except miscellaneous:
> + <programlisting>
> +pgaudit.log = 'all, -misc'
> + </programlisting>
> + </para>
> + </sect3>

Nice, I like that.

> + <sect3>
> + <title>Examples</title>
> +
> + <para>
> + Set <literal>pgaudit.log = 'read, ddl'</literal> in
> + <literal>postgresql.conf</literal>.
> + </para>

Perhaps I missed it, but it'd be good to point out that GUCs can be set
at various levels. I know we probably say that somewhere else, but it's
particularly relevant for this.

That was just a quick read. Would be great if someone else who is
interested would take it for a spin and provide their own feedback.

Thanks again!

Stephen

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Petr Jelinek 2015-02-17 15:38:54 Re: Add min and max execute statement time in pg_stat_statement
Previous Message Robert Haas 2015-02-17 15:32:42 Re: pg_check_dir comments and implementation mismatch