Re: Auditing extension for PostgreSQL (Take 2)

From: David Steele <david(at)pgmasters(dot)net>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>, pgsql-hackers(at)postgresql(dot)org, Simon Riggs <simon(at)2ndQuadrant(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Subject: Re: Auditing extension for PostgreSQL (Take 2)
Date: 2015-04-07 00:24:18
Message-ID: 552323B2.8060708@pgmasters.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256

On 4/6/15 4:34 PM, Peter Eisentraut wrote:
> On 2/14/15 9:34 PM, David Steele wrote:
>> The patch I've attached satisfies the requirements that I've had
>> from customers in the past.
>
> What I'm missing is a more precise description/documentation of
> what those requirements might be.

Admittedly I'm not a financial or ISO certification auditor, but I've
been in the position of providing data to auditors on many of
occasions. The requests generally fall into three categories:

1) Data requests. Perhaps all the CDRs for a particular customer for
a particular month. Bulk data requests are not addressed by pg_audit.

2) DDL log. A list of all DDL changes made to the database. For
instance, the last time a function was updated and who did it. The
auditor would like to be sure that the function update timestamp
matches up with the last maintenance window and the person who is on
record as having done the updates.

3) DML log. This can be done with triggers, but requires quite a bit
of work and vigilance.

> "Audit" is a "big word". It might imply regulatory or standards
> compliance on some level. We already have ways to log everything.
> If customers want "auditing" instead, they will hopefully have a
> precise requirements set, and we need a way to map that to a
> system configuration. I don't think "we need auditing" -> "oh
> there's this pg_audit thing, and it has a bunch of settings you can
> play with" is going to be enough of a workflow. For starters, I
> would consider the textual server log to be potentially lossy in
> many circumstances, so there would need to be additional
> information on how to configure that to be robust.

Nothing is perfect, but there's a big difference between being able to
log everything and being able to use the data you logged to satisfy an
audit. Auditors tend to be reasonably tech savvy but there are
limits. An example of how pg_audit can provide better logging is at
the end of this email.

I agree that server logs are potentially lossy but that really
describes anywhere audit records might be written. Agreed that there
are better ways to do it (like writing back to the DB, or a remote
DB), but I thought those methods should be saved for a future version.

In my past experience having retention policies in place and being
able to show that they normally work are enough to satisfy an auditor.
Accidents happen and that's understood - as long as an explanation
for the failure is given. Such as, "We lost a backup tape, here's the
ticket for the incident and the name of the employee who handled the
case so you can follow up." Or, "On this date we had a disk failure
and lost the logs before the could be shipped, here's the ticket, etc."

> (Also, more accurately, this is an "audit trail", not an "audit".
> An audit is an examination of a system, not a record of
> interactions with a system. An audit trail might be useful for an
> audit.)

You are correct and I'd be happy to call it pg_audit_trail (as Simon
suggested) or pg_audit_log or something that's more descriptive.

> I see value in what you call object auditing, which is something
> you can't easily do at the moment. But what you call session
> auditing seems hardly distinct from statement logging. If we
> enhance log_statements a little bit, there will not be a need for
> an extra module to do almost the same thing.

Even with session auditing you can have multiple log entries per
backend call. This is particularly true for DO blocks and functions
calls.

Here's a relatively simple example, but it shows how complex this
could get. Let's say we have a DO block with dynamic SQL:

do $$
declare
table_name text = 'do_table';
begin
execute 'create table ' || table_name || ' ("weird name" int)';
execute 'drop table ' || table_name;
end; $$

Setting log_statement=all will certain work but you only get the DO
block logged:

LOG: statement: do $$
declare
table_name text = 'do_table';
begin
execute 'create table ' || table_name || ' ("weird name" int)';
execute 'drop table ' || table_name;
end; $$

With pg_audit you get (forgive the LFs that email added) much more:

LOG: AUDIT: SESSION,38,1,FUNCTION,DO,,,"do $$
declare
table_name text = 'do_table';
begin
execute 'create table ' || table_name || ' (""weird name"" int)';
execute 'drop table ' || table_name;
end; $$"
LOG: AUDIT: SESSION,38,2,DDL,CREATE
TABLE,TABLE,public.do_table,"CREATE TABLE public.do_table (""weird
name"" pg_catalog.int4 ) WITH (oids=OFF) "
LOG: AUDIT: SESSION,38,3,DDL,DROP TABLE,TABLE,public.do_table,drop
table do_table

Not only is the DO block logged but each sub statement is logged as
well. They are logically grouped by the statement ID (in this case
38) so it's clear they were run as a single command. The commands
(DO, DROP TABLE, CREATE TABLE) and fully-qualified object names are
provided and the statements are quoted and escaped when needed to
making parsing easier.

There's no question that this sort of thing could be done with
log_statements, but I would argue it's more than a little enhancement.

- --
- - David Steele
david(at)pgmasters(dot)net
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2

iQIcBAEBCAAGBQJVIyOyAAoJEIA2uIJQ5SFAOQEP/jcEsrAqxxGMn/Px6YSjzJCq
vKBGkilxNHbHn8GeAD617LPHl+4WjgmSWPA4OC2qbCa36tib0mBTRVpdaA1J9PTU
+Ml9kk9hHGdXTkoK2DlMFwVwJ4mZCvKXU4TOOYjdG6YkQaEoCdpEQ8n0Z0bxYogb
zBZ6GnxdkMzD8w33LByW9tf/ShWZsDKh47vqIhk1oGvQULlTGZ7CvAq793vWOCng
9+SBsct8BCUNRS0i1JcWjoLin9rJUNXLkyufIylKuAjbacBDIvQfRmKJJYTQA8lg
7K0Hy5gp7JNWTN+J6TQHM930FFFetVzXXaLRaJwZls9hqzPDSpXA2LleEQy9jzlf
CvSQgoAx/kkBEOjkKBAEL4PYcWXWhizysXkVAURwZ3huvm5wi8C2mVFilFz9oiZa
Z7L0FClFcBhX3ZuptDJOXF4WFdwE7TQDy3Go8aA/UY5gqe08Hqx/Atw881kBEC3j
uQIgdWY0WGVvT43igX44mUv3Q0aTNWHn/jTgaRURwPlpP+wViK3VIybIqiKtebq1
Iaqduge0pirDQMTDdFxt+F5C+ylK+R9TU9xPv8eQwrbq8o3ZIuoiLhtuzl8yQWMI
tiJJCfay4gkm+xZIjsFe9aj3Q0Xk4VjAt8MF9OunaNFTI4X5ZH3OSkZvmbbMjd1S
i6u19Khnj9ryje2nGNFS
=1jDh
-----END PGP SIGNATURE-----

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Fabrízio de Royes Mello 2015-04-07 00:32:34 Re: Proposal : REINDEX xxx VERBOSE
Previous Message Jim Nasby 2015-04-06 22:53:27 Re: Freeze avoidance of very large table.