Re: Auditing via logical decoding

From: Jeremy Finzel <finzelj(at)gmail(dot)com>
To: Philip Scott <from_postgres(at)safetyphil(dot)com>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Auditing via logical decoding
Date: 2018-07-27 11:11:03
Message-ID: CAMa1XUh_CryMLLtA1DWZs-LqmWKmK+KsYbQTMUX87szs32dLxA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Jul 27, 2018 at 5:41 AM Philip Scott <from_postgres(at)safetyphil(dot)com>
wrote:

> Hi Postgres Hackers,
>
> We have been using our own trigger-based audit system at my firm
> successfully for some years, but the performance penalty is starting to
> grate a bit and so I have been tasked with seeing if we can make use of
> the new logical decoding functions to achieve the same thing. I thought
> that someone must already have written something that would satisfy our
> use-case but my internet searches have come up short so far so I am
> considering writing a logical decoding plugin to do what we want.
>
> I thought I would run the idea past you all here just in case my plan is
> crazy; I’ve browsed around the postgres source code a bit before but
> I’ve never really gotten my hands dirty and am a little bit nervous
> about putting my own C code into the heart of our DBMS so if this comes
> to anything I would like to offer my code up for review and/or possible
> inclusion as a contributed module.
>
> A quick summary of requirements:
>
> We want to log (to a separate, remote database)
> - One row for every transaction that changes the state of the
> database.
> We call this table ‘audit_entry’ and contains the xid, transaction
> timestamp, username, client hostname, and application name of the
> session that caused the change.
> - One row for each change made by each transaction which records the
> state of the tuple before the change.
> We call this table ‘audit_detail’ and contains xid, statement
> timestamp, table name & schema, event_type, primary_key (hstore),
> old_row (hstore), and the text of the query that was responsible for the
> change.
>
> A lot of that information is available already by listening to the
> pgoutput decoding, and my first thought was that I could just write a
> receiver for that. However, application name, username, client hostname
> and current_query() are not available. This is understandable as they
> aren’t useful for logical replication.
>
> I was about to give up, when I discovered pg_logical_emit_message.
>
> My current thoughts are to:
> - Write this extra data into a logical message while the transaction
> is still in progess
> Either with a deferred trigger per table or, perhaps better
> Find some global commit-time (or xid-assigment time) hook emit it
> there
>
> - Then get the information out of the database:
> Either modify the existing pgoutput plugin & protocol to forward
> such messages in its stream,
> Or write a dedicated ‘audit’ decoding plugin with its own protocol
>
> - Then get the information into the ‘auditing’ database:
> Either with some standalone process that connects to both, consumes
> the output created above, translates it to SQL to run in the auditing
> DB.
> Figure out how to create a proper postgres background process to do
> it, in a similar fashion to the logical replication worker
>
> Any input you folks have would be very much appreciated.
>
> Kinds Regards,
>
> Philip
>
> PS: If there is someone out there who is willing & able to build this
> for less than my company will have to pay me to do it, please drop me a
> line ☺

All I can say is +1 this would be an awesome feature to have and I hope to
see it someday.

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Alexander Korotkov 2018-07-27 11:18:28 My Skype account (korotkovae) was hacked
Previous Message Philip Scott 2018-07-27 10:44:28 Auditing via logical decoding