From: | Philip Scott <from_postgres(at)safetyphil(dot)com> |
---|---|
To: | pgsql-hackers(at)lists(dot)postgresql(dot)org |
Subject: | Auditing via logical decoding |
Date: | 2018-07-27 10:44:28 |
Message-ID: | 40d17f66c2d979f27c5431a04f246434@safetyphil.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi Postgres Hackers,
We have been using our own trigger-based audit system at my firm
successfully for some years, but the performance penalty is starting to
grate a bit and so I have been tasked with seeing if we can make use of
the new logical decoding functions to achieve the same thing. I thought
that someone must already have written something that would satisfy our
use-case but my internet searches have come up short so far so I am
considering writing a logical decoding plugin to do what we want.
I thought I would run the idea past you all here just in case my plan is
crazy; I’ve browsed around the postgres source code a bit before but
I’ve never really gotten my hands dirty and am a little bit nervous
about putting my own C code into the heart of our DBMS so if this comes
to anything I would like to offer my code up for review and/or possible
inclusion as a contributed module.
A quick summary of requirements:
We want to log (to a separate, remote database)
- One row for every transaction that changes the state of the
database.
We call this table ‘audit_entry’ and contains the xid, transaction
timestamp, username, client hostname, and application name of the
session that caused the change.
- One row for each change made by each transaction which records the
state of the tuple before the change.
We call this table ‘audit_detail’ and contains xid, statement
timestamp, table name & schema, event_type, primary_key (hstore),
old_row (hstore), and the text of the query that was responsible for the
change.
A lot of that information is available already by listening to the
pgoutput decoding, and my first thought was that I could just write a
receiver for that. However, application name, username, client hostname
and current_query() are not available. This is understandable as they
aren’t useful for logical replication.
I was about to give up, when I discovered pg_logical_emit_message.
My current thoughts are to:
- Write this extra data into a logical message while the transaction
is still in progess
Either with a deferred trigger per table or, perhaps better
Find some global commit-time (or xid-assigment time) hook emit it there
- Then get the information out of the database:
Either modify the existing pgoutput plugin & protocol to forward
such messages in its stream,
Or write a dedicated ‘audit’ decoding plugin with its own protocol
- Then get the information into the ‘auditing’ database:
Either with some standalone process that connects to both, consumes
the output created above, translates it to SQL to run in the auditing
DB.
Figure out how to create a proper postgres background process to do
it, in a similar fashion to the logical replication worker
Any input you folks have would be very much appreciated.
Kinds Regards,
Philip
PS: If there is someone out there who is willing & able to build this
for less than my company will have to pay me to do it, please drop me a
line ☺
From | Date | Subject | |
---|---|---|---|
Next Message | Jeremy Finzel | 2018-07-27 11:11:03 | Re: Auditing via logical decoding |
Previous Message | Ashutosh Bapat | 2018-07-27 08:49:27 | Re: [HACKERS] [PATCH] Overestimated filter cost and its mitigation |