pg_stat_*_columns?

From: Joel Jacobson <joel(at)trustly(dot)com>
To: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: pg_stat_*_columns?
Date: 2015-06-05 11:51:52
Message-ID: CAASwCXfGwXCRXthoT5D50NhZqTq=RrTQQmdQ3MGZL2Y1=D5yPw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Would others find it useful to see per column statistics about accesses to
specific columns?

Two possible use-cases: (maybe there are more?)

1. I think it would be helpful for DBAs to better understand their own
system.
Finding unused *tables* is today easy thanks to pg_stat_*_tables, but
knowing if something is accessing a *column* or not is not easy.
In my case all our database access is via sprocs, so I can just grep the
source code for the column name to see if something is using it, but most
DBAs probably don't have that luxury.

2. It could also be useful for audit trailing, if you want to know what a
query did, i.e. what tables/columns were accessed in the txn.

Here is an idea of a very simple audit trailing system that would probably
fulfill my own needs:

Imagine if we had pg_stat_xact_user_columns and for each committed txn, do
an insert to an unlogged table with the same structure as
pg_stat_xact_user_columns
with the addition of session_user and timestamp for the txn.

I would much rather have audit trailing in a nice table than in a text
file. Maybe a foreign data wrapper could be used to ship the audit trail
data to some other external append-only pg-database, if the purpose of the
audit trailing is to prevent an evil DBA from doing evil things. But for
others it might be sufficient to do audit trailing to the same database,
for convenience purposes.

In summary:

Some users might only be interested in the statistics and mostly use
pg_stat_user_columns.
Other others might also be interested in what happened in a specific txn
and use pg_stat_xact_user_columns.
Yet some other users might be interested in audit trailing and want to log
pg_stat_xact_user_columns for each txn. Probably very expensive performance
wise, but might make sense if you have extremely sensitive data and audit
trailing is more important than performance.

Thoughts?

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2015-06-05 11:56:14 Re: RFC: Remove contrib entirely
Previous Message Thomas Munro 2015-06-05 10:51:53 Re: [GENERAL] 9.4.1 -> 9.4.2 problem: could not access status of transaction 1