Re: Proposal: Save user's original authenticated identity for logging

From: Jacob Champion <pchampion(at)vmware(dot)com>
To: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Cc: "stark(at)mit(dot)edu" <stark(at)mit(dot)edu>, "magnus(at)hagander(dot)net" <magnus(at)hagander(dot)net>, "sfrost(at)snowman(dot)net" <sfrost(at)snowman(dot)net>, "tgl(at)sss(dot)pgh(dot)pa(dot)us" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: Proposal: Save user's original authenticated identity for logging
Date: 2021-02-08 23:35:36
Message-ID: 5d8b74e64fe4a610b23624277e497cc0a925db4b.camel@vmware.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, 2021-02-02 at 22:22 +0000, Jacob Champion wrote:
> Given the feedback above, I'll continue to flesh out the PoC patch,
> focusing on 1) storing the identity in a single place for all auth
> methods and 2) exposing it consistently in the logs as part of
> log_connections.

Attached is a v1 patchset. Note that I haven't compiled or tested on
Windows and BSD yet, so the SSPI and BSD auth changes are eyeballed for
now.

The first two patches are preparatory, pulled from other threads on the
mailing list: 0001 comes from my Kerberos test fix thread [1], and 0002
is extracted from Andrew Dunstan's patch [2] to store the subject DN
from a client cert. 0003 has the actual implementation, which now fills
in port->authn_id for all auth methods.

Now that we're using log_connections instead of log_line_prefix,
there's more helpful information we can put into the log when
authentication succeeds. For now, I include the identity of the user,
the auth method in use, and the pg_hba.conf file and line number. E.g.

LOG: connection received: host=[local]
LOG: connection authenticated: identity="pchampion" method=peer (/data/pg_hba.conf:88)
LOG: connection authorized: user=admin database=postgres application_name=psql

If the overall direction seems good, then I have two questions:

- Since the authenticated identity is more or less an opaque string
that may come from a third party, should I be escaping it in some way
before it goes into the logs? Or is it generally accepted that log
files can contain arbitrary blobs in unspecified encodings?

- For the SSPI auth method, I pick the format of the identity string
based on the compatibility mode: "DOMAIN\user" when using compat_realm,
and "user(at)DOMAIN" otherwise. For Windows DBAs, is this a helpful way to
visualize the identity, or should I just stick to one format?

> I also
> plan to write a follow-up patch to add the authenticated identity to
> the statistics collector, with pg_get_authenticated_identity() to
> retrieve it.

This part turned out to be more work than I'd thought! Now I understand
why pg_stat_ssl truncates several fields to NAMEDATALEN.

Has there been any prior discussion on lifting that restriction for the
statistics collector as a whole, before I go down my own path? I can't
imagine taking up another 64 bytes per connection for a field that
won't be useful for the most common use cases -- and yet it still won't
be long enough for other users...

Thanks,
--Jacob

[1] https://www.postgresql.org/message-id/fe7a46f8d46ebb074ba1572d4b5e4af72dc95420.camel%40vmware.com
[2] https://www.postgresql.org/message-id/fd96ae76-a8e3-ef8e-a642-a592f5b76771%40dunslane.net

Attachment Content-Type Size
v1-0001-prep-test-kerberos-only-search-forward-in-logs.patch text/x-patch 1.9 KB
v1-0002-prep-add-port-peer_dn.patch text/x-patch 3.1 KB
v1-0003-Log-authenticated-identity-from-all-auth-backends.patch text/x-patch 28.7 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Smith 2021-02-08 23:38:45 Re: Single transaction in the tablesync worker?
Previous Message Noah Misch 2021-02-08 23:11:52 Re: 2021-02-11 release announcement draft