Re: Addition of authenticated ID to pg_stat_activity

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: Magnus Hagander <magnus(at)hagander(dot)net>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Stephen Frost <sfrost(at)snowman(dot)net>, Postgres hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Addition of authenticated ID to pg_stat_activity
Date: 2021-05-18 02:20:49
Message-ID: YKMkgcJtt7MxmJop@paquier.xyz
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, May 17, 2021 at 10:28:49AM +0200, Magnus Hagander wrote:
> On Mon, May 17, 2021 at 6:35 AM Michael Paquier <michael(at)paquier(dot)xyz> wrote:
>> Not sure if we would be able to agree on something here, but the
>> barrier to what a session and a connection hold is thin when it comes
>> to roles and application_name. Thinking more about that, I would be
>> really tempted to get to do a more straight split with data that's
>> associated to a session, to a transaction and to a connection, say:
>> 1) pg_stat_session, data that may change.
>> - PID
>> - leader PID
>> - the role name
>> - role ID
>> - application_name
>> - wait_event_type
>> - wait_event
>> 2) pg_stat_connection, static data associated to a connection.
>> - PID
>> - database name
>> - database OID
>> - client_addr
>> - client_hostname
>> - client_port
>> - backend_start
>> - authn ID
>> - backend_type
>> 3) pg_stat_transaction, or pg_stat_activity, for the transactional
>> activity.
>> - PID
>> - xact_start
>> - query_start
>> - backend_xid
>> - state_change
>> - query string
>> - query ID
>> - state
>
> This seems extremely user-unfriendly to me.
>
> I mean. Timestamps are nso split out between different views so you
> can't track the process iwthout it. And surely wait_event info is
> *extremely* related to things like what query is running and what
> state the session is in. And putting backend_type off in a separate
> view means most queries are going to have to join that in anyway. Or
> include it in all views. And if we're forcing the majority of queries
> to join multiple views, what have we actually gained?
>
> Based on your list above, I'd definitely want at least (1) and (2) to
> be in the same one, but they'd have to also gain at least the database
> oid/name and backend_type, and maybe also backend_start.

Okay.

> So basically, it would be moving out client_*, and authn_id.

So that would mean the addition of one new catalog view, called
pg_stat_connection, with the following fields:
- PID
- all three client_*
- authn ID
I can live with this split. Thoughts from others?

> If we're
> doing that then as you say maybe pg_stat_connection is a good name and
> could then *also* gain the information that's currently in the ssl and
> gss views for a net simplification.

I am less enthutiastic about this addition. SSL and GSSAPI have no
fields in common, so that would bloat the view for connection data
with mostly NULL fields most of the time.

> tld;dr; I think we have to be really careful here or the cure is going
> to be way worse than the disease.

Agreed.

>> Or I could just drop a new function that fetches the authn ID for a
>> given PID, even if this makes things potentially less consistent when
>> it comes to the lookup of PgBackendStatus, guarantee given now by
>> pg_stat_get_activity().
>
> Well, the authnid will never change so I'm not sure the consistency
> part is a big problem? Or maybe I'm misunderstanding the risk you're
> referring to?

I just mean to keep the consistency we have now with one single call
of pg_stat_get_activity() for each catalog view, so as we still fetch
once a consistent copy of all PgBackendStatus entries in this code
path.
--
Michael

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2021-05-18 02:25:37 Re: What is lurking in the shadows?
Previous Message Masahiko Sawada 2021-05-18 02:20:07 Re: Performance degradation of REFRESH MATERIALIZED VIEW