Re: expose parallel leader in CSV and log_line_prefix

From: Justin Pryzby <pryzby(at)telsasoft(dot)com>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Julien Rouhaud <rjuju123(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: expose parallel leader in CSV and log_line_prefix
Date: 2020-07-17 20:54:21
Message-ID: 20200717205421.GR23581@telsasoft.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Jul 17, 2020 at 11:35:40AM -0400, Alvaro Herrera wrote:
> > On Fri, Jul 17, 2020 at 7:01 AM Michael Paquier <michael(at)paquier(dot)xyz> wrote:
> >
> > > Hmm. Knowing if a leader is actually running parallel query or not
> > > requires a lookup at lockGroupMembers, that itself requires a LWLock.
> > > I think that it would be better to not require that. So what if
> > > instead we logged %P only if Myproc has lockGroupLeader set and it
> > > does *not* match MyProcPid?
>
> That's what I said first, so +1 for that approach.

Ok, but should we then consider changing pg_stat_activity for consistency ?
Probably in v13 to avoid changing it a year later.
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=b025f32e0b5d7668daec9bfa957edf3599f4baa8

I think the story is that we're exposing to the user a "leader pid" what's
internally called (and used as) the "lock group leader", which for the leader
process is set to its own PID. But I think what we're exposing as leader_pid
will seem like an implementation artifact to users. It's unnatural to define a
leader PID for the leader itself, and I'm guessing that at least 30% of people
who use pg_stat_activity.leader_pid will be surprised by rows with
| backend_type='client backend' AND leader_pid IS NOT NULL
And maybe additionally confused if PSA doesn't match CSV or other log.

Right now, PSA will include processes "were leader" queries like:
| SELECT pid FROM pg_stat_activity WHERE pid=leader_pid
If we change it, I think you can get the same thing for a *current* leader like:
| SELECT pid FROM pg_stat_activity a WHERE EXISTS (SELECT 1 FROM pg_stat_activity b WHERE b.leader_pid=a.pid);
But once the children die, you can't get that anymore. Is that a problem ?

I didn't think of it until now, but it would be useful to query logs for
processes which were involved in parallel process. (It would be more useful if
it indicated the query, and not just the process)

I agree that showing the PID as the leader PID while using a connection pooler
is "noisy". But I think that's maybe just a consequence of connection pooling.
As an analogy, I would normally use a query like:
| SELECT session_line, message, query FROM postgres_log WHERE session_id='..' ORDER BY 1
But that already doesn't work usefully with connection pooling (and I'm not
sure how to resolve that other than by not using pooling when logs are useful)

I'm not sure what the answer. Probably we should either make both expose
lockGroupLeader exactly (and not filtered) or make both show lockGroupLeader
only if lockGroupLeader!=getpid().

--
Justin

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alvaro Herrera 2020-07-17 21:27:21 Re: expose parallel leader in CSV and log_line_prefix
Previous Message Nikita Glukhov 2020-07-17 20:26:04 Re: SQL/JSON: functions