Re: Expose lock group leader pid in pg_stat_activity

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Cc: Julien Rouhaud <rjuju123(at)gmail(dot)com>, Sergei Kornilov <sk(at)zsrv(dot)org>, Guillaume Lelarge <guillaume(at)lelarge(dot)info>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Expose lock group leader pid in pg_stat_activity
Date: 2020-01-30 13:03:01
Message-ID: 20200130130301.GC130922@paquier.xyz
Lists: pgsql-hackers

On Tue, Jan 28, 2020 at 02:52:08PM +0100, Tomas Vondra wrote:
> On Tue, Jan 28, 2020 at 02:26:34PM +0100, Julien Rouhaud wrote:
>> There were already some dependencies between the rows since parallel
>> queries were added, as you could see eg. a parallel worker while no
>> query is currently active. This patch will make those corner cases
>> more obvious.

While reviewing the code, I wondered whether it would be better to
make it more defensive and return NULL when the PID of the referenced
leader is 0 or InvalidPid. However, that would mean we are looking at
a dummy 2PC entry from PGPROC or at something not yet started, which
makes no sense. So your simpler version is actually fine. In the
worst case you could end up with an incorrect reference to shared
memory if a PGPROC is recycled between the moment you look at the
leader field and the moment the PID value is fetched. That's very
unlikely to happen, and I agree that it does not really justify the
cost of taking extra locks every time we scan pg_stat_activity.
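
To make that window concrete, here is a minimal standalone sketch of a
two-step read done without any lock. It is not the code from the
patch: ProcSketch and sketch_leader_pid() are hypothetical names, and
only the lockGroupLeader and pid fields mirror what is discussed above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a PGPROC slot; only the two fields
 * discussed above are modeled. */
typedef struct ProcSketch
{
    int                pid;
    struct ProcSketch *lockGroupLeader;   /* NULL if not in a lock group */
} ProcSketch;

/*
 * Returns 1 and stores the leader's PID in *leader_pid, or returns 0
 * when there is no leader (the caller would then report SQL NULL).
 * No lock is taken, so the slot behind "leader" may be recycled
 * between step 1 and step 2.
 */
int
sketch_leader_pid(const ProcSketch *proc, int *leader_pid)
{
    const ProcSketch *leader;

    assert(proc != NULL && leader_pid != NULL);

    leader = proc->lockGroupLeader;       /* step 1: read the leader field */
    if (leader == NULL)
        return 0;

    *leader_pid = leader->pid;            /* step 2: fetch the PID value */
    return 1;
}
```

If the slot is reused by another process between the two steps, the
function simply reports that other process's PID, which is the
incorrect but harmless reference described above.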

> Yeah, sure. I mean explicit dependencies, e.g. a column referencing
> values from another row, like leader_id does.

+ The leader_pid is NULL for processes not involved in parallel query.

This is missing two markups, one for "NULL" and a second for
"leader_pid". The wording does not match the surrounding
documentation either, so I would suggest reformulating the beginning
as:
    PID of the leader process if this process is involved in parallel query.

And actually this paragraph is not completely true: when
leader_pid = pid, leader_pid remains set for a session even after a
parallel query has finished, because lockGroupLeader is set to NULL
only once the process is stopped in ProcKill().
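
A minimal sketch of that lifecycle, as a toy model rather than the
backend code: BackendSketch and every function below are hypothetical
names, and a return value of 0 stands in for SQL NULL.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a backend's PGPROC entry. */
typedef struct BackendSketch
{
    int                   pid;
    struct BackendSketch *lockGroupLeader;
} BackendSketch;

/* First parallel query in the session: the backend becomes its own leader. */
void
sketch_become_leader(BackendSketch *b)
{
    assert(b != NULL);
    if (b->lockGroupLeader == NULL)
        b->lockGroupLeader = b;
}

/* End of a parallel query: the leader field is deliberately left alone. */
void
sketch_finish_parallel_query(BackendSketch *b)
{
    assert(b != NULL);
}

/* Process exit (the ProcKill() step above): only here is the field reset. */
void
sketch_proc_exit(BackendSketch *b)
{
    assert(b != NULL);
    b->lockGroupLeader = NULL;
}

/* Value that would be shown as leader_pid; 0 stands in for SQL NULL. */
int
sketch_reported_leader_pid(const BackendSketch *b)
{
    assert(b != NULL);
    return b->lockGroupLeader ? b->lockGroupLeader->pid : 0;
}
```

In this model a session that has run one parallel query keeps
reporting its own PID as leader_pid until it exits, so "NULL for
processes not involved in parallel query" does not hold for it.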

>> Should I document the possible inconsistencies?
>
> I think it's worth mentioning that as a comment in the code, say before
> the pg_stat_get_activity function. IMO we don't need to document all
> possible inconsistencies, a generic explanation is enough.

Agreed that adding some information in the area where we look at the
PGPROC entries for wait events and such would be nice.

> Not sure about the user docs. Does it currently say anything about this
> topic - consistency with stat catalogs?

Not sure that it is the job of this patch to do that. Do you have
something specific in mind?
--
Michael
