Re: Expose Parallelism counters planned/execute in pg_stat_statements

From: Justin Pryzby <pryzby(at)telsasoft(dot)com>
To: Anthony Sotolongo <asotolongo(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org, daymelbonne(at)gmail(dot)com
Subject: Re: Expose Parallelism counters planned/execute in pg_stat_statements
Date: 2022-07-22 00:35:56
Message-ID: 20220722003556.GT12702@telsasoft.com
Lists: pgsql-hackers

On Thu, Jul 21, 2022 at 06:26:58PM -0400, Anthony Sotolongo wrote:
> Hi all:
> Here's a patch to add counters about  planned/executed  for parallelism  to
> pg_stat_statements, as a way to follow-up on if the queries are
> planning/executing with parallelism, this can help to understand if you have
> a good/bad configuration or if your hardware is enough

+1, I was missing something like this before, but it didn't occur to me to use
PSS:

https://www.postgresql.org/message-id/20200310190142.GB29065@telsasoft.com
> My hope is to answer to questions like these:
>
> . is query (ever? usually?) using parallel paths?
> . is query usefully using parallel paths?
> . what queries are my max_parallel_workers(_per_process) being used for ?
> . Are certain longrunning or frequently running queries which are using
> parallel paths using all max_parallel_workers and precluding other queries
> from using parallel query ? Or, are semi-short queries sometimes precluding
> longrunning queries from using parallelism, when the long queries would
> better benefit ?

This patch is storing the number of times the query was planned/executed using
parallelism, but not the number of workers. Would it make sense to instead
store the *number* of workers planned/launched? Otherwise, it might be
that a query is consistently planned to use a large number of workers, but then
runs with few. I'm referring to the fields shown in EXPLAIN ANALYZE. (Then,
the 2nd field should be renamed to "launched").

Workers Planned: 2
Workers Launched: 2
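
To make the suggestion concrete, here is a rough sketch of what accumulating
worker counts (rather than yes/no flags) could look like. The struct and
function names are hypothetical and are not taken from the patch or from
pg_stat_statements; it only illustrates summing the two EXPLAIN ANALYZE fields
per statement so that "planned many, launched few" becomes visible.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-statement counters; these names are not from the patch. */
typedef struct ParallelCounters
{
    int64_t parallel_workers_planned;   /* running sum of "Workers Planned" */
    int64_t parallel_workers_launched;  /* running sum of "Workers Launched" */
} ParallelCounters;

/* Add one execution's worker counts to the running totals. */
static void
parallel_counters_add(ParallelCounters *c, int planned, int launched)
{
    assert(planned >= 0 && launched >= 0);
    c->parallel_workers_planned += planned;
    c->parallel_workers_launched += launched;
}

/* Fraction of planned workers actually launched; 1.0 if none were planned. */
static double
parallel_launch_ratio(const ParallelCounters *c)
{
    if (c->parallel_workers_planned == 0)
        return 1.0;
    return (double) c->parallel_workers_launched /
           (double) c->parallel_workers_planned;
}
```

With totals like these, a statement whose ratio stays well below 1.0 is one
that is regularly planned with more workers than it gets at run time.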

I don't think this is doing the right thing for prepared statements, like
PQprepare()/PQexecPrepared(), or SQL: PREPARE p AS SELECT; EXECUTE p;

Right now, the docs say that it shows the "number of times the statement was
planned to use parallelism", but the planning counter is incremented during
each execution. PSS already shows "calls" and "plans" separately. The
documentation doesn't mention prepared statements as a reason why they wouldn't
match, which seems like a deficiency.

This currently doesn't count parallel workers used by utility statements, such
as CREATE INDEX and VACUUM (see max_parallel_maintenance_workers). If that's
not easy to do, mention that in the docs as a limitation.

You should try to add a test to contrib/pg_stat_statements/sql, or add a
parallelism test to an existing test. Note that the number of parallel workers
launched isn't stable, so you can't test that part.

You modified pgss_store() to take two booleans, but pass "NULL" instead of
"false". Curiously, of all the compilers in cirrusci, only MSVC complained.
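
As an illustration only, with a hypothetical stand-in function (this is not
the real pgss_store() signature): when a parameter is declared bool, a caller
with nothing to report should pass false. NULL is a null pointer constant, so
passing it relies on an implicit conversion that a compiler may or may not
warn about.

```c
#include <stdbool.h>

/* Hypothetical counters and function; not the actual pgss_store() API. */
typedef struct FlagCounts
{
    long parallel_planned;
    long parallel_executed;
} FlagCounts;

/* Bump each counter only when the corresponding flag is set. */
static void
store_flags(FlagCounts *fc, bool parallel_planned, bool parallel_executed)
{
    if (parallel_planned)
        fc->parallel_planned++;
    if (parallel_executed)
        fc->parallel_executed++;
}

/* A caller with no parallelism information passes false, not NULL. */
static void
store_without_parallel_info(FlagCounts *fc)
{
    store_flags(fc, false, false);
}
```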

"planed" is actually spelled "planned", with two enns.

The patch has some leading/trailing whitespace (maybe shown by git log
depending on your configuration).

Please add this patch to the next commitfest.
https://commitfest.postgresql.org/39/

--
Justin
