Re: Expose Parallelism counters planned/execute in pg_stat_statements

From: Anthony Sotolongo <asotolongo(at)gmail(dot)com>
To: Justin Pryzby <pryzby(at)telsasoft(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org, daymelbonne(at)gmail(dot)com
Subject: Re: Expose Parallelism counters planned/execute in pg_stat_statements
Date: 2022-07-22 15:17:52
Message-ID: 864a7512-4e22-8d08-4fd9-51b0e04f07a8@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers


On 21-07-22 20:35, Justin Pryzby wrote:
> On Thu, Jul 21, 2022 at 06:26:58PM -0400, Anthony Sotolongo wrote:
>> Hi all:
>> Here's a patch to add counters about  planned/executed  for parallelism  to
>> pg_stat_statements, as a way to follow-up on if the queries are
>> planning/executing with parallelism, this can help to understand if you have
>> a good/bad configuration or if your hardware is enough
> +1, I was missing something like this before, but it didn't occur to me to use
> PSS:

First of all, thanks for review the the patch and for the comments

> https://www.postgresql.org/message-id/20200310190142.GB29065@telsasoft.com
>> My hope is to answer to questions like these:
>>
>> . is query (ever? usually?) using parallel paths?
>> . is query usefully using parallel paths?
>> . what queries are my max_parallel_workers(_per_process) being used for ?
>> . Are certain longrunning or frequently running queries which are using
>> parallel paths using all max_parallel_workers and precluding other queries
>> from using parallel query ? Or, are semi-short queries sometimes precluding
>> longrunning queries from using parallelism, when the long queries would
>> better benefit ?
> This patch is storing the number of times the query was planned/executed using
> parallelism, but not the number of workers. Would it make sense to instead
> store the the *number* of workers launched/planned ? Otherwise, it might be
> that a query is consistently planned to use a large number of workers, but then
> runs with few. I'm referring to the fields shown in "explain/analyze". (Then,
> the 2nd field should be renamed to "launched").
>
> Workers Planned: 2
> Workers Launched: 2

The main idea of the patch is to store the number of times the
statements were planned and executed in parallel, not the number of
workers used in the execution. Of course, what you mention can be
helpful, it will be given a review to see how it can be achieved

>
> I don't think this is doing the right thing for prepared statements, like
> PQprepare()/PQexecPrepared(), or SQL: PREPARE p AS SELECT; EXECUTE p;
>
> Right now, the docs say that it shows the "number of times the statement was
> planned to use parallelism", but the planning counter is incremented during
> each execution. PSS already shows "calls" and "plans" separately. The
> documentation doesn't mention prepared statements as a reason why they wouldn't
> match, which seems like a deficiency.

We will check it and see how  fix it

>
> This currently doesn't count parallel workers used by utility statements, such
> as CREATE INDEX and VACUUM (see max_parallel_maintenance_workers). If that's
> not easy to do, mention that in the docs as a limitation.

We will update the documentation with information related to this comment

>
> You should try to add some test to contrib/pg_stat_statements/sql, or add
> parallelism test to an existing test. Note that the number of parallel workers
> launched isn't stable, so you can't test that part..
>
> You modified pgss_store() to take two booleans, but pass "NULL" instead of
> "false". Curiously, of all the compilers in cirrusci, only MSVC complained ..
>
> "planed" is actually spelled "planned", with two enns.
>
> The patch has some leading/trailing whitespace (maybe shown by git log
> depending on your configuration).

OK, we will fix it

> Please add this patch to the next commitfest.
> https://commitfest.postgresql.org/39/
>

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Pavel Borisov 2022-07-22 15:56:20 Re: Custom tuplesorts for extensions
Previous Message Dave Cramer 2022-07-22 15:00:18 Proposal to provide the facility to set binary format output for specific OID's per session