Re: [PATCH v2] Progress command to monitor progression of long running SQL queries

From: Remi Colinet <remi(dot)colinet(at)gmail(dot)com>
To: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
Cc: PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH v2] Progress command to monitor progression of long running SQL queries
Date: 2017-05-11 16:09:36
Message-ID: CADdR5nxZ8kxMpmcnAXnKYUe8Mdx6J-Hxo08sPLBqQdfYk=18rQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

2017-05-11 4:05 GMT+02:00 Michael Paquier <michael(dot)paquier(at)gmail(dot)com>:

> On Thu, May 11, 2017 at 1:40 AM, Remi Colinet <remi(dot)colinet(at)gmail(dot)com>
> wrote:
> > This is version 2 of the new command name PROGRESS which I wrote in
> order to
> > monitor long running SQL queries in a Postgres backend process.
>
> It would be a good idea to add this patch to the next commit fest if
> you are expecting some feedback:
> https://commitfest.postgresql.org/14/
> But please don't expect immediate feedback, the primary focus these
> days is to stabilize the upcoming release.
>

Yes, I understand. I will submit the patch to the commit fest once I will
have had a few feedbacks.
I already received some interesting suggestions such as using a SQL
function instead of a whole new command.

The purpose for the moment is to gather feedback and advise about such
feature.

>
> > New command justification
> > ====================
> >
> > [root(at)rco pg]# git diff --stat master..
> > [...]
> > 58 files changed, 5972 insertions(+), 2619 deletions(-)
>
> For your first patch, you may want something less... Challenging.
> Please note as well that it would be nice if you review other patches
> to get more familiar with the community process, it is expected that
> for each patch submitted, an equivalent amount of review is done.
> That's hard to measure but for a patch as large as that you will need
> to pick up at least one patch equivalent in difficulty.
>

Acked

> Regarding the patch, this is way to close to the progress facility
> already in place. So why don't you extend it for the executor? We can
> likely come up with something that can help, though I see the part
> where the plan run by a backend is shared among all processes as a
> major bottleneck in performance. You have at least three concepts
> different here:
> - Store plans run in shared memory to allow access to any other processes.
> - Allow a process to look at the plan run by another one with a SQL
> interface.
> - Track the progress of a running plan, via pg_stat_activity.
> All in all, I think that a new command is not adapted, and that you
> had better split each concept before implementing something
> over-engineered like the patch attached.
>

I initially had a look at the progress facility but this seemed to me very
far from the current goal.
Btw, I will reconsider it. Things may have changed.

The plan and its planstate are not permanently shared with other backends.
This would be useless and very costly.

As long as the PROGRESS command is not used, the patch has a very low
impact on the executor.
A few counters and flags are added/updated by the patch in the executor.
This is all what is needed to track progress of execution.

For instance in src/backend/executor/execProcnode.c, I've added:

ExecInitNode()
+ result->plan_rows = 0;

ExecProcNode()
+ node->plan_rows++;

ExecEndNode()
+ node->plan_rows = 0;

This is enough to compute the number of rows fetched by the plan at this
step of the plan.
A few counters have been added to track sorts, scans, hashes, tuple stores.

The disk space used by nodes is also traced. This shows disk resource
needed to complete a SQL query.
This was not initially a purpose of the patch, but I find this very handy.

Some debugging code will also be removed making the change in the executor
very small.
The code change in the executor is less than 50 lines and it is about 150
lines in the tuple sort/store code mainly to add functions to fetch sort
and store status.

The execution tree is dumped in shared memory only when requested by a
monitoring backend sending a signal to the monitored backend.
The monitoring backend first sends a signal which is noted by the backend
running the long SQL query.
Once the long SQL query can deal with the request (interrupt enabled), it
dumps its execution tree in shared memory and then follows up its
execution.

So the dump of the execution tree is done only when it is requested.
This does not seem to incur any performance hit. Or I missed something.

Once the concept is mature, I would split the patch in small consistent
parts to allow a smooth and clean merge.

Regards
Remi

> --
> Michael
>

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Remi Colinet 2017-05-11 16:15:56 Re: [PATCH v2] Progress command to monitor progression of long running SQL queries
Previous Message Dilip Kumar 2017-05-11 16:02:11 Re: [POC] hash partitioning