Re: [HACKERS] CLUSTER command progress monitor

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc: Tatsuro Yamada <tatsuro(dot)yamada(dot)tf(at)nttcom(dot)co(dot)jp>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Rafia Sabih <rafia(dot)pghackers(at)gmail(dot)com>, Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, Peter Geoghegan <pg(at)bowt(dot)ie>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
Subject: Re: [HACKERS] CLUSTER command progress monitor
Date: 2019-09-03 05:52:28
Message-ID: 20190903055228.GE3765@paquier.xyz
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Sep 03, 2019 at 01:59:00PM +0900, Masahiko Sawada wrote:
> After more thought, even if we don't start a new command progress when
> there is another one already started the progress update functions
> could be called and these functions don't specify the command type.
> Therefore, the progress information might be changed with wrong value
> by different command. Probably we can change the caller of progress
> updating function so that it doesn't call all of them if the command
> could not start a new progress report, but it might be a big change.

That's one issue.

> As an alternative idea, we can make pgstat_progress_end_command() have
> one argument that is command the caller wants to end. That is, we
> don't end the command progress when the specified command type doesn't
> match to the command type of current running command progress. We
> unconditionally clear the progress information at CommitTransaction()
> and AbortTransaction() but we can do that by passing
> PROGRESS_COMMAND_INVALID to pgstat_progress_end_command().

Possibly. I don't dislike the idea of piling up the progress
information for cascading calls and I would use that with a depth
counter and a fixed-size array.

> BTW the following condition in pgstat_progress_end_command() seems to
> be wrong. We should return from the function when either
> pgstat_track_activities is disabled or the current progress command is
> invalid.
>
> if (!pgstat_track_activities
> && beentry->st_progress_command == PROGRESS_COMMAND_INVALID)
> return;
>
> I've attached the patch fixes the issue I newly found.

Indeed, good catch. This is wrong since b6fb647 which has introduced
the progress reports. I'll fix that one and back-patch if there are
no objections.

With my RMT hat on for v12, I don't think that it is really the moment
to discuss how we want to change this API post beta3, and we have room
for that in future development cycles. There are quite some questions
which need answers I am unsure of:
- Do we really want to support nested calls of progress reports for
multiple command?
- Perhaps for some commands it makes sense to have an overlap of the
fields used, but we need a clear definition of what can be done or
not. I am not really comfortable with the idea of having in
reindex_relation() a progress report related only to CLUSTER, which is
also a REINDEX code path. The semantics shared between both commands
need to be thought a bit more. For example
PROGRESS_CLUSTER_INDEX_REBUILD_COUNT could cause the system catalog to
report PROGRESS_CREATEIDX_PHASE_WAIT_3 because of an incorrect command
type, which would be just wrong for a CLUSTER command.
- Which command should be reported to the user, only the upper-level
one?
- Perhaps we can live only with the approach of not registering a new
command if one already exists, and actually be more flexible with the
phase fields used, in short we use unique numbers for each phase?

The most conservative bet from a release point of view, and actually
my bet because that's safe, would be to basically revert 6f97457
(CLUSTER), 03f9e5c (REINDEX) and ab0dfc96 (CREATE INDEX which has
overlaps with REINDEX in the build and validation paths). What I am
really scared of is that we have just barely scratched the surface of
the issues caused by the inter-dependencies between all the code
paths of those commands, and that we have much more waiting behind
this open item.
--
Michael

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Thomas Munro 2019-09-03 05:54:21 Re: pgbench - rework variable management
Previous Message Erik Rijkers 2019-09-03 05:50:29 Re: row filtering for logical replication