Re: [HACKERS] CLUSTER command progress monitor

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Tatsuro Yamada <yamada(dot)tatsuro(at)lab(dot)ntt(dot)co(dot)jp>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] CLUSTER command progress monitor
Date: 2018-12-18 22:25:12
Message-ID: CAH2-Wzn0gDwAZuOPHrxxzAcdwTvD60oghmR3aCDLNATjhf4rEw@mail.gmail.com
Lists: pgsql-hackers

On Tue, Dec 18, 2018 at 1:02 PM Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> wrote:
> If we see this in terms of tapes and merges, we can report the total
> number of each of those that we have completed. As far as I understand,
> we write one tape to completion, and only then start another one, right?
> Since there's no way to know how many tapes/merges are needed in total,
> it's not possible to compute a percentage of completion. That seems
> okay -- we're just telling the user that progress is being made, and we
> only report facts not theory. Perhaps we can (also?) indicate disk I/O
> utilization, in terms of the number of blocks written by tuplesort.

The number of blocks tuplesort uses is constant from the end of
initial run generation, since logtape.c will recycle blocks.

> I suppose that in order to have tuplesort.c report progress, we would
> have to have some kind of API that tuplesort would invoke internally to
> indicate events such as "tape started/completed", "merge started/completed".
> One idea is to use a callback system; each tuplesort caller could
> optionally pass a callback to the "begin" function, for progress
> reporting purposes. Initially only cluster.c would use it, but I
> suppose eventually every tuplesort caller would want that.

I think that you could have a callback that did something with the
information currently reported by trace_sort. That's not a bad way of
scoping the problem. That's how I monitor the progress of a sort myself,
and it works pretty well (though whether other people could get as much
out of it is less clear to me).
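
For concreteness, here is a minimal sketch of what such a hook could look
like. These names are purely illustrative -- nothing like this exists in
tuplesort.c today -- but the shape is roughly what I have in mind:
tuplesort invokes the callback at the points where trace_sort currently
logs, and the caller decides what to do with the counters.

/*
 * Hypothetical sketch only -- none of these names are real tuplesort APIs.
 * The caller (say, cluster.c) passes a callback to the tuplesort "begin"
 * function; tuplesort then invokes it at the points where trace_sort
 * currently emits log lines.
 */
#include <stdint.h>

typedef enum TuplesortProgressEvent
{
    TSPROGRESS_RUN_WRITTEN,      /* finished writing one initial run */
    TSPROGRESS_MERGE_PASS_DONE,  /* finished one merge pass */
    TSPROGRESS_SORT_DONE         /* performsort finished */
} TuplesortProgressEvent;

/* "count" is the number of runs or passes completed so far */
typedef void (*TuplesortProgressCallback) (TuplesortProgressEvent event,
                                           int64_t count,
                                           void *callback_arg);

/* cluster.c's callback might just copy the counter into its progress view */
static void
cluster_sort_progress(TuplesortProgressEvent event, int64_t count,
                      void *callback_arg)
{
    (void) event;
    (void) count;
    (void) callback_arg;
    /* e.g. update the CLUSTER progress counters here */
}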

We already predict the number of merge passes within cost_sort(). That
doesn't seem all that hard to generalize, so that you can report the
current pass against the expected total number of passes. Some passes are
much quicker than others, but you generally don't have that many with
realistic cases. I don't expect that it will work very well with an
internal sort, but in the case of CLUSTER that almost seems irrelevant.
Maybe it's irrelevant in all cases.
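
To make the arithmetic concrete: with R initial runs and a merge order of
M (the number of tapes merged at once), an external sort needs roughly
ceil(log(R) / log(M)) merge passes. A sketch of that estimate, with
made-up names rather than the actual costsize.c code:

#include <math.h>

/*
 * Illustrative estimate of the number of merge passes an external sort
 * needs, given the number of initial runs and the merge order.  This is
 * the same kind of estimate cost_sort() makes; the names are made up.
 */
static double
estimated_merge_passes(double num_runs, double merge_order)
{
    if (num_runs <= merge_order)
        return 1.0;             /* one final merge is enough */
    return ceil(log(num_runs) / log(merge_order));
}

For example, 1,000 initial runs with a merge order of 6 comes out to
ceil(log(1000) / log(6)) = 4 passes, so the view could show something
like "pass 2 of ~4".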

I think that the user is going to have to be willing to develop some
intuition about the progress for it to be all that useful. They're really
looking for a clue as to whether they'll have to wait an hour, a day, or a
week, and trace_sort-style information does give you some idea of that.
(BTW, dtrace probes can already give the user much the same information --
I think that more people should use those, since tracing technology on
Linux has improved drastically in the last few years.)

--
Peter Geoghegan
