Re: [HACKERS] CLUSTER command progress monitor

From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Tatsuro Yamada <yamada(dot)tatsuro(at)lab(dot)ntt(dot)co(dot)jp>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] CLUSTER command progress monitor
Date: 2018-12-18 22:47:36
Message-ID: 20181218224736.6e5jakieqigjlt7f@alvherre.pgsql
Lists: pgsql-hackers

On 2018-Dec-18, Peter Geoghegan wrote:

> On Tue, Dec 18, 2018 at 1:02 PM Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> wrote:
> > If we see this in terms of tapes and merges, we can report the total
> > number of each of those that we have completed. As far as I understand,
> > we write one tape to completion, and only then start another one, right?
> > Since there's no way to know how many tapes/merges are needed in total,
> > it's not possible to compute a percentage of completion. That seems
> > okay -- we're just telling the user that progress is being made, and we
> > only report facts not theory. Perhaps we can (also?) indicate disk I/O
> > utilization, in terms of the number of blocks written by tuplesort.
>
> The number of blocks tuplesort uses is constant from the end of
> initial run generation, since logtape.c will recycle blocks.

Well, if you think about individual blocks in terms of storage space,
maybe that's true, but I meant it in the Heraclitean sense of never
stepping into the same river twice -- the second time you write the
block, it's not the same block you wrote before, so you count it twice.
It's not the actual disk space utilization that matters, but how much
I/O you have done (even if it is just to kernel cache, I suppose).
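
To make the distinction concrete, here is a minimal sketch in C of the
two numbers being discussed: blocks currently in use (which stays flat
once blocks are recycled) versus blocks written (which keeps growing).
Every name here is hypothetical; nothing below is taken from logtape.c
or tuplesort.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_NBLOCKS 8          /* hypothetical pool of recyclable blocks */

/* Hypothetical bookkeeping; these names do not exist in logtape.c. */
typedef struct DemoIoStats
{
    bool        in_use[DEMO_NBLOCKS];   /* blocks currently holding data */
    uint64_t    blocks_written;         /* every write counts, recycled or not */
} DemoIoStats;

/* Record a write to block blkno: space use may not change, I/O always does. */
static void
demo_write_block(DemoIoStats *st, int blkno)
{
    assert(blkno >= 0 && blkno < DEMO_NBLOCKS);
    st->in_use[blkno] = true;
    st->blocks_written++;
}

/* Release block blkno so that a later write can recycle it. */
static void
demo_free_block(DemoIoStats *st, int blkno)
{
    assert(blkno >= 0 && blkno < DEMO_NBLOCKS);
    st->in_use[blkno] = false;
}

/* Storage-space view: number of blocks holding data right now. */
static int
demo_blocks_in_use(const DemoIoStats *st)
{
    int         n = 0;

    for (int i = 0; i < DEMO_NBLOCKS; i++)
    {
        if (st->in_use[i])
            n++;
    }
    return n;
}
```

Writing block 0, writing block 1, freeing block 0 and writing block 0
again leaves two blocks in use but three blocks written; the second
figure is the one meant above.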

> > I suppose that in order to have tuplesort.c report progress, we would
> > have to have some kind of API that tuplesort would invoke internally to
> > indicate events such as "tape started/completed", "merge started/completed".
> > One idea is to use a callback system; each tuplesort caller could
> > optionally pass a callback to the "begin" function, for progress
> > reporting purposes. Initially only cluster.c would use it, but I
> > suppose eventually every tuplesort caller would want that.
>
> I think that you could have a callback that did something with the
> information currently reported by trace_sort. That's not a bad way of
> scoping the problem. That's how I myself monitor the progress of a
> sort, and it works pretty well (whether or not that means other people
> can do it is not exactly clear to me).

Thanks, that looks useful.

I suppose mapping such numbers to actual progress is a bit of an art (or
intuition as you say), but it seems to be the best we can do, if we do
anything at all.
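
For illustration, the callback idea quoted above might be sketched
roughly as follows. This is only a shape under assumptions: the event
names, the struct and the "begin" signature are all invented for the
example and do not correspond to anything in tuplesort.c today.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical events a sort could announce; not an existing API. */
typedef enum DemoSortEvent
{
    DEMO_TAPE_STARTED,
    DEMO_TAPE_COMPLETED,
    DEMO_MERGE_STARTED,
    DEMO_MERGE_COMPLETED
} DemoSortEvent;

/* Caller-supplied callback; arg is passed back to it unchanged. */
typedef void (*DemoProgressCallback) (DemoSortEvent event, void *arg);

/* Stand-in for the sort state; only the progress fields are shown. */
typedef struct DemoSortState
{
    DemoProgressCallback progress_cb;   /* NULL if the caller opted out */
    void       *progress_arg;
} DemoSortState;

/* Stand-in for a "begin" function that optionally accepts a callback. */
static void
demo_sort_begin(DemoSortState *state, DemoProgressCallback cb, void *arg)
{
    state->progress_cb = cb;
    state->progress_arg = arg;
}

/* Called internally by the sort at each event; a no-op without a callback. */
static void
demo_sort_report(const DemoSortState *state, DemoSortEvent event)
{
    if (state->progress_cb != NULL)
        state->progress_cb(event, state->progress_arg);
}

/* Example consumer: counts completed tapes and merges, nothing more. */
typedef struct DemoCounters
{
    int         tapes_completed;
    int         merges_completed;
} DemoCounters;

static void
demo_count_events(DemoSortEvent event, void *arg)
{
    DemoCounters *c = (DemoCounters *) arg;

    if (event == DEMO_TAPE_COMPLETED)
        c->tapes_completed++;
    else if (event == DEMO_MERGE_COMPLETED)
        c->merges_completed++;
}
```

A caller that does not care about progress passes NULL and pays only
for the pointer test; a caller such as cluster.c would presumably
forward the counters to whatever progress view it maintains.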

> We predict the number of merge passes within cost_sort() already. That
> doesn't seem all that hard to generalize, so that you report the
> expected number of passes against the current pass. Some passes are
> much quicker than others, but you generally don't have that many with
> realistic cases. I don't expect that it will work very well with an
> internal sort, but in the case of CLUSTER that almost seems
> irrelevant. And maybe even in all cases.

How good are those predictions? The feeling I get from this thread is
that if the estimation of the number of passes is unreliable, it's
better not to report it at all; just return how many we've done thus
far. It's undesirable to report that we're about 150% done (or to take
hours to reach 40% and then suddenly be finished).
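
A small sketch in C of the failure mode, with hypothetical helper names
and made-up numbers: dividing passes done by an estimate that turned
out too low yields more than 100%, whereas a facts-only report has no
such problem.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Naive percentage against a predicted pass count (hypothetical helper).
 * Nothing stops the result from exceeding 100 when the prediction is low.
 */
static int
demo_naive_percent(int passes_done, int estimated_passes)
{
    assert(estimated_passes > 0);
    return passes_done * 100 / estimated_passes;
}

/*
 * Facts-only report: state what has been completed, predict nothing.
 * Returns the value of snprintf(), i.e. the length of the full message.
 */
static int
demo_facts_only(char *buf, size_t buflen, int passes_done)
{
    return snprintf(buf, buflen, "merge passes completed: %d", passes_done);
}
```

With a prediction of 2 passes and 3 actually completed, the naive
helper returns 150, while the facts-only message simply reads
"merge passes completed: 3".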

I wonder if internal sorts are really all that interesting from the PoV
of progress reporting. Also, I have the impression that quicksort isn't
very amenable to letting you know how much work is left.

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
