Re: [HACKERS] CLUSTER command progress monitor

From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Tatsuro Yamada <yamada(dot)tatsuro(at)lab(dot)ntt(dot)co(dot)jp>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] CLUSTER command progress monitor
Date: 2018-12-18 21:01:59
Message-ID: 20181218210159.xtkltzm7flrwsm55@alvherre.pgsql
Lists: pgsql-hackers

On 2017-Nov-21, Peter Geoghegan wrote:

> On Mon, Oct 2, 2017 at 6:04 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> > Progress reporting on sorts seems like a tricky problem to me, as I
> > said before. In most cases, a sort is going to involve an initial
> > stage where it reads all the input tuples and writes out quicksorted
> > runs, and then a merge phase where it merges all the output tapes into
> > a sorted result. There are some complexities; for example, if the
> > number of tapes is really large, then we might need multiple merge
> > phases, only the last of which will produce tuples.
>
> This would ordinarily be the point at which I'd say "but you're very
> unlikely to require multiple passes for an external sort these days".
> But I won't say that on this thread, because CLUSTER generally has
> unusually wide tuples, and so is much more likely to be I/O bound, to
> require multiple passes, etc. (I bet the v10 enhancements
> disproportionately improved CLUSTER performance.)

When the seqscan-and-sort strategy is used, we feed tuplesort with every
tuple from the scan. Once that's completed, we call `performsort`, then
retrieve tuples.

If we see this in terms of tapes and merges, we can report the total
number of each of those that we have completed. As far as I understand,
we write one tape to completion, and only then start another one, right?
Since there's no way to know how many tapes/merges are needed in total,
it's not possible to compute a percentage of completion. That seems
okay -- we're just telling the user that progress is being made, and we
only report facts, not theory. Perhaps we can (also?) indicate disk I/O
utilization, in terms of the number of blocks written by tuplesort.
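
As a rough illustration of "facts, not theory", the counters could look
something like the sketch below. Everything here is hypothetical: the
`SortProgress` struct and the `sort_progress_*` functions are made-up
names for this example and do not exist in tuplesort.c. The point is only
that we count completed tapes, completed merges, and blocks written,
with no total and therefore no percentage.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical progress counters for an external sort.  None of these
 * names exist in PostgreSQL; this is a standalone sketch.
 */
typedef struct SortProgress
{
	long		tapes_completed;	/* tapes (runs) written to completion */
	long		merges_completed;	/* merge passes finished */
	long		blocks_written;		/* disk blocks written so far */
} SortProgress;

static void
sort_progress_init(SortProgress *p)
{
	assert(p != NULL);
	p->tapes_completed = 0;
	p->merges_completed = 0;
	p->blocks_written = 0;
}

/* Record that one tape was written out, using "blocks" disk blocks. */
static void
sort_progress_tape_done(SortProgress *p, long blocks)
{
	assert(p != NULL && blocks >= 0);
	p->tapes_completed++;
	p->blocks_written += blocks;
}

/* Record that one merge finished, having written "blocks" disk blocks. */
static void
sort_progress_merge_done(SortProgress *p, long blocks)
{
	assert(p != NULL && blocks >= 0);
	p->merges_completed++;
	p->blocks_written += blocks;
}

/*
 * Format the counters for display.  Only completed counts are shown,
 * since the total number of tapes and merges is not known in advance.
 * Returns the snprintf() result.
 */
static int
sort_progress_format(const SortProgress *p, char *buf, size_t buflen)
{
	assert(p != NULL && buf != NULL);
	return snprintf(buf, buflen, "tapes=%ld merges=%ld blocks=%ld",
					p->tapes_completed, p->merges_completed,
					p->blocks_written);
}
```

A monitoring view would then expose these three numbers as they are,
and the user can watch them increase.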

I suppose that in order to have tuplesort.c report progress, we would
have to have some kind of API that tuplesort would invoke internally to
indicate events such as "tape started/completed", "merge started/completed".
One idea is to use a callback system; each tuplesort caller could
optionally pass a callback to the "begin" function, for progress
reporting purposes. Initially only cluster.c would use it, but I
suppose eventually every tuplesort caller would want that.
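
To make the callback idea a bit more concrete, here is a minimal
standalone sketch. All names in it are assumptions made up for this
example (`SortProgressEvent`, `sort_progress_callback`, `ToySortState`,
`toy_sort_begin`, `toy_sort_report`, `cluster_progress_cb`); it does not
reflect the actual tuplesort API, and a real patch would have to decide
how the callback is passed to the existing "begin" functions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical events a sort could report; not part of tuplesort.c. */
typedef enum SortProgressEvent
{
	SORT_EVENT_TAPE_STARTED,
	SORT_EVENT_TAPE_COMPLETED,
	SORT_EVENT_MERGE_STARTED,
	SORT_EVENT_MERGE_COMPLETED
} SortProgressEvent;

/* Callback type: the caller supplies the function and an opaque argument. */
typedef void (*sort_progress_callback) (SortProgressEvent event, void *arg);

/* Toy stand-in for the sort state; holds only the callback fields. */
typedef struct ToySortState
{
	sort_progress_callback progress_cb; /* NULL if caller wants no reports */
	void	   *progress_arg;
} ToySortState;

/* The "begin" function takes an optional callback. */
static void
toy_sort_begin(ToySortState *state, sort_progress_callback cb, void *arg)
{
	assert(state != NULL);
	state->progress_cb = cb;
	state->progress_arg = arg;
}

/* Invoked internally by the sort code whenever an event occurs. */
static void
toy_sort_report(ToySortState *state, SortProgressEvent event)
{
	assert(state != NULL);
	if (state->progress_cb != NULL)
		state->progress_cb(event, state->progress_arg);
}

/* Caller side: what a cluster.c-like user might keep track of. */
typedef struct ClusterProgress
{
	long		tapes_completed;
	long		merges_completed;
} ClusterProgress;

static void
cluster_progress_cb(SortProgressEvent event, void *arg)
{
	ClusterProgress *p = (ClusterProgress *) arg;

	if (event == SORT_EVENT_TAPE_COMPLETED)
		p->tapes_completed++;
	else if (event == SORT_EVENT_MERGE_COMPLETED)
		p->merges_completed++;
	/* "started" events are ignored in this sketch */
}
```

Callers that pass a NULL callback get no reports and pay only for the
NULL check, so existing tuplesort users would not need to change until
they want progress reporting.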

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
