Re: [HACKERS] CLUSTER command progress monitor

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Antonin Houska <ah(at)cybertec(dot)at>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Tatsuro Yamada <yamada(dot)tatsuro(at)lab(dot)ntt(dot)co(dot)jp>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] CLUSTER command progress monitor
Date: 2017-11-20 17:25:53
Message-ID: 16114.1511198753@sss.pgh.pa.us
Lists: pgsql-hackers

Antonin Houska <ah(at)cybertec(dot)at> writes:
> Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>> These two phases overlap, though. I believe progress reporting for
>> sorts is really hard.

> Whatever complexity is hidden in the sort, cost_sort() should have taken it
> into consideration when called via plan_cluster_use_sort(). Thus I think that
> once we have both startup and total cost, the current progress of the sort
> stage can be estimated from the current number of input and output
> rows. Please remind me if my proposal appears to be too simplistic.

Well, even if you assume that the planner's cost model omits nothing
(which I wouldn't bet on), its result is only going to be as good as the
planner's estimate of the number of rows to be sorted. And, in cases
where people actually care about progress monitoring, it's likely that
the planner got that wrong, maybe horribly so. I think it's a bad idea
for progress monitoring to depend on the planner's estimates in any way
whatsoever.

regards, tom lane
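
Editorial illustration (not part of the original message): a minimal sketch of the failure mode described above, assuming a hypothetical monitor that reports progress as rows processed divided by the planner's row estimate. The function names and the row counts are invented for illustration and do not correspond to any PostgreSQL API.

```python
def reported_progress(rows_done: int, estimated_rows: int) -> float:
    """Progress as a monitor based on the planner's row estimate would show it.

    Hypothetical helper: clamps at 1.0 because a monitor cannot sensibly
    report more than 100% even after the estimate has been exceeded.
    """
    if estimated_rows <= 0:
        return 0.0
    return min(rows_done / estimated_rows, 1.0)


def actual_progress(rows_done: int, actual_rows: int) -> float:
    """Progress measured against the true row count, known only afterwards."""
    if actual_rows <= 0:
        return 0.0
    return min(rows_done / actual_rows, 1.0)


if __name__ == "__main__":
    # Assumed numbers: the planner underestimates the input by a factor of 1000.
    estimated_rows = 1_000
    actual_rows = 1_000_000
    for rows_done in (500, 1_000, 500_000):
        print(
            rows_done,
            f"reported={reported_progress(rows_done, estimated_rows):.1%}",
            f"actual={actual_progress(rows_done, actual_rows):.1%}",
        )
```

With these assumed numbers the monitor reaches 100% after 1,000 rows and stays there while 99.9% of the work is still outstanding, which is the kind of misreport that makes estimate-based progress unhelpful exactly when the estimate is wrong.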
