|From:||Tatsuro Yamada <yamada(dot)tatsuro(at)lab(dot)ntt(dot)co(dot)jp>|
|To:||Robert Haas <robertmhaas(at)gmail(dot)com>|
|Cc:||Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, Peter Geoghegan <pg(at)bowt(dot)ie>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>|
|Subject:||Re: [HACKERS] CLUSTER command progress monitor|
|Views:||Raw Message | Whole Thread | Download mbox | Resend email|
On 2019/02/23 6:02, Robert Haas wrote:
> On Fri, Dec 28, 2018 at 3:20 AM Tatsuro Yamada
> <yamada(dot)tatsuro(at)lab(dot)ntt(dot)co(dot)jp> wrote:
>> This patch is rebased on HEAD.
>> I'll tackle revising the patch based on feedbacks next month.
> + Running <command>VACUUM FULL</command> is listed in
> + view because it uses <command>CLUSTER</command> command
> internally. See <xref linkend='cluster-progress-reporting'>.
> It's not really true to say that VACUUM FULL uses the CLUSTER command
> internally. It's not really true. It uses a good chunk of the same
> infrastructure, but it certainly doesn't use the actual command, and
> it's not really the exact same thing either, because internally it's
> doing a sequential scan but no sort, which never happens with CLUSTER.
> I'm not sure exactly how to rephrase this, but I think we need to make
> it more precise.
> One idea is that maybe we should try to think of a design that could
> also handle the rewriting variants of ALTER TABLE, and call it
> pg_stat_progress_rewrite. Maybe that's moving the goalposts too far,
> but I'm not saying we'd necessarily have to do all the work now, just
> have a view that we think could also handle that. Then again, maybe
> the needs are too different.
Hmm..., I see.
If possible, I'd like to stop thinking of VACUUM FULL to avoid complication of
For now, I haven't enough time to design pg_stat_progress_rewrite. I suppose that
it's tough work.
> + Whenever <command>CLUSTER</command> is running, the
> + <structname>pg_stat_progress_cluster</structname> view will contain
> + one row for each backend that is currently clustering or vacuuming
> (VACUUM FULL).
> That sentence contradicts itself. Just say that it contains a row for
> each backend that is currently running CLUSTER or VACUUM FULL.
> @@ -105,6 +107,7 @@ static void reform_and_rewrite_tuple(HeapTuple tuple,
> cluster(ClusterStmt *stmt, bool isTopLevel)
> if (stmt->relation != NULL)
> /* This is the single-relation case. */
> Useless hunk.
> @@ -186,7 +189,9 @@ cluster(ClusterStmt *stmt, bool isTopLevel)
> heap_close(rel, NoLock);
> /* Do the job. */
> + pgstat_progress_start_command(PROGRESS_COMMAND_CLUSTER, tableOid);
> cluster_rel(tableOid, indexOid, stmt->options);
> + pgstat_progress_end_command();
> It seems like that stuff should be inside cluster_rel().
> + /* Set reltuples to total_tuples */
> + pgstat_progress_update_param(PROGRESS_CLUSTER_TOTAL_HEAP_TUPLES,
> I object. If the user wants that, they can get it from pg_class
> themselves via an SQL query. It's also an estimate, not something we
> know to be accurate; I want us to only report facts here, not theories
I understand that progress monitor should only report facts, so I
removed that code.
> + pgstat_progress_update_param(PROGRESS_CLUSTER_PHASE,
> + pgstat_progress_update_param(PROGRESS_CLUSTER_SCAN_METHOD,
> I think you should use pgstat_progress_update_multi_param() if
> updating multiple parameters at the same time.
> Also, some lines in this patch, such as this one, are very long.
> Consider techniques to reduce the line length to 80 characters or
> less, such as inserting a line break between the two arguments.
> + /* Set scan_method to NULL */
> + pgstat_progress_update_param(PROGRESS_CLUSTER_SCAN_METHOD, -1);
> NULL and -1 are not the same thing.
> I think that we shouldn't have both
> PROGRESS_CLUSTER_PHASE_SCAN_HEAP_AND_WRITE_NEW_HEAP and
> PROGRESS_CLUSTER_PHASE_SCAN_HEAP. They're the same thing. Let's just
> use PROGRESS_CLUSTER_PHASE_SCAN_HEAP for both. Actually, better yet,
> let's get rid of PROGRESS_CLUSTER_SCAN_METHOD and have
> PROGRESS_CLUSTER_PHASE_SEQ_SCAN_HEAP and
> PROGRESS_CLUSTER_PHASE_INDEX_SCAN_HEAP. That seems noticeably
> I agree that it's acceptable to report
> PROGRESS_CLUSTER_HEAP_TUPLES_VACUUMED and
> PROGRESS_CLUSTER_HEAP_TUPLES_RECENTLY_DEAD, but I'm not sure I
> understand why it's valuable to do so in the context of a progress
Actually, I'm not sure why I added it since so much time has passed. :(
So, I'll remove PROGRESS_CLUSTER_HEAP_TUPLES_RECENTLY_DEAD at least.
Attached patch is wip patch.
|Next Message||Masahiko Sawada||2019-03-01 05:16:40||Re: [HACKERS] Block level parallel vacuum|
|Previous Message||Tom Lane||2019-03-01 04:50:16||Re: pg_partition_tree crashes for a non-defined relation|