CLUSTER command progress monitor

From: Tatsuro Yamada <yamada(dot)tatsuro(at)lab(dot)ntt(dot)co(dot)jp>
To: pgsql-hackers(at)postgresql(dot)org
Subject: CLUSTER command progress monitor
Date: 2017-08-31 02:12:02
Message-ID: 59A77072.3090401@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox
Thread:
Lists: pgsql-hackers

Hi,

Following is a proposal for reporting the progress of CLUSTER command:

It seems that the following could be the phases of CLUSTER processing:

1. scanning heap
2. sort tuples
3. writing new heap
4. scan heap and write new heap
5. swapping relation files
6. rebuild index
7. performing final cleanup

These phases are based on Rahila's presentation at PGCon 2017
(https://www.pgcon.org/2017/schedule/attachments/472_Progress%20Measurement%20PostgreSQL.pdf)
and I added some phases on it.

CLUSTER command may use Index Scan or Seq Scan when scanning the heap.
Depending on which one is chosen, the command will proceed in the
following sequence of phases:

* Seq Scan
1. scanning heap
2. sort tuples
3. writing new heap
5. swapping relation files
6. rebuild index
7. performing final cleanup

* Index Scan
4. scan heap and write new heap
5. swapping relation files
6. rebuild index
7. performing final cleanup

The view provides the information of CLUSTER command progress details as follows
postgres=# \d pg_stat_progress_cluster
View "pg_catalog.pg_stat_progress_cluster"
Column | Type | Collation | Nullable | Default
---------------------+---------+-----------+----------+---------
pid | integer | | |
datid | oid | | |
datname | name | | |
relid | oid | | |
phase | text | | |
scan_method | text | | |
scan_index_relid | bigint | | |
heap_tuples_total | bigint | | |
heap_tuples_scanned | bigint | | |

Then I have questions.

* Should we have separate views for them? Or should both be covered by the
same view with some indication of which command (CLUSTER or VACUUM FULL)
is actually running?
I mean this progress monitor could be covering not only CLUSTER command but also
VACUUM FULL command.

* I chose tuples as scan heap's counter (heap_tuples_scanned) since it's not
easy to get current blocks from Index Scan. Is it Ok?

I'll add this patch to CF2017-09.
Any comments or suggestion are welcome.

Regards,
Tatsuro Yamada
NTT Open Source Software Center

Attachment Content-Type Size
progress_monitor_cluster_v1.patch text/x-patch 16.1 KB

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Eisentraut 2017-08-31 02:23:01 document and use SPI_result_code_string()
Previous Message Bossart, Nathan 2017-08-31 01:52:13 Re: [Proposal] Allow users to specify multiple tables in VACUUM commands