Re: CLUSTER command progress monitor

From: Daniel Gustafsson <daniel(at)yesql(dot)se>
To: Tatsuro Yamada <yamada(dot)tatsuro(at)lab(dot)ntt(dot)co(dot)jp>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: CLUSTER command progress monitor
Date: 2017-10-02 00:29:12
Message-ID: A30CBB92-970B-4891-87D4-1648605A6E61@yesql.se
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers


> On 12 Sep 2017, at 14:57, Tatsuro Yamada <yamada(dot)tatsuro(at)lab(dot)ntt(dot)co(dot)jp> wrote:
>
> On 2017/09/12 21:20, Tatsuro Yamada wrote:
>> On 2017/09/11 23:38, Robert Haas wrote:
>>> On Sun, Sep 10, 2017 at 10:36 PM, Tatsuro Yamada
>>> <yamada(dot)tatsuro(at)lab(dot)ntt(dot)co(dot)jp> wrote:
>>>> Thanks for the comment.
>>>>
>>>> As you know, CLUSTER command uses SEQ SCAN or INDEX SCAN as a scan method by
>>>> cost estimation. In the case of SEQ SCAN, these two phases not overlap.
>>>> However, in INDEX SCAN, it overlaps. Therefore I created the phase of "scan
>>>> heap and write new heap" when INDEX SCAN was selected.
>>>>
>>>> I agree that progress reporting for sort is difficult. So it only reports
>>>> the phase ("sorting tuples") in the current design of progress monitor of
>>>> cluster.
>>>> It doesn't report counter of sort.
>>>
>>> Doesn't that make it almost useless? I would guess that scanning the
>>> heap and writing the new heap would ordinarily account for most of the
>>> runtime, or at least enough that you're going to want something more
>>> than just knowing that's the phase you're in.
>>
>> Hmmm, Should I add a counter in tuplesort.c? (tuplesort_performsort())
>> I know that external merge sort takes a time than quick sort.
>> I'll try investigating how to get a counter from external merge sort processing.
>> Is this the right way?
>>
>>
>>>>> The patch is getting the value reported as heap_tuples_total from
>>>>> OldHeap->rd_rel->reltuples. I think this is pointless: the user can
>>>>> see that value anyway if they wish. The point of the progress
>>>>> counters is to expose things the user couldn't otherwise see. It's
>>>>> also not necessarily accurate: it's only an estimate in the best case,
>>>>> and may be way off if the relation has recently be extended by a large
>>>>> amount. I think it's pretty important that we try hard to only report
>>>>> values that are known to be accurate, because users hate (and mock)
>>>>> inaccurate progress reports.
>>>>
>>>> Do you mean to use the number of rows by using below calculation instead
>>>> OldHeap->rd_rel->reltuples?
>>>>
>>>> estimate rows = physical table size / average row length
>>>
>>> No, I mean don't report it at all. The caller can do that calculation
>>> if they wish, without any help from the progress reporting machinery.
>>
>> I see. I'll remove that column on next patch.
>
>
> I will summarize the current design and future corrections before sending
> the next patch.
>
>
> === Current design ===
>
> CLUSTER command may use Index Scan or Seq Scan when scanning the heap.
> Depending on which one is chosen, the command will proceed in the
> following sequence of phases:
>
> * Scan method: Seq Scan
> 1. scanning heap (*1)
> 2. sorting tuples (*2)
> 3. writing new heap (*1)
> 5. swapping relation files (*2)
> 6. rebuilding index (*2)
> 7. performing final cleanup (*2)
>
> * Scan method: Index Scan
> 4. scan heap and write new heap (*1)
> 5. swapping relation files (*2)
> 6. rebuilding index (*2)
> 7. performing final cleanup (*2)
>
> VACUUM FULL command will proceed in the following sequence of phases:
>
> 1. scanning heap (*1)
> 5. swapping relation files (*2)
> 6. rebuilding index (*2)
> 7. performing final cleanup (*2)
>
> (*1): increasing the value in heap_tuples_scanned column
> (*2): only shows the phase in the phase column
>
> The view provides the information of CLUSTER command progress details as follows
> # \d pg_stat_progress_cluster
> View "pg_catalog.pg_stat_progress_cluster"
> Column | Type | Collation | Nullable | Default
> ---------------------------+---------+-----------+----------+---------
> pid | integer | | |
> datid | oid | | |
> datname | name | | |
> relid | oid | | |
> command | text | | |
> phase | text | | |
> scan_method | text | | |
> scan_index_relid | bigint | | |
> heap_tuples_total | bigint | | |
> heap_tuples_scanned | bigint | | |
> heap_tuples_vacuumed | bigint | | |
> heap_tuples_recently_dead | bigint | | |
>
>
> === It will be changed on next patch ===
>
> - Rename to pg_stat_progress_reolg from pg_stat_progress_cluster
> - Remove heap_tuples_total column from the view
> - Add a progress counter in the phase of "sorting tuples" (difficult?!)
>
>
> === My test case as a bonus ===
>
> I share my test case of progress monitor.
> If someone wants to watch the current progress monitor, you can use
> this test case as a example.
>
> [Terminal1]
> Run this query on psql:
>
> select * from pg_stat_progress_cluster; \watch 0.05
>
> [Terminal2]
> Run these queries on psql:
>
> drop table t1;
>
> create table t1 as select a, random() * 1000 as b from generate_series(0, 99999999) a;
> create index idx_t1 on t1(a);
> create index idx_t1_b on t1(b);
> analyze t1;
>
> -- index scan
> set enable_seqscan to off;
> cluster verbose t1 using idx_t1;
>
> -- seq scan
> set enable_seqscan to on;
> set enable_indexscan to off;
> cluster verbose t1 using idx_t1;
>
> -- only given table name to cluster command
> cluster verbose t1;
>
> -- only cluster command
> cluster verbose;
>
> -- vacuum full
> vacuum full t1;
>
> -- vacuum full
> vacuum full;

Based on this thread, this patch has been marked Returned with Feedback.
Please re-submit a new version to a future commitfest.

cheers ./daniel

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Daniel Gustafsson 2017-10-02 00:33:12 Re: Optional message to user when terminating/cancelling backend
Previous Message Robert Haas 2017-10-02 00:22:21 Re: Fwd: Have a problem with citext