Re: progress report for ANALYZE

From: Tatsuro Yamada <tatsuro(dot)yamada(dot)tf(at)nttcom(dot)co(dot)jp>
To: Amit Langote <amitlangote09(at)gmail(dot)com>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: progress report for ANALYZE
Date: 2019-11-27 03:45:41
Message-ID: 725ba79a-fdfd-256a-a1ac-ace2a53b47b2@nttcom.co.jp_1
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi Amit-san!

>> I think include_children and current_relid are not enough to
>> understand the progress of analyzing inheritance trees, because even
>> with current_relid being updated, I can't tell how many more there
>> will be. I think it'd be better to show the total number of children
>> and the number of children processed, just like
>> pg_stat_progress_create_index does for partitions. So, instead of
>> include_children and current_relid, I think it's better to have
>> child_tables_total, child_tables_done, and current_child_relid, placed
>> last in the set of columns.
>
> Ah, I understood.
> I'll check pg_stat_progress_create_index does for partitions,
> and will create a new patch.

Fixed.

But I just remembered I replaced column name "*_table" with "*_relid"
based on Robert's comment three months ago, see below:

> /me reviews.
>
> + <entry><structfield>scanning_table</structfield></entry>
>
> I think this should be retitled to something that ends in 'relid',
> like all of the corresponding cases in existing progress views.
> Perhaps 'active_relid' or 'current_relid'.

So, it would be better to use same rule against child_tables_total and
child_tables_done. Thus I changed these column names to others and added
to the view. I also removed include_children and current_relid.
The following columns are new version.

<New columns of the view>
pid
datid
datname
relid
phase
sample_blks_total
heap_blks_scanned
ext_stats_total <= Added (based on Alvaro's comment)
ext_stats_computed <= Renamed
child_relids_total <= Added
child_relids_done <= Added
current_child_relid <= Added

>> Also, inheritance tree stats are created *after* creating single table
>> stats, so I think that it would be better to have a distinct phase
>> name for that, say "acquiring inherited sample rows". In
>> do_analyze_rel(), you can select which of two phases to set based on
>> whether inh is true or not. For partitioned tables, the progress
>> output will immediately switch to this phase, because partitioned
>> table itself is empty so there's nothing to do in the "acquiring
>> sample rows" phase.
>>
>> That's all for now.
>
>
> Okay! I'll also add the new phase "acquiring inherited sample rows" on
> the next patch.

Fixed.

I tried to abbreviate it to "acquiring inh sample rows" because I thought
"acquiring inherited sample rows" is a little long for the phase name.

Attached WIP patch is including these fixes:
- Remove columns: include_children and current_relid
- Add new columns: child_relieds_total, child_relids_done and current_child_relid
- Add new phase "acquiring inh sample rows"

Note: the document is not updated, I'll fix it later. :)

Attached testcase.sql is for creating base table and partitioning table.
You can check the patch by using the following procedures, easily.

Terminal #1
--------------
\a \t
select * from pg_stat_progress_analyze; \watch 0.0001
--------------

Terminal #2
--------------
\i testcase.sql
analyze hoge;
analyze hoge2;
--------------

Thanks,
Tatsuro Yamada

Attachment Content-Type Size
v9-Report-progress-for-ANALYZE.patch text/plain 18.4 KB
testcase.sql text/plain 707 bytes

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2019-11-27 03:54:16 Re: Allow CLUSTER, VACUUM FULL and REINDEX to change tablespace on the fly
Previous Message Tatsuro Yamada 2019-11-27 03:14:47 Re: progress report for ANALYZE