Re: progress report for ANALYZE

From: Tatsuro Yamada <tatsuro(dot)yamada(dot)tf(at)nttcom(dot)co(dot)jp>
To: Amit Langote <amitlangote09(at)gmail(dot)com>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: progress report for ANALYZE
Date: 2019-12-19 12:06:50
Message-ID: 173a3e8c-af2a-b87a-5126-97c50c4c38dc@nttcom.co.jp_1
Lists: pgsql-hackers

Hi All,

>> *All* phases are repeated in this case, not just "finalizing
>> analyze", because ANALYZE repeatedly runs for each partition after the
>> parent partitioned table's ANALYZE finishes.  ANALYZE's documentation
>> mentions that analyzing a partitioned table also analyzes all of its
>> partitions, so users should expect to see the progress information for
>> each partition.  So, I don't think we need to clarify that in only
>> one phase's description.  Maybe we can add a note after the phase
>> description table which mentions this implementation detail about
>> partitioned tables.  Like this:
>>
>>    <note>
>>     <para>
>>      Note that when <command>ANALYZE</command> is run on a partitioned table,
>>      all of its partitions are also recursively analyzed, as mentioned in
>>      <xref linkend="sql-analyze"/>.  In that case, <command>ANALYZE</command>
>>      progress is reported first for the parent table, whereby its inheritance
>>      statistics are collected, followed by that for each partition.
>>     </para>
>>    </note>
>
>
> Ah, you are right: all phases are repeated, so the fix shouldn't be
> limited to only one phase's description.
>
>
>> Some more comments on the documentation:
>>
>> +       Number of computed extended stats.  This counter only advances when the phase
>> +       is <literal>computing extended stats</literal>.
>>
>> Number of computed extended stats -> Number of extended stats computed
>
>
> Will fix.
>
>
>> +       Number of analyzed child tables.  This counter only advances when the phase
>> +       is <literal>computing extended stats</literal>.
>>
>> Regarding "Number of analyzed child tables", note that we don't
>> "analyze" child tables in this phase, only scan their blocks to collect
>> samples for the parent's ANALYZE.  Also, the 2nd sentence is wrong -- you
>> meant "when the phase is <literal>acquiring inherited sample
>> rows</literal>".  I suggest writing this as follows:
>>
>> Number of child tables scanned.  This counter only advances when the phase
>> is <literal>acquiring inherited sample rows</literal>.
>
>
> Oops, I will fix it. :)
>
>
>
>> +     <entry>OID of the child table currently being scanned.
>> +       It might be different from relid when analyzing tables that have child tables.
>>
>> I suggest:
>>
>> OID of the child table currently being scanned.  This field is only valid when
>> the phase is <literal>computing extended stats</literal>.
>
>
> Will fix.
>
>
>> +       The command is currently scanning the <structfield>current_relid</structfield>
>> +       to obtain samples.
>>
>> I suggest:
>>
>> The command is currently scanning the table given by
>> <structfield>current_relid</structfield> to obtain sample rows.
>
>
> Will fix.
>
>
>> +       The command is currently scanning the <structfield>current_child_table_relid</structfield>
>> +       to obtain samples.
>>
>> I suggest (based on the pg_stat_progress_create_index phase
>> descriptions):
>>
>> The command is currently scanning child tables to obtain sample rows.  Columns
>> <structfield>child_tables_total</structfield>,
>> <structfield>child_tables_done</structfield>, and
>> <structfield>current_child_table_relid</structfield> contain the progress
>> information for this phase.
>
>
> Will fix.
>
>
>> +    <row>
>> +     <entry><literal>computing stats</literal></entry>
>>
>> I think the phase name should really be "computing statistics", that
>> is, use the full word.
>
>
> Will fix.
>
>
>> +     <entry>
>> +       The command is computing stats from the samples obtained during the table scan.
>> +     </entry>
>> +    </row>
>>
>> So I suggest:
>>
>> The command is computing statistics from the sample rows obtained during
>> the table scan.
>
>
> Will fix.
>
>
>> +     <entry><literal>computing extended stats</literal></entry>
>> +     <entry>
>> +       The command is computing extended stats from the samples obtained in the previous phase.
>> +     </entry>
>>
>> I suggest:
>>
>> The command is computing extended statistics from the sample rows obtained
>> during the table scan.
>
>
> Will fix.

I fixed the documentation based on Amit's comments. :)
Please find the attached file.
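
To make the "this counter only advances when the phase is ..." wording
above concrete, here is a rough sketch (not taken from the patch) of how a
monitoring script might decide which progress columns are meaningful for a
given phase.  The phase names and the child-table column names come from the
review comments quoted above; the column name ext_stats_computed and the
helper functions are only assumptions for illustration.

```python
# Sketch only: maps a progress phase to the columns that carry
# information during that phase, as described in the review comments.
PHASE_COLUMNS = {
    "acquiring inherited sample rows": (
        "child_tables_total",
        "child_tables_done",
        "current_child_table_relid",
    ),
    # "ext_stats_computed" is a hypothetical column name, not from the thread.
    "computing extended stats": ("ext_stats_computed",),
}


def relevant_columns(phase):
    """Return the columns worth displaying for this phase (may be empty)."""
    return PHASE_COLUMNS.get(phase, ())


def describe(row):
    """Format one progress row (a dict), showing only phase-relevant columns."""
    phase = row["phase"]
    details = ", ".join(
        f"{col}={row[col]}" for col in relevant_columns(phase) if col in row
    )
    return f"{phase}: {details}" if details else phase
```

When a partitioned table is analyzed, a script like this would see the same
phases again for each partition, since all phases are repeated.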

Thanks,
Tatsuro Yamada

Attachment Content-Type Size
v11-Report-progress-for-ANALYZE.patch text/plain 20.1 KB
