Re: ANALYZE command progress checker

From: vinayak <Pokale_Vinayak_q3(at)lab(dot)ntt(dot)co(dot)jp>
To: Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>
Cc: Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, David Steele <david(at)pgmasters(dot)net>, David Fetter <david(at)fetter(dot)org>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ANALYZE command progress checker
Date: 2017-03-24 07:41:34
Message-ID: a31277a9-2bfe-e966-c574-715302521162@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers


On 2017/03/23 15:04, Haribabu Kommi wrote:
>
>
> On Wed, Mar 22, 2017 at 8:11 PM, vinayak
> <Pokale_Vinayak_q3(at)lab(dot)ntt(dot)co(dot)jp
> <mailto:Pokale_Vinayak_q3(at)lab(dot)ntt(dot)co(dot)jp>> wrote:
>
>
> On 2017/03/21 21:25, Haribabu Kommi wrote:
>>
>>
>> On Tue, Mar 21, 2017 at 3:41 PM, vinayak
>> <Pokale_Vinayak_q3(at)lab(dot)ntt(dot)co(dot)jp
>> <mailto:Pokale_Vinayak_q3(at)lab(dot)ntt(dot)co(dot)jp>> wrote:
>>
>> Thank you for testing the patch on Windows platform.
>>
>>
>> Thanks for the updated patch.
>>
>> It works good for a normal relation. But for a relation that
>> contains child tables,
>> the PROGRESS_ANALYZE_NUM_ROWS_SAMPLED produces wrong results.
>>
> Thank you for reviewing the patch.
> The attached patch implements a way to report sample rows count
> from acquire_sample_rows() even if called for child tables.
>> How about adding another phase called
>> PROGRESS_ANALYZE_PHASE_COLLECT_INHERIT_SAMPLE_ROWS
>> and set this phase only when it is an inheritance analyze
>> operation. And adding
>> some explanation of ROWS_SAMPLED phase about inheritance tables
>> how these sampled rows are calculated will provide good analyze
>> progress of
>> relation that contains child relations also.
> I have added the phase called
> PROGRESS_ANALYZE_PHASE_COLLECT_INH_SAMPLE_ROWS.
> I have also updated the documentation.
>
>
> Thanks for the updated patch. I will check it.
>
> The ANALYZE command takes long time in computing statistics
> phase.So I think we can add some column or phase so that user can
> easily understand the progress.
> How about adding new column like "num_rows_processed" will compute
> the statistics of specified column?
>
>
> I prefer a column with rows processed instead of a phase.
> Because we already set the phase of compute stats and showing
> the progress there will number of rows processed will be good.
>
> How about separate the computing "inheritance statistics" phase
> from the computing regular "single table" statistics.
> Comment?
>
>
> Yes, this will be good to show both that states of inheritance of
> sampled rows and
> compute inheritance stats, so that it will be clearly visible to the
> user the current
> status.
I have updated the patch.
I have added column "num_rows_processed" and phase "computing inherited
statistics" in the view.

=# \d+ pg_stat_progress_analyze
View "pg_catalog.pg_stat_progress_analyze"
Column | Type | Collation | Nullable | Default |
Storage | Description
------------------------+---------+-----------+----------+---------+----------+-------------
pid | integer | | | |
plain |
datid | oid | | | |
plain |
datname | name | | | |
plain |
relid | oid | | | |
plain |
phase | text | | | |
extended |
num_target_sample_rows | bigint | | | |
plain |
num_rows_sampled | bigint | | | |
plain |
num_rows_processed | bigint | | | |
plain |
View definition:
SELECT s.pid,
s.datid,
d.datname,
s.relid,
CASE s.param1
WHEN 0 THEN 'initializing'::text
WHEN 1 THEN 'collecting sample rows'::text
WHEN 2 THEN 'collecting inherited sample rows'::text
WHEN 3 THEN 'computing statistics'::text
WHEN 4 THEN 'computing inherited statistics'::text
ELSE NULL::text
END AS phase,
s.param2 AS num_target_sample_rows,
s.param3 AS num_rows_sampled,
s.param4 AS num_rows_processed
FROM pg_stat_get_progress_info('ANALYZE'::text) s(pid, datid, relid,
param1, param2, param3, param4, param5, param6, param7, param8, param9,
param10)
LEFT JOIN pg_database d ON s.datid = d.oid;

Regards,
Vinayak Pokale
NTT Open Source Software Center

Attachment Content-Type Size
pg_stat_progress_analyze_v6.patch binary/octet-stream 23.7 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Rajkumar Raghuwanshi 2017-03-24 07:41:54 Re: Parallel Append implementation
Previous Message Ashutosh Bapat 2017-03-24 07:14:03 Re: [Proposal] Make the optimiser aware of partitions ordering