Re: [PROPOSAL] VACUUM Progress Checker.

From: Rahila Syed <rahilasyed90(at)gmail(dot)com>
To: Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>
Cc: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Thom Brown <thom(at)linux(dot)com>, pokurev(at)pm(dot)nttdata(dot)co(dot)jp, "Robert Haas (robertmhaas(at)gmail(dot)com)" <robertmhaas(at)gmail(dot)com>
Subject: Re: [PROPOSAL] VACUUM Progress Checker.
Date: 2015-11-30 12:49:49
Message-ID: CAH2L28sqDDt-aoTpzp1ma8s=1WiqRW3QEobBcjJQ7e2RbKhcDA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hello,

Thank you for your comments.
Please find attached patch addressing following comments ,

>- duplicate_oids error in HEAD.
Check.

>- a compiler warning:
>pgstat.c:2898: warning: no previous prototype for
‘pgstat_reset_activityflag’
Check.

>One more change you could do is 's/activityflag/activity_flag/g',
Check.

>Type of the flag of vacuum activity.
The flag variable is an integer to incorporate more commands in future.

>Type of st_progress_param and so.
st_progress_param is also given a generic name to incorporate different
parameters reported from various commands.

>st_progress_param_float is currently totally useless.
Float parameter has currently been removed from the patch.

>Definition of progress_message.
>The definition of progress_message in lazy_scan_heap is "char
>[PROGRESS_MESSAGE_LENGTH][N_PROGRESS_PARAM]" which looks to be
>inversed.
Corrected.

>The current code subtracts the number of blocks when
>skipping_all_visible_blocks is set in two places. But I think
>it is enough to decrement when skipping.
In both the places, the pages are being skipped hence the total count was
decremented.

>He suggested to keep total_heap_pages fixed while adding number
>of skipped pages to that of scanned pages.
This has been done in the attached.

>snprintf(progress_message[0], PROGRESS_MESSAGE_LENGTH, "%s.%s",
> get_namespace_name(RelationGetNamespace(rel)),
> relname);
Check.

The previous implementation used to add total number of pages across all
indexes of a relation to total_index_pages in every scan of
indexes to account for total pages scanned. Thus, it was equal to number
of scans * total_index_pages.

In the attached patch, total_index_pages reflects total number of pages
across all indexes of a relation.
And the column to report passes through indexes (phase 2) has been added to
account for number of passes for index and heap vacuuming.
Number of scanned index pages is reset at the end of each pass.
This makes the reporting clearer.
The percent complete does not account for index pages. It just denotes
percentage of heap scanned.

>Spotted a potential oversight regarding report of scanned_pages. It
>seems pages that are skipped because of not getting a pin, being new,
>being empty could be left out of the progress equation.
Corrected.

>It's better to
>report that some other way, like use one of the strings to report a
>"phase" of processing that we're currently performing.
Has been included in the attached.

Some more comments need to be addressed which include name change of
activity flag, reporting only changed parameters to shared memory,
ACTIVITY_IS_VACUUM flag being set unnecessarily for ANALYZE and FULL
commands ,documentation for new view.
Also, finer grain reporting from indexes and heap truncate phase is yet to
be incorporated into the patch

Thank you,
Rahila Syed

Attachment Content-Type Size
Vacuum_progress_checker_v7.patch text/x-patch 18.3 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2015-11-30 13:36:34 Re: Proposal: Trigonometric functions in degrees
Previous Message Kyotaro HORIGUCHI 2015-11-30 12:47:34 [PoC] Asynchronous execution again (which is not parallel)