Re: [PROPOSAL] VACUUM Progress Checker.

From: "Syed, Rahila" <Rahila(dot)Syed(at)nttdata(dot)com>
To: Thom Brown <thom(at)linux(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PROPOSAL] VACUUM Progress Checker.
Date: 2015-06-30 13:12:43
Message-ID: C3C878A2070C994B9AE61077D46C3846881527E1@MAIL703.KDS.KEANE.COM
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hello,

>There's no need to add those curly braces, or to subsequent if blocks
Yes, those are added by mistake.

>Also, is this patch taking the visibility map into account for its calculations?
Yes, it subtracts skippable/all-visible pages from total pages to be scanned.
For each page processed by lazy_scan_heap, if number of all visible pages ahead exceeds the threshold, it is subtracted from
the ‘total pages to be scanned’ count.

The all visible pages are accounted for incrementally during the execution of VACUUM and not before starting the process.

Thank you,
Rahila Syed

From: pgsql-hackers-owner(at)postgresql(dot)org [mailto:pgsql-hackers-owner(at)postgresql(dot)org] On Behalf Of Thom Brown
Sent: Tuesday, June 30, 2015 2:20 PM
To: Rahila Syed
Cc: PostgreSQL-development
Subject: Re: [HACKERS] [PROPOSAL] VACUUM Progress Checker.

On 30 June 2015 at 08:37, Rahila Syed <rahilasyed90(at)gmail(dot)com<mailto:rahilasyed90(at)gmail(dot)com>> wrote:
Hello Hackers,

Following is a proposal for feature to calculate VACUUM progress.

Use Case : Measuring progress of long running VACUUMs to help DBAs make informed decision
whether to continue running VACUUM or abort it.

Design:

A shared preload library to store progress information from different backends running VACUUM, calculate remaining time for each and display progress in the
in the form a view.

VACUUM needs to be instrumented with a hook to collect progress information (pages vacuumed/scanned) periodically.
The patch attached implements a new hook to store vacuumed_pages and scanned_pages count at the end of each page scanned by VACUUM.
This information is stored in a shared memory structure.
In addition to measuring progress this function using hook also calculates remaining time for VACUUM.

The frequency of collecting progress information can be reduced by appending delays in between hook function calls.
Also, a GUC parameter
log_vacuum_min_duration can be used.
This will cause VACUUM progress to be calculated only if VACUUM runs more than specified milliseconds.
A value of zero calculates VACUUM progress for each page processed. -1 disables logging.

Progress calculation :

percent_complete = scanned_pages * 100 / total_pages_to_be_scanned;

remaining_time = elapsed_time * (total_pages_to_be_scanned - scanned_pages) / scanned_pages;

Shared memory struct:

typedef struct PgStat_VacuumStats

{

Oid databaseoid;

Oid tableoid;

Int32 vacuumed_pages;

Int32 total_pages;

Int32 scanned_pages;

double elapsed_time;

double remaining_time;

} PgStat_VacuumStats[max_connections];

Reporting :
A view named 'pg_maintenance_progress' can be created using the values in the struct above.
pg_stat_maintenance can be called from any other backend and will display progress of
each running VACUUM.

Other uses of hook in VACUUM:

Cost of VACUUM in terms of pages hit , missed and dirtied same as autovacuum can be collected using this hook.
Autovacuum does it at the end of VACUUM for each table. It can be done while VACUUM on a table is in progress.
This can be helpful to track manual VACUUMs also not just autovacuum.

Read/Write(I/O) rates can be computed on the lines of autovacuum.
Read rate patterns can be used to help tuning future vacuum on the table(like shared buffers tuning)
Other resource usages can also be collected using progress checker hook.

Attached patch is POC patch of progress calculation for a single backend.
Also attached is a brief snapshot of the output log.

@@ -559,7 +567,9 @@ lazy_scan_heap(Relation onerel, LVRelStats *vacrelstats,
* following blocks.
*/
if (next_not_all_visible_block - blkno > SKIP_PAGES_THRESHOLD)
+ {
skipping_all_visible_blocks = true;
+ }
There's no need to add those curly braces, or to subsequent if blocks.
Also, is this patch taking the visibility map into account for its calculations?

--
Thom

______________________________________________________________________
Disclaimer: This email and any attachments are sent in strictest confidence
for the sole use of the addressee and may contain legally privileged,
confidential, and proprietary data. If you are not the intended recipient,
please advise the sender by replying promptly to this email and then delete
and destroy this email and any attachments without any further use, copying
or forwarding.

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2015-06-30 13:30:24 Re: RFC: replace pg_stat_activity.waiting with something more descriptive
Previous Message Pavel Stehule 2015-06-30 13:04:26 Re: Exposing PG_VERSION_NUM in pg_config