Re: [PROPOSAL] VACUUM Progress Checker.

From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Rahila Syed <rahilasyed90(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PROPOSAL] VACUUM Progress Checker.
Date: 2015-06-30 07:52:26
Message-ID: CAFj8pRAYFqY5UBye94cwvsca9Z-rmn9LnaNAQ4mjK4mXgTD2jg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi

2015-06-30 9:37 GMT+02:00 Rahila Syed <rahilasyed90(at)gmail(dot)com>:

> Hello Hackers,
>
> Following is a proposal for feature to calculate VACUUM progress.
>

interesting idea - I like to see it integrated to core.

>
> Use Case : Measuring progress of long running VACUUMs to help DBAs make
> informed decision
> whether to continue running VACUUM or abort it.
>
> Design:
>
> A shared preload library to store progress information from different
> backends running VACUUM, calculate remaining time for each and display
> progress in the
> in the form a view.
>

probably similar idea can be used for REINDEX, CREATE INDEX, COPY TO
statements

I though about the possibilities of progress visualization - and one
possibility is one or two special column in pg_stat_activity table - this
info can be interesting for VACUUM started by autovacuum too.

Regards

Pavel

>
> VACUUM needs to be instrumented with a hook to collect progress
> information (pages vacuumed/scanned) periodically.
>
> The patch attached implements a new hook to store vacuumed_pages and
> scanned_pages count at the end of each page scanned by VACUUM.
>
> This information is stored in a shared memory structure.
>
> In addition to measuring progress this function using hook also calculates
> remaining time for VACUUM.
>
>
>
> The frequency of collecting progress information can be reduced by
> appending delays in between hook function calls.
>
> Also, a GUC parameter
>
> log_vacuum_min_duration can be used.
>
> This will cause VACUUM progress to be calculated only if VACUUM runs more
> than specified milliseconds.
>
> A value of zero calculates VACUUM progress for each page processed. -1
> disables logging.
>
>
> Progress calculation :
>
>
> percent_complete = scanned_pages * 100 / total_pages_to_be_scanned;
>
> remaining_time = elapsed_time * (total_pages_to_be_scanned -
> scanned_pages) / scanned_pages;
>
>
> Shared memory struct:
>
> typedef struct PgStat_VacuumStats
>
> {
>
> Oid databaseoid;
>
> Oid tableoid;
>
> Int32 vacuumed_pages;
>
> Int32 total_pages;
>
> Int32 scanned_pages;
>
> double elapsed_time;
>
> double remaining_time;
>
> } PgStat_VacuumStats[max_connections];
>
>
>
> Reporting :
>
> A view named 'pg_maintenance_progress' can be created using the values in
> the struct above.
>
> pg_stat_maintenance can be called from any other backend and will display
> progress of
>
> each running VACUUM.
>
>
> Other uses of hook in VACUUM:
>
>
> Cost of VACUUM in terms of pages hit , missed and dirtied same as
> autovacuum can be collected using this hook.
>
> Autovacuum does it at the end of VACUUM for each table. It can be done
> while VACUUM on a table is in progress.
> This can be helpful to track manual VACUUMs also not just autovacuum.
>
> Read/Write(I/O) rates can be computed on the lines of autovacuum.
> Read rate patterns can be used to help tuning future vacuum on the
> table(like shared buffers tuning)
> Other resource usages can also be collected using progress checker hook.
>
>
> Attached patch is POC patch of progress calculation for a single backend.
>
> Also attached is a brief snapshot of the output log.
>
>
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>
>

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Thomas Munro 2015-06-30 07:59:07 Re: [PROPOSAL] VACUUM Progress Checker.
Previous Message Rahila Syed 2015-06-30 07:37:57 [PROPOSAL] VACUUM Progress Checker.