Re: [PROPOSAL] VACUUM Progress Checker.

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: pokurev(at)pm(dot)nttdata(dot)co(dot)jp
Cc: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>, Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, bannos(at)nttdata(dot)co(dot)jp
Subject: Re: [PROPOSAL] VACUUM Progress Checker.
Date: 2016-03-04 22:11:16
Message-ID: CA+Tgmob5hAazOqcerW9Co50HCOajBf+gQjLqoExkTVSkYP7AxA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Feb 26, 2016 at 3:28 AM, <pokurev(at)pm(dot)nttdata(dot)co(dot)jp> wrote:
> Thank you for your comments.
> Please find attached patch addressing following comments.
>
>>As I might have written upthread, transferring the whole string
>>as a progress message is useless at least in this scenario. Since
>>they are a set of fixed messages, each of them can be represented
>>by an identifier, an integer number. I don't see a reason for
>>sending the whole of a string beyond a backend.
> Agreed. I used following macros.
> #define VACUUM_PHASE_SCAN_HEAP 1
> #define VACUUM_PHASE_VACUUM_INDEX_HEAP 2
>
>>I guess num_index_scans could better be reported after all the indexes are
>>done, that is, after the for loop ends.
> Agreed. I have corrected it.
>
>> CREATE VIEW pg_stat_vacuum_progress AS
>> SELECT S.s[1] as pid,
>> S.s[2] as relid,
>> CASE S.s[3]
>> WHEN 1 THEN 'Scanning Heap'
>> WHEN 2 THEN 'Vacuuming Index and Heap'
>> ELSE 'Unknown phase'
>> END,
>> ....
>> FROM pg_stat_get_command_progress(PROGRESS_COMMAND_VACUUM) as S;
>>
>> # The name of the function could be other than *_command_progress.
> The name of function is updated as pg_stat_get_progress_info() and also updated the function.
> Updated the pg_stat_vacuum_progress view as suggested.

I'm positive I've said this at least once before while reviewing this
patch, and I think more than once: we should be trying to build a
general progress-reporting facility here with vacuum as the first
user. Therefore, for example, pg_stat_get_progress_info's output
columns should have generic names, not names specific to VACUUM.
pg_stat_vacuum_progress can alias them to a vacuum-specific name. See
for example the relationship between pg_stats and pg_statistic.

I think VACUUM should have three phases, not two. lazy_vacuum_index()
and lazy_vacuum_heap() are lumped together right now, but I think they
shouldn't be.

Please create named constants for the first argument to
pgstat_report_progress_update_counter(), maybe with names like
PROGRESS_VACUUM_WHATEVER.

+ /* Update current block number of the relation */
+ pgstat_report_progress_update_counter(2, blkno + 1);

Why + 1?

I thought we had a plan to update the counter of scanned index pages
after each index page was vacuumed by the AM. Doing it only after
vacuuming the entire index is much less granular and generally less
useful. See http://www.postgresql.org/message-id/56500356.4070101@BlueTreble.com

+ if (blkno == nblocks - 1 &&
vacrelstats->num_dead_tuples == 0 && nindexes != 0
+ && vacrelstats->num_index_scans == 0)
+ total_index_pages = 0;

I'm not sure what this is trying to do, perhaps because there is no
comment explaining it. Whatever the intent, I suspect that such a
complex test is likely to be fragile. Perhaps there is a better way?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2016-03-04 22:21:57 Re: Equivalent of --enable-tap-tests in MSVC scripts
Previous Message Julien Rouhaud 2016-03-04 21:52:03 Re: Publish autovacuum informations