Re: [PROPOSAL] VACUUM Progress Checker.

From: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
To: "Syed, Rahila" <Rahila(dot)Syed(at)nttdata(dot)com>
Cc: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Thom Brown <thom(at)linux(dot)com>, "PostgreSQL-development (pgsql-hackers(at)postgresql(dot)org)" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PROPOSAL] VACUUM Progress Checker.
Date: 2015-10-02 06:38:42
Message-ID: CAHGQGwGXOUymHH5j5GR2BHs-CbXh+qCU0p2Wt4N--V_8G5gANg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Sep 25, 2015 at 2:03 AM, Fujii Masao <masao(dot)fujii(at)gmail(dot)com> wrote:
> On Wed, Sep 23, 2015 at 12:24 AM, Syed, Rahila <Rahila(dot)Syed(at)nttdata(dot)com> wrote:
>> Hello,
>>
>> Please find attached patch with bugs reported by Thom and Sawada-san solved.
>
> The regression test failed on my machine, so you need to update the
> regression test,
> I think.

Here are another review comments.

You removed some empty lines, for example, in vacuum.h.
Which seems useless to me.

+ uint32 progress_param[N_PROGRESS_PARAM];

Why did you use an array to store the progress information of VACUUM?
I think that it's better to use separate specific variables for them for
better code readability, for example, variables scanned_pages,
heap_total_pages, etc.

+ double progress_param_float[N_PROGRESS_PARAM];

Currently only progress_param_float[0] is used. So there is no need to
use an array here.

progress_param_float[0] saves the percetage of VACUUM progress.
But why do we need to save that information into shared memory?
We can just calculate the percentage whenever pg_stat_get_vacuum_progress()
is executed, instead. There seems to be no need to save that information.

+ char progress_message[PROGRESS_MESSAGE_LENGTH][N_PROGRESS_PARAM];

As Sawada pointed out, there is no user of this variable.

+#define PG_STAT_GET_PROGRESS_COLS 30

Why did you use 30?

+ FROM pg_stat_get_vacuum_progress(NULL) AS S;

You defined pg_stat_get_vacuum_progress() so that it accepts one argument.
But as far as I read the function, it doesn't use any argument at all.
I think that pg_stat_get_vacuum_progress() should be defined as a function
having no argument.

+ /* Report values for only those backends which are running VACUUM */
+ if(!beentry || (strncmp(beentry->st_activity,"VACUUM",6)
+ && strncmp(beentry->st_activity,"vacuum",6)))
+ continue;

This design looks bad to me. There is no guarantee that st_activity of
the backend running VACUUM displays "VACUUM" or "vacuum".
For example, st_activity of autovacuum worker displays "autovacuum ...".
So as Sawada reported, he could not find any entry for autovacuum in
pg_stat_vacuum_progress.

I think that you should add the flag or something which indicates
whether this backend is running VACUUM or not, into PgBackendStatus.
pg_stat_vacuum_progress should display the entries of only backends
with that flag set true. This design means that you need to set the flag
to true when starting VACUUM and reset at the end of VACUUM progressing.

Non-superuser cannot see some columns of the superuser's entry in
pg_stat_activity, for permission reason. We should treat
pg_stat_vacuum_progress in the same way? That is, non-superuser
should not be allowed to see the pg_stat_vacuum_progress entry
of superuser running VACUUM?

+ if(!scan_all)
+ {
+ total_heap_pages = total_heap_pages -
(next_not_all_visible_block - blkno);
+ total_pages = total_pages -
(next_not_all_visible_block - blkno);
+ }

This code may cause total_pages and total_heap_pages to decrease
while VACUUM is running. This sounds strange and confusing. I think
that total values basically should be fixed. And heap_scanned_pages
should increase, instead.

Regards,

--
Fujii Masao

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Langote 2015-10-02 07:14:42 Re: [PROPOSAL] VACUUM Progress Checker.
Previous Message Kouhei Kaigai 2015-10-02 05:04:44 Re: Foreign join pushdown vs EvalPlanQual