Re: [PROPOSAL] VACUUM Progress Checker.

From: Rahila Syed <rahilasyed90(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Vinayak Pokale <vinpokale(at)gmail(dot)com>, Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PROPOSAL] VACUUM Progress Checker.
Date: 2016-01-28 10:38:16
Message-ID: CAH2L28sJyTG+5UmxFRxsdcQGQUznrQASk0eryHGNrQTSKXi=TA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

>+ if(!scan_all)
>+ scanned_heap_pages = scanned_heap_pages +
>next_not_all_visible_block;

>I don't want to be too much of a stickler for details here, but it
>seems to me that this is an outright lie.

Initially the scanned_heap_pages were meant to report just the scanned
pages and skipped pages were not added to the count. Instead the skipped
pages were deduced from number of total heap pages to be scanned to make
the number of scanned pages eventually add up to total heap pages. As per
comments received later total heap pages were kept constant and skipped
pages count was added to scanned pages to make the count add up to total
heap pages at the end of scan. That said, as suggested, scanned_heap_pages
should be renamed to current_heap_page to report current blkno in
lazy_scan_heap loop which will add up to total heap pages(nblocks) at the
end of scan. And scanned_heap_pages can be reported as a separate number
which wont contain skipped pages.

>+ /*
>+ * Reporting vacuum progress to statistics collector
>+ */

>This patch doesn't report anything to the statistics collector, nor should
it.
Yes. This was incorrectly added initially by referring to similar
pgstat_report interface functions.

Thank you,
Rahila Syed

On Thu, Jan 28, 2016 at 2:27 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:

> On Tue, Jan 26, 2016 at 11:37 PM, Vinayak Pokale <vinpokale(at)gmail(dot)com>
> wrote:
> > Hi,
> >
> > Please find attached updated patch with an updated interface.
>
> Well, this isn't right. You've got this sort of thing:
>
> + scanned_index_pages += RelationGetNumberOfBlocks(Irel[i]);
> + /* Report progress to the statistics collector */
> + pgstat_report_progress_update_message(0, progress_message);
> + pgstat_report_progress_update_counter(1, scanned_heap_pages);
> + pgstat_report_progress_update_counter(3, scanned_index_pages);
> + pgstat_report_progress_update_counter(4,
> vacrelstats->num_index_scans + 1);
>
> The point of having pgstat_report_progress_update_counter() is so that
> you can efficiently update a single counter without having to update
> everything, when only one counter has changed. But here you are
> calling this function a whole bunch of times in a row, which
> completely misses the point - if you are updating all the counters,
> it's more efficient to use an interface that does them all at once
> instead of one at a time.
>
> But there's a second problem here, too, which is that I think you've
> got this code in the wrong place. The second point of the
> pgstat_report_progress_update_counter interface is that this should be
> cheap enough that we can do it every time the counter changes. That's
> not what you are doing here. You're updating the counters at various
> points in the code and just pushing new values for all of them
> regardless of which ones have changed. I think you should find a way
> that you can update the value immediately at the exact moment it
> changes. If that seems like too much of a performance hit we can talk
> about it, but I think the value of this feature will be greatly
> weakened if users can't count on it to be fully and continuously up to
> the moment. When something gets stuck, you want to know where it's
> stuck, not approximately kinda where it's stuck.
>
> + if(!scan_all)
> + scanned_heap_pages = scanned_heap_pages +
> next_not_all_visible_block;
>
> I don't want to be too much of a stickler for details here, but it
> seems to me that this is an outright lie. The number of scanned pages
> does not go up when we decide to skip some pages, because scanning and
> skipping aren't the same thing. If we're only going to report one
> number here, it needs to be called something like "current heap page",
> and then you can just report blkno at the top of each iteration of
> lazy_scan_heap's main loop. If you want to report the numbers of
> scanned and skipped pages separately that'd be OK too, but you can't
> call it the number of scanned pages and then actually report a value
> that is not that.
>
> + /*
> + * Reporting vacuum progress to statistics collector
> + */
>
> This patch doesn't report anything to the statistics collector, nor should
> it.
>
> Instead of making the SQL-visible function
> pg_stat_get_vacuum_progress(), I think it should be something more
> generic like pg_stat_get_command_progress(). Maybe VACUUM will be the
> only command that reports into that feature for right now, but I'd
> hope for us to change that pretty soon after we get the first patch
> committed.
>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bruce Momjian 2016-01-28 10:49:56 Re: Template for commit messages
Previous Message Tomas Vondra 2016-01-28 10:30:58 Re: Template for commit messages