Re: [PROPOSAL] VACUUM Progress Checker.

From: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Amit Langote <amitlangote09(at)gmail(dot)com>, pokurev(at)pm(dot)nttdata(dot)co(dot)jp, Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, bannos(at)nttdata(dot)co(dot)jp
Subject: Re: [PROPOSAL] VACUUM Progress Checker.
Date: 2016-03-16 00:56:31
Message-ID: 56E8AF3F.6020508@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2016/03/16 2:33, Robert Haas wrote:
> On Tue, Mar 15, 2016 at 1:16 AM, Amit Langote
> <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp> wrote:
>> On 2016/03/15 3:41, Robert Haas wrote:
>>> Well, I think you need to study the index AMs and figure this out.
>>
>> OK. I tried to put calls to the callback in appropriate places, but
>> couldn't get the resulting progress numbers to look sane. So I ended up
>> concluding that any attempt to do so is futile unless I analyze each AM's
>> vacuum code carefully to be able to determine in advance the max bound on
>> the count of blocks that the callback will report. Anyway, as you
>> suggest, we can improve it later.
>
> I don't think there is any way to bound that, because new blocks can
> get added to the index concurrently, and we might end up needing to
> scan them. Reporting the number of blocks scanned can still be
> useful, though - any chance you can just implement that part of it?

Do you mean the changes I made to index bulk delete API for this purpose
in last few versions of this patch? With it, I have observed that
reported scanned blocks count (that is incremented for every
ReadBufferExtended() call across a index vacuum cycle using the new
callback) can be non-deterministic. Would there be an accompanying
index_blocks_total (pg_indexes_size()-based) in the view, as well?

>>> But I think for starters you should write a patch that reports the following:
>>>
>>> 1. phase
>>> 2. number of heap blocks scanned
>>> 3. number of heap blocks vacuumed
>>> 4. number of completed index vac cycles
>>> 5. number of dead tuples collected since the last index vac cycle
>>> 6. number of dead tuples that we can store before needing to perform
>>> an index vac cycle
>>>
>>> All of that should be pretty straightforward, and then we'd have
>>> something we can ship. We can add the detailed index reporting later,
>>> when we get to it, perhaps for 9.7.
>>
>> OK, I agree with this plan. Attached updated patch implements this.
>
> Sorta. Committed after renaming what you called heap blocks vacuumed
> back to heap blocks scanned, adding heap blocks vacuumed, removing the
> overall progress meter which I don't believe will be anything close to
> accurate, fixing some stylistic stuff, arranging to update multiple
> counters automatically where it could otherwise produce confusion,
> moving the new view near similar ones in the file, reformatting it to
> follow the style of the rest of the file, exposing the counter
> #defines via a header file in case extensions want to use them, and
> overhauling and substantially expanding the documentation.

Thanks a lot for committing this. The committed version is much better.
Some of the things therein should really have been part of the final
*submitted* patch; I will try to improve in that area in my submissions
henceforth.

Thanks,
Amit

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2016-03-16 01:08:29 Re: Parallel Aggregate
Previous Message David Rowley 2016-03-16 00:55:53 Re: Parallel Aggregate