Re: [PROPOSAL] VACUUM Progress Checker.

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>
Cc: Amit Langote <amitlangote09(at)gmail(dot)com>, pokurev(at)pm(dot)nttdata(dot)co(dot)jp, Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, bannos(at)nttdata(dot)co(dot)jp
Subject: Re: [PROPOSAL] VACUUM Progress Checker.
Date: 2016-03-15 17:33:34
Message-ID: CA+TgmoYb07M8MN1p5eHm9-7f3m8HxUK+y893_AqKDfngsLV4Jg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Mar 15, 2016 at 1:16 AM, Amit Langote
<Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp> wrote:
> On 2016/03/15 3:41, Robert Haas wrote:
>> On Sat, Mar 12, 2016 at 7:49 AM, Amit Langote <amitlangote09(at)gmail(dot)com> wrote:
>>> Instead, the attached patch adds a IndexBulkDeleteProgressCallback
>>> which AMs should call for every block that's read (say, right before a
>>> call to ReadBufferExtended) as part of a given vacuum run. The
>>> callback with help of some bookkeeping state can count each block and
>>> report to pgstat_progress API. Now, I am not sure if all AMs read 1..N
>>> blocks for every vacuum or if it's possible that some blocks are read
>>> more than once in single vacuum, etc. IOW, some AM's processing may
>>> be non-linear and counting blocks 1..N (where N is reported total
>>> index blocks) may not be possible. However, this is the best I could
>>> think of as doing what we are trying to do here. Maybe index AM
>>> experts can chime in on that.
>>>
>>> Thoughts?
>>
>> Well, I think you need to study the index AMs and figure this out.
>
> OK. I tried to put calls to the callback in appropriate places, but
> couldn't get the resulting progress numbers to look sane. So I ended up
> concluding that any attempt to do so is futile unless I analyze each AM's
> vacuum code carefully to be able to determine in advance the max bound on
> the count of blocks that the callback will report. Anyway, as you
> suggest, we can improve it later.

I don't think there is any way to bound that, because new blocks can
get added to the index concurrently, and we might end up needing to
scan them. Reporting the number of blocks scanned can still be
useful, though - any chance you can just implement that part of it?

>> But I think for starters you should write a patch that reports the following:
>>
>> 1. phase
>> 2. number of heap blocks scanned
>> 3. number of heap blocks vacuumed
>> 4. number of completed index vac cycles
>> 5. number of dead tuples collected since the last index vac cycle
>> 6. number of dead tuples that we can store before needing to perform
>> an index vac cycle
>>
>> All of that should be pretty straightforward, and then we'd have
>> something we can ship. We can add the detailed index reporting later,
>> when we get to it, perhaps for 9.7.
>
> OK, I agree with this plan. Attached updated patch implements this.

Sorta. Committed after renaming what you called heap blocks vacuumed
back to heap blocks scanned, adding heap blocks vacuumed, removing the
overall progress meter which I don't believe will be anything close to
accurate, fixing some stylistic stuff, arranging to update multiple
counters automatically where it could otherwise produce confusion,
moving the new view near similar ones in the file, reformatting it to
follow the style of the rest of the file, exposing the counter
#defines via a header file in case extensions want to use them, and
overhauling and substantially expanding the documentation.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Joshua D. Drake 2016-03-15 17:35:12 Re: Soliciting Feedback on Improving Server-Side Programming Documentation
Previous Message Tom Lane 2016-03-15 17:33:07 Re: Small patch: fix warnings during compilation on FreeBSD