Re: [PROPOSAL] VACUUM Progress Checker.

From: Amit Langote <amitlangote09(at)gmail(dot)com>
To: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, pokurev(at)pm(dot)nttdata(dot)co(dot)jp, Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, bannos(at)nttdata(dot)co(dot)jp
Subject: Re: [PROPOSAL] VACUUM Progress Checker.
Date: 2016-03-12 12:49:23
Message-ID: CA+HiwqGMrT0DeJ6Mduw4qOAnZnATQLLrtsEmYXjc1GosHzQvQw@mail.gmail.com
Lists: pgsql-hackers

On Fri, Mar 11, 2016 at 2:31 PM, Amit Langote
<Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp> wrote:
> On 2016/03/11 13:16, Robert Haas wrote:
>> On Thu, Mar 10, 2016 at 9:04 PM, Amit Langote
>> <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp> wrote:
>>> So, from what I understand here, we should not put total count of index
>>> pages into st_progress_param; rather, have the client (reading
>>> pg_stat_progress_vacuum) derive it using pg_indexes_size() (?), as and
>>> when necessary. However, only server is able to tell the current position
>>> within an index vacuuming round (or how many pages into a given index
>>> vacuuming round), so report that using some not-yet-existent mechanism.
>>
>> Isn't that mechanism what you are trying to create in 0003?
>
> Right, 0003 should hopefully become that mechanism.

About 0003:

Earlier, it was trying to report the vacuumed index block count using
the lazy_tid_reaped() callback, for which I had added an index_blkno
argument to IndexBulkDeleteCallback. It turns out that's not a good
place to do what we are trying to do. This callback is called for
every heap pointer in an index. Not all index pages contain heap
pointers, which means the existing callback does not allow counting
all the index blocks that the AM would read to finish a given index
vacuum run.
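
To illustrate the undercount, here is a minimal standalone sketch. It is
a toy model, not PostgreSQL code: ToyIndexPage and
pages_seen_via_tid_callback() are hypothetical names made up for this
example. It counts only the pages that a per-heap-pointer callback would
ever get to see.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model (hypothetical): an index page holding some number of heap
 * pointers. Pages without heap pointers hold zero. */
typedef struct ToyIndexPage
{
    int n_heap_pointers;
} ToyIndexPage;

/* Count the pages that are observable through a callback that fires
 * once per heap pointer. A page with no heap pointers never triggers
 * the callback, so it is never counted. */
size_t
pages_seen_via_tid_callback(const ToyIndexPage *pages, size_t npages)
{
    size_t seen = 0;

    for (size_t blkno = 0; blkno < npages; blkno++)
    {
        bool counted = false;

        for (int i = 0; i < pages[blkno].n_heap_pointers; i++)
        {
            /* The per-heap-pointer callback would fire here. */
            if (!counted)
            {
                seen++;
                counted = true;
            }
        }
    }
    return seen;
}
```

With pages holding 0, 3, 0, 2 and 5 heap pointers, the AM reads five
pages but only three are ever visible to the callback.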

Instead, the attached patch adds an IndexBulkDeleteProgressCallback,
which AMs should call for every block that's read (say, right before a
call to ReadBufferExtended) as part of a given vacuum run. The
callback, with the help of some bookkeeping state, can count each block
and report it to the pgstat_progress API. Now, I am not sure whether all
AMs read blocks 1..N for every vacuum, or whether some blocks may be
read more than once in a single vacuum, etc. IOW, some AMs' processing
may be non-linear, and counting blocks 1..N (where N is the reported
total index blocks) may not be possible. However, this is the best I
could think of for doing what we are trying to do here. Maybe index AM
experts can chime in on that.
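
A minimal standalone sketch of the per-block callback idea follows. It
is not the code from the attached patch: the callback signature, the
ProgressState struct and the toy_am_* functions are assumptions made
for illustration, and the real pgstat_progress call is replaced by a
plain field update. The second toy AM shows the non-linear case, where
the count overshoots the reported total.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t BlockNumber;   /* stand-in for PostgreSQL's BlockNumber */

/* Assumed shape of the callback: called once per block about to be read. */
typedef void (*IndexBulkDeleteProgressCallback) (void *state);

/* Hypothetical bookkeeping state owned by the vacuum code. */
typedef struct ProgressState
{
    BlockNumber total_blocks;   /* reported total index blocks (N) */
    BlockNumber blocks_done;    /* callback invocations so far */
    BlockNumber last_reported;  /* stand-in for the value sent to pgstat */
} ProgressState;

/* Count one block and "report" it. */
void
count_index_block(void *state)
{
    ProgressState *ps = state;

    assert(ps != NULL);
    ps->blocks_done++;
    ps->last_reported = ps->blocks_done;    /* real code would report here */
}

/* Toy AM that reads each block exactly once, in order. */
void
toy_am_bulkdelete(BlockNumber nblocks,
                  IndexBulkDeleteProgressCallback cb, void *cb_state)
{
    for (BlockNumber blkno = 0; blkno < nblocks; blkno++)
    {
        if (cb)
            cb(cb_state);
        /* a call like ReadBufferExtended() would follow here */
    }
}

/* Toy AM that re-reads every fourth block, i.e. non-linear processing. */
void
toy_am_bulkdelete_revisit(BlockNumber nblocks,
                          IndexBulkDeleteProgressCallback cb, void *cb_state)
{
    for (BlockNumber blkno = 0; blkno < nblocks; blkno++)
    {
        if (cb)
            cb(cb_state);
        if (cb && blkno % 4 == 0)
            cb(cb_state);       /* second read of the same block */
    }
}
```

With an 8-block index, the linear AM ends with blocks_done equal to
total_blocks, while the revisiting AM ends with 10, so a naive
"done / total" display would exceed 100% unless it is clamped or the AM
reports its position some other way.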

Thoughts?

Thanks,
Amit

Attachment Content-Type Size
0001-WIP-Implement-progress-reporting-for-VACUUM-command-v11.patch application/octet-stream 32.8 KB
