Re: [PROPOSAL] VACUUM Progress Checker.

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Amit Langote <amitlangote09(at)gmail(dot)com>
Cc: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>, pokurev(at)pm(dot)nttdata(dot)co(dot)jp, Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, bannos(at)nttdata(dot)co(dot)jp
Subject: Re: [PROPOSAL] VACUUM Progress Checker.
Date: 2016-03-14 18:41:27
Message-ID: CA+TgmoahtCo+Caj2n9=fOgiZqngfU+rGUSawiNdSGvTTbEXS=g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sat, Mar 12, 2016 at 7:49 AM, Amit Langote <amitlangote09(at)gmail(dot)com> wrote:
> On Fri, Mar 11, 2016 at 2:31 PM, Amit Langote
> <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp> wrote:
>> On 2016/03/11 13:16, Robert Haas wrote:
>>> On Thu, Mar 10, 2016 at 9:04 PM, Amit Langote
>>> <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp> wrote:
>>>> So, from what I understand here, we should not put total count of index
>>>> pages into st_progress_param; rather, have the client (reading
>>>> pg_stat_progress_vacuum) derive it using pg_indexes_size() (?), as and
>>>> when necessary. However, only server is able to tell the current position
>>>> within an index vacuuming round (or how many pages into a given index
>>>> vacuuming round), so report that using some not-yet-existent mechanism.
>>>
>>> Isn't that mechanism what you are trying to create in 0003?
>>
>> Right, 0003 should hopefully become that mechanism.
>
> About 0003:
>
> Earlier, it was trying to report vacuumed index block count using
> lazy_tid_reaped() callback for which I had added a index_blkno
> argument to IndexBulkDeleteCallback. Turns out it's not such a good
> place to do what we are trying to do. This callback is called for
> every heap pointer in an index. Not all index pages contain heap
> pointers, which means the existing callback does not allow to count
> all the index blocks that AM would read to finish a given index vacuum
> run.
>
> Instead, the attached patch adds a IndexBulkDeleteProgressCallback
> which AMs should call for every block that's read (say, right before a
> call to ReadBufferExtended) as part of a given vacuum run. The
> callback with help of some bookkeeping state can count each block and
> report to pgstat_progress API. Now, I am not sure if all AMs read 1..N
> blocks for every vacuum or if it's possible that some blocks are read
> more than once in single vacuum, etc. IOW, some AM's processing may
> be non-linear and counting blocks 1..N (where N is reported total
> index blocks) may not be possible. However, this is the best I could
> think of as doing what we are trying to do here. Maybe index AM
> experts can chime in on that.
>
> Thoughts?

Well, I think you need to study the index AMs and figure this out.

But I think for starters you should write a patch that reports the following:

1. phase
2. number of heap blocks scanned
3. number of heap blocks vacuumed
4. number of completed index vac cycles
5. number of dead tuples collected since the last index vac cycle
6. number of dead tuples that we can store before needing to perform
an index vac cycle

All of that should be pretty straightforward, and then we'd have
something we can ship. We can add the detailed index reporting later,
when we get to it, perhaps for 9.7.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2016-03-14 18:46:26 Re: [PATCH] Obsolete wording in PL/Perl comment
Previous Message Thomas Munro 2016-03-14 18:38:39 Re: Proposal: "Causal reads" mode for load balancing reads without stale data