Re: [PROPOSAL] VACUUM Progress Checker.

From: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>
To: Robert Haas <robertmhaas(at)gmail(dot)com>, Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
Cc: Amit Langote <amitlangote09(at)gmail(dot)com>, pokurev(at)pm(dot)nttdata(dot)co(dot)jp, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, bannos(at)nttdata(dot)co(dot)jp
Subject: Re: [PROPOSAL] VACUUM Progress Checker.
Date: 2016-03-08 08:02:24
Lists: pgsql-hackers

On 2016/03/07 23:48, Robert Haas wrote:
> On Sun, Mar 6, 2016 at 11:02 PM, Kyotaro HORIGUCHI
> <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp> wrote:
>> The 0001-P.. adds the following interface functions.
>> +extern void pgstat_progress_set_command(BackendCommandType cmdtype);
>> +extern void pgstat_progress_set_command_target(Oid objid);
>> +extern void pgstat_progress_update_param(int index, uint32 val);
>> +extern void pgstat_reset_local_progress(void);
>> +extern int pgstat_progress_get_num_param(BackendCommandType cmdtype);
>> I don't like treating the target object id differently from the other
>> parameters. It might not be needed at all or, conversely, two or more
>> might be needed. Although oids are not guaranteed to fit
>> uint32, we have already stored BlockNumber there.
> Well...
> There's not much point in deciding that the parameters are uint32,
> because we don't have that type at the SQL level.
> pgstat_progress_update_param() really ought to take either int32 or
> int64 as an argument, because that's what we can actually handle from
> SQL, and it seems pretty clear that int64 is better since otherwise we
> can't fit, among other things, a block number.
> Given that, I tend to think that treating the command target specially
> and passing that as an OID is reasonable. We're not going to be able
> to pass variable-sized arrays through this mechanism, ever, because
> our shared memory segment doesn't work like that. And it seems to me
> that nearly every command somebody might want to report progress on
> will touch, basically, one relation at a time. So I don't see the harm
> in hardcoding that idea into the facility.
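
To make the int64 point above concrete, here is a minimal sketch, not the
patch's code. The names progress_param, progress_update_param and
fits_in_int32 are hypothetical stand-ins; BlockNumber is typedef'd locally
to mirror PostgreSQL's 32-bit unsigned type, and the slot count of 10
mirrors param1..param10.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-in for PostgreSQL's BlockNumber (a 32-bit unsigned value). */
typedef uint32_t BlockNumber;

/* Mirrors the param1..param10 output columns; the macro name is made up. */
#define N_PROGRESS_PARAM 10

/* Hypothetical stand-in for the per-backend progress slots. */
static int64_t progress_param[N_PROGRESS_PARAM];

/* Sketch of an int64-valued update function (assumed shape, not the patch). */
static void
progress_update_param(int index, int64_t val)
{
    assert(index >= 0 && index < N_PROGRESS_PARAM);
    progress_param[index] = val;
}

/* Would this value survive being exposed as a 4-byte SQL integer? */
static int
fits_in_int32(int64_t val)
{
    return val >= INT32_MIN && val <= INT32_MAX;
}
```

Storing a block number near the top of the uint32 range (for example
0xFFFFFFFE) through progress_update_param() keeps its value intact, while
fits_in_int32() reports that the same value could not be shown as a 4-byte
integer column.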

Updated versions attached.

* Changed st_progress_param to int64, along with the argument of
pgstat_progress_update_param(). Likewise changed the param1..param10
output columns of pg_stat_get_progress_info() to bigint.

* Added back the Oid field st_command_target and the corresponding function
pgstat_progress_set_command_target().

* I attempted to implement a method to report index blocks done from
lazy_tid_reaped() albeit with some limitations. Patch 0003 is that
attempt. In summary, I modified the index bulk delete callback interface
to receive a BlockNumber argument index_blkno:

/* Typedef for callback function to determine if a tuple is bulk-deletable */
-typedef bool (*IndexBulkDeleteCallback) (ItemPointer itemptr, void *state);
+typedef bool (*IndexBulkDeleteCallback) (ItemPointer itemptr,
+                                         BlockNumber index_blkno,
+                                         void *state);

Then added 2 more fields to LVRelStats:

@@ -143,6 +143,8 @@ typedef struct LVRelStats
     int         num_index_scans;
     TransactionId latestRemovedXid;
     bool        lock_waiter_detected;
+    BlockNumber last_index_blkno;
+    BlockNumber index_blks_vacuumed;

Then in lazy_tid_reaped(), if the index block number received in the
index_blkno argument has changed from the previous call, increment the
count of index blocks processed and
pgstat_progress_update_param(index_blks_done). I wonder if we should reset
the saved block number and the count for every index vacuumed by
lazy_vacuum_index(). Right now, total_index_blks counts all indexes, whereas
counting blocks using the rough method mentioned above is sensible only
for one index at a time. Actually, the method has different kinds of
problems to deal with anyway. For example, with a btree index, one can
expect that the final count does not match total_index_blks obtained using
RelationGetNumberOfBlocks(). Moreover, each AM has its own idiosyncratic
way of traversing the index pages. I dared touch only the btree case, to
make it pass the current block number to the callback. It finishes with
index_blks_done << total_index_blks since I guess the callback is called
only on the leaf pages. Any ideas?

* I am also tempted to add num_dead_tuples and dead_tuples_vacuumed to give
more granularity to the 'vacuuming heap' phase, but did not in this version.
Should we?


Attachment Content-Type Size
0001-Provide-a-way-for-utility-commands-to-report-progres-v4.patch text/x-diff 13.0 KB
0002-Implement-progress-reporting-for-VACUUM-command-v4.patch text/x-diff 14.0 KB
0003-Add-a-block-number-argument-to-index-bulk-delete-cal-v4.patch text/x-diff 9.3 KB
