Re: monitoring CREATE INDEX [CONCURRENTLY]

From: Rahila Syed <rahilasyed90(at)gmail(dot)com>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Pg Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: monitoring CREATE INDEX [CONCURRENTLY]
Date: 2019-01-18 14:07:19
Message-ID: CAH2L28v_rWhBxnOM2di30=7GSmx3RSDHWfS-aa7hXYvfesrZCw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi Alvaro,

The WIP patch needs a rebase. Please see few in-line comments.

On Fri, Dec 21, 2018 at 3:30 AM Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
wrote:

> Monitoring progress of CREATE INDEX [CONCURRENTLY] is sure to be welcome,
> so here's a proposal.
>
> There are three distinct interesting cases. One is straight CREATE
> INDEX of a standalone table; then we have CREATE INDEX CONCURRENTLY;
> finally, CREATE INDEX on a partitioned table. Note that there's no
> CONCURRENTLY for partitioned tables.
>
> A non-concurrent build is a very straightforward: we call create_index,
> which does index_build, done. See below for how to report for
> index_build, which is the interesting part. I propose not to report
> anything else than that for non-concurrent build. There's some
> preparatory work that's identical than for CIC (see below). Like
> VACUUM, it seems a bit pointless to report an initial phase that's
> almost immediate, so I propose we just don't report anything until the
> actual index building starts.

Aren't we reporting this initial preparatory work in the form of
'initializing' phase that you
have in current patch? IIUC, there are no metrics but the name of the phase.

CREATE INDEX CONCURRENTLY does these things first, which we would not
> report (this is just like VACUUM, which only starts reporting once it
> starts scanning blocks):
>
> a. lock rel. No metrics to report.
> b. other prep; includes lots of catalog access. Unlikely to lock, but
> not impossible. No metrics to report.
> c. create_index. CIC skips index_build here, so there's no reason to
> report it either.
>
> We would start reporting at this point, with these phases:
>
> 1. WaitForLockers 1. Report how many xacts do we need to wait for, how
> many are done.
> 2. index_build. See below.
> 3. WaitForLockers 2. Report how many xacts do we need to wait for, how
> many are done.
> 4. validate_index. Scans the whole rel again. Report number of blocks
> scanned.
> 5. wait for virtual XIDs. Like WaitForLockers: report how many xacts we
> need to wait for, how many are done.
>
> We're done.
>
> (Alternatively, we could have an initial "prep" phase for a/b/c for the
> concurrent case and a/b for non-concurrent. I'm just not sure it's
> useful.)

> index_build
> -----------
>
> The actual index building is an AM-specific undertaking, and we report
> its progress separately from the AM-agnostic code. That is, each AM has
> freedom to define its own list of phases and counters, separate from the
> generic code. This avoids the need to provide a new AM method or invoke
> callbacks. So when you see that CREATE_INDEX_PHASE is either "index
> build" you'll have a separate BTREE_CREATE_PHASE value set to either
> "scanning heap" or "sorting" or "building upper layers"; equivalently
> for other AMs.
>
> OK.
I think the main phases in which index_build for most AMs can be divided is
as follows:
1. Scanning heap tuples for building index which is common for all AMs
2. Forming index tuples which is AM-specific
3. Writing tuples into the index which is AM-specific.
Out of these, metrics for phase 1 can be heap_tuples_scanned /
total_heap_tuples and
for phase 3, it can be index_tuples_computed/ total_index_tuples.
I am not sure about metrics for phase 2 especially for Btree which involves
reporting progress of sorting.

Partitioned indexes
> -------------------
>
> For partitioned indexes, we only have the index build phase, but we
> repeat it for each partition. In addition to the index_build metrics
> described above, we should report how many partitions we need to handle
> in total and how many partitions are already done. (I'm avoiding
> getting in the trouble of reporting *which* partition we're currently
> handling and have already handled.)
>
> Thoughts?
>
> +CREATE VIEW pg_stat_progress_create_index AS
+ SELECT
+ s.pid AS pid, S.datid AS datid, D.datname AS datname,
+ S.relid AS relid,
+ CASE s.param1 WHEN 0 THEN 'initializing'
+ WHEN 1 THEN 'waiting for lockers 1'
+ WHEN 2 THEN 'building index'
+ WHEN 3 THEN 'waiting for lockers 2'
+ WHEN 4 THEN 'validating index'
+ WHEN 5 THEN 'waiting for lockers 3'
+ END as phase,
+ S.param2 AS procs_to_wait_for,
+ S.param3 AS procs_waited_for,
+ S.param4 AS partitions_to_build,
+ S.param5 AS partitions_built
+ FROM pg_stat_get_progress_info('CREATE INDEX') AS S
+ LEFT JOIN pg_database D ON S.datid = D.oid;
+

1. In the above code, I think it will be useful if we report phases as
'Initializing phase 1 of n'
'Waiting for lockers phase 2 of n' etc. i.e reporting total number of
phases as well.

+ holders = lappend(holders,
+ GetLockConflicts(locktag,
lockmode, &count));
+ total += count;
2. IIUC, the above code in WaitForLockersMultiple can be written under
if(progress) condition like rest of the progress checking code in that
function
and pass NULL for count otherwise.

3. Will it be useful to report pid's of the backend's
for the transactions which CREATE INDEX concurrently is waiting for?
I think it can be used to debug long running transactions.

4. Should we have additional statistics update phase before
index_update_stats
like PROGRESS_VACUUM_PHASE_FINAL_CLEANUP?

5. I think it may be useful to report number of parallel workers requested
and number of workers
actually running index build in case of btree.

6. Also, this can be reported as an additional validation phase for
exclusion constraint
in index_build function as it involves scanning all live tuples of heap
relation.

/*
* If it's for an exclusion constraint, make a second pass over the
heap
* to verify that the constraint is satisfied. We must not do this
until
* the index is fully valid. (Broken HOT chains shouldn't matter,
though;
* see comments for IndexCheckExclusion.)
*/
if (indexInfo->ii_ExclusionOps != NULL)
IndexCheckExclusion(heapRelation, indexRelation, indexInfo);
*/

Thank you,
Rahila Syed

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message David G. Johnston 2019-01-18 14:34:35 Re: pg_dump multi VALUES INSERT
Previous Message Adrien NAYRAT 2019-01-18 13:22:50 Re: Log a sample of transactions