Re: monitoring CREATE INDEX [CONCURRENTLY]

From: Rahila Syed <rahila(dot)syed(at)2ndquadrant(dot)com>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: monitoring CREATE INDEX [CONCURRENTLY]
Date: 2019-03-11 09:02:27
Message-ID: CAOajBXSCXaGEk21oiZiOk=76jMXfhVdnTwJ=Xp5-N8o++vZTqw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi Alvaro,

On Tue, 5 Mar 2019 at 08:32, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
wrote:

> Hi Rahila,
>
> Thanks for looking.
>
> On 2019-Mar-04, Rahila wrote:
>
> > 1. Thank you for incorporating review comments.
> > Can you please rebase the latest 0001-Report-progress-of-
> > CREATE-INDEX-operations.patch on master? Currently it does not apply on
> > 754b90f657bd54b482524b73726dae4a9165031c
>
> Hmm, rebased to current master. Didn't conflict at all when rebasing,
> so it's strange that it didn't apply.

Thanks for updating the patch. Sorry, I think it wasn't that the patch
needed rebasing but
I failed to apply it correctly last time. I can apply it now.

> +extern char *btbuildphasename(int64 phasenum);

1. I think int64 is too large a datatype for phasenum.
Also int32 is used for phasenum in pg_indexam_progress_phasename().
Can we have it as int8?

2.

> if ((previous_blkno == InvalidBlockNumber) ||

+ (scan->rs_cblock != previous_blkno))

+ {

+
> pgstat_progress_update_param(PROGRESS_SCAN_BLOCKS_DONE,

+
> scan->rs_cblock);

+ previous_blkno = scan->rs_cblock;

+ }

. In validate_index_heapscan, we dont calculate blocks_done similar to
IndexBuildHeapScan i.e
block_done += scan->rs_cblock - previous_blkno which IMO is more accurate.
Reporting scan->rs_cblock as blocks_done might give slightly inaccurate
results as we are
still processing that block.

3. There is no if(progress) check in validate_index()/
validate_index_heapscan() code. Wont it be a problem if it is called from
other
index methods which dont support reporting progress at the moment?

4. Just to clarify my understanding can you please see below comment

Quoting your following comment in cluster command progress monitor thread
while referring to progress reporting from IndexBuildHeapScan,

"One, err, small issue with that idea is that we need the param numbers
not to conflict for any "progress update providers" that are to be used
simultaneously by any command."

Does that mean that we can't have any other INDEX progress monitoring, use
PROGRESS_SCAN_BLOCKS_TOTAL and PROGRESS_SCAN_BLOCKS_DONE
parameter numbers to report anything but the metrics they report now.

5.

> 15:56:44.682 | building index (3 of 8): initializing (1/5) |
> 442478 | 442399 | 0 | 0 | 0 |
> 0

15:56:44.694 | building index (3 of 8): initializing (1/5) |
> 442478 | 442399 | 0 | 0 | 0 |
> 0

15:56:44.705 | building index (3 of 8): sorting tuples, spool 1 (3/5) |
> 442478 | 442399 | 100000000 | 0 | 0 |
> 0

15:56:44.716 | building index (3 of 8): sorting tuples, spool 1 (3/5) |
> 442478 | 442399 | 100000000 | 0 | 0 |
> 0

I wonder how is the phase 'building index(3 of 8): initializing(1/5)' when
the blocks_done count is increasing. Shouldn't it have
changed to reflect PROGRESS_BTREE_PHASE_INDEXBUILD_HEAPSCAN as building
index(3 of 8): table scan(2/5) ?
Although I think it has been rectified in the latest patch as I now get
'table scan' phase in output as I do CREATE INDEX on table with 1000000
records

Thank you,
.--
Rahila Syed
Performance Engineer
2ndQuadrant
http://www.2ndQuadrant.com <http://www.2ndquadrant.com/>
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message David Rowley 2019-03-11 09:03:11 Re: Should we increase the default vacuum_cost_limit?
Previous Message Thomas Munro 2019-03-11 09:01:56 Re: I have some troubles to run test_shm_mq;