Re: monitoring CREATE INDEX [CONCURRENTLY]

From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Pg Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, David Fetter <david(at)fetter(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>, Rahila Syed <rahila(dot)syed(at)2ndquadrant(dot)com>, Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>, Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>
Subject: Re: monitoring CREATE INDEX [CONCURRENTLY]
Date: 2019-03-28 16:07:21
Message-ID: CANP8+j+6Z4+yON9k+kw3jBOiRYqnXvnSHysV_iPAsc4AvRyWKQ@mail.gmail.com
Lists: pgsql-hackers

On Thu, 28 Mar 2019 at 15:39, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
wrote:

> On 2019-Mar-28, Simon Riggs wrote:
>
> > On Thu, 28 Mar 2019 at 14:56, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
> > wrote:
> >
> > > I have not reinstated phase numbers; I have Rahila's positive vote for
> > > them. Do I hear any more votes on this issue?
> >
> > If there is a specific technical issue, I'd like to understand that more.
>
> There's no technical issue -- that's pretty straightforward. Earlier
> versions of the patch had them, and removing them only meant editing
> strings in a couple of places.
>
> > If it is just a usability preference, then I say we should have numbers.
> >
> > Numbering is natural for people. If we say "It's currently doing phase
> > XYZ", they will say "Is that the 3rd phase?", we'll say "No, actually the
> > 5th", and then they will say "Why didn't you just number them?"
>
> There are eight phases. If you run normal CREATE INDEX (not concurrent)
> then you get phase 1, then phase 3, and you're done. If you run CIC you
> get phases 1 through 8. Phase 3 "building index" has arbitrary subphases
> (they depend on the AM) in both cases.
>

Maybe the AM won't know, but I don't think that matters. It's still useful
to know the difference between Phase 3.3 and Phase 3.33 and Phase 7.

The description only helps you if you understand what it means. If your AM
reports something many users wouldn't understand, such as "сортировка" or
"constructing triples", we still want to know where that step fits in the
overall sequence of steps.
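
To illustrate the point, here is a minimal sketch of a label that carries
the position alongside whatever description the AM supplies. The function
name and the "N.M of T" format are hypothetical, not anything in the patch;
the total of 8 comes from the phase count quoted above.

```python
def format_phase_label(phase, total_phases, description, subphase=None):
    """Build a progress label that shows position as well as description.

    All names and the label layout here are illustrative assumptions.
    """
    # Use "3.2" style when the AM reports a sub-phase, plain "3" otherwise.
    position = f"{phase}.{subphase}" if subphase is not None else str(phase)
    return f"Phase {position} of {total_phases}: {description}"


# Even if the description is opaque to the reader, the position is not.
print(format_phase_label(3, 8, "сортировка", subphase=2))
print(format_phase_label(7, 8, "constructing triples"))
```

A reader who cannot parse the description can still see that the first
label belongs to phase 3 and the second to phase 7.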

> I think the lack of phase numbering comes from the fact that the first
> command we did (VACUUM) sometimes jumps backwards in phase numbers, so
> it would be a bit absurd from the users' POV.
>

That seems to be a consequence of how we label the phases, rather than an
inherent problem with numbering. The number should reflect the ordinal
position of the step as executed. So if a VACUUM requires two rounds of
index scanning, the heap scan phase (normally phase 3) should be labelled
phase 6 when it occurs the second time, rather than "phase 3 again, doh",
which clearly doesn't work.
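
A rough sketch of ordinal numbering, to make the idea concrete. The tracker
class is hypothetical, and the three-phase cycle is a simplification that
follows the numbering used in this mail (phase 2 index scanning, phase 3
heap scan); the name of phase 1 is my own assumption.

```python
class OrdinalStepTracker:
    """Assign each executed step the next ordinal number, repeats included."""

    def __init__(self):
        self.step = 0
        self.history = []  # list of (ordinal, phase_name) in execution order

    def begin(self, phase_name):
        # The counter only ever moves forward, so numbers never jump back.
        self.step += 1
        self.history.append((self.step, phase_name))
        return self.step


# Hypothetical simplified cycle; not the actual VACUUM phase list.
CYCLE = ["collect dead tuples", "index scanning", "heap scan"]

tracker = OrdinalStepTracker()
for _ in range(2):  # two rounds of index scanning
    for name in CYCLE:
        tracker.begin(name)

for ordinal, name in tracker.history:
    print(f"Phase {ordinal}: {name}")
```

With two rounds, the second heap scan is reported as phase 6 rather than as
a return to phase 3.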

By the time VACUUM moves to its 2nd phase, which is normally thought of as
"Phase 2: Index Scanning", we know how much of the table has been scanned,
so we really should be able to calculate how many more phases will be needed.
We also know how many AM sub-phases will be called for that step.
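
As a back-of-envelope sketch of that calculation: if the first round had to
stop after scanning some fraction of the table, and we assume later rounds
cover about the same number of pages (which is only true if dead tuples are
spread roughly evenly), the number of rounds follows directly. The function
name, the three-steps-per-round figure, and the uniformity assumption are
all mine, not taken from the VACUUM code.

```python
import math

STEPS_PER_ROUND = 3  # assumption: one simplified three-phase cycle per round


def estimate_total_steps(pages_scanned, total_pages,
                         steps_per_round=STEPS_PER_ROUND):
    """Estimate the total ordinal step count once the first round ends.

    pages_scanned: heap pages covered when the first index scan begins.
    total_pages:   heap pages in the table.
    """
    if pages_scanned <= 0 or total_pages <= 0:
        raise ValueError("page counts must be positive")
    if pages_scanned > total_pages:
        raise ValueError("cannot have scanned more pages than the table has")
    # Assume each later round covers about as many pages as the first one.
    rounds = math.ceil(total_pages / pages_scanned)
    return rounds * steps_per_round


# Half the table scanned before the first index scan: expect two rounds.
print(estimate_total_steps(500, 1000))
```

The estimate could be revised at the start of each later round, since the
number of pages covered per round is then known more precisely.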

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
