pg_stat_progress_create_index vs. parallel index builds

From: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
To: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: pg_stat_progress_create_index vs. parallel index builds
Date: 2021-06-02 11:56:55
Message-ID: 1128176d-1eee-55d4-37ca-e63644422adb@enterprisedb.com
Lists: pgsql-hackers

Hi,

While experimenting with parallel index builds, I've noticed a somewhat
strange behavior of pg_stat_progress_create_index when a btree index is
built with parallel workers - some of the phases seem to be missing.

In serial (no parallelism) mode, the progress looks roughly like this
(showing only the first and last timestamp of each phase):

    time     |   command    | phase
-------------+--------------+----------------------------------------
 12:56:01 AM | CREATE INDEX | building index: scanning table
 ...
 01:06:22 AM | CREATE INDEX | building index: scanning table
 01:06:23 AM | CREATE INDEX | building index: sorting live tuples
 ...
 01:13:10 AM | CREATE INDEX | building index: sorting live tuples
 01:13:11 AM | CREATE INDEX | building index: loading tuples in tree
 ...
 01:24:02 AM | CREATE INDEX | building index: loading tuples in tree

So it goes through three phases:

1) scanning table
2) sorting live tuples
3) loading tuples in tree
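
For reference, a first/last summary like the one above can be derived from
periodic samples of the view. The following is only a minimal Python sketch,
assuming the samples were already collected as (timestamp, phase) pairs, for
example by polling "SELECT now(), command, phase FROM
pg_stat_progress_create_index" from another session. The function name
phase_spans and the sample data are hypothetical and not part of the original
report.

```python
from itertools import groupby


def phase_spans(samples):
    """Collapse consecutive samples sharing a phase into (first, last, phase)."""
    spans = []
    # groupby only merges adjacent rows, so a phase that reappears later
    # would show up as a separate span instead of being silently merged.
    for phase, group in groupby(samples, key=lambda row: row[1]):
        rows = list(group)
        spans.append((rows[0][0], rows[-1][0], phase))
    return spans


# Hypothetical samples shaped like the serial run above (timestamps as strings).
samples = [
    ("12:56:01", "building index: scanning table"),
    ("01:06:22", "building index: scanning table"),
    ("01:06:23", "building index: sorting live tuples"),
    ("01:13:10", "building index: sorting live tuples"),
    ("01:13:11", "building index: loading tuples in tree"),
    ("01:24:02", "building index: loading tuples in tree"),
]

for first, last, phase in phase_spans(samples):
    print(first, last, phase)
```

Grouping only adjacent samples is a deliberate choice here: it keeps the order
of phases visible, so a phase that never gets reported simply does not appear
in the output.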

But with a parallel index build, it changes to:

    time     |   command    | phase
-------------+--------------+----------------------------------------
 11:40:48 AM | CREATE INDEX | building index: scanning table
 ...
 11:47:24 AM | CREATE INDEX | building index: scanning table (scan complete)
 11:56:22 AM | CREATE INDEX | building index: scanning table
 11:56:23 AM | CREATE INDEX | building index: loading tuples in tree
 ...
 12:05:33 PM | CREATE INDEX | building index: loading tuples in tree

That is, the "sorting live tuples" phase disappeared; the time spent
sorting seems to be counted in the "scanning table" phase instead, as
if an update of the phase were missing.

I've only tried this on master, but I assume it behaves like this in the
older releases too. I wonder if this is intentional - it sure is a bit
misleading.

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
