Re: pg_stat_statements issue with parallel maintenance (Was Re: WAL usage calculation patch)

From: Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com>
To: Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_stat_statements issue with parallel maintenance (Was Re: WAL usage calculation patch)
Date: 2020-04-07 07:59:39
Message-ID: CA+fd4k4gaY0OqQiPnhxQiRvb7fM0Gr4CaY9m38DiihucQ0xjEg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, 7 Apr 2020 at 02:40, Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
>
> On Mon, Apr 6, 2020 at 2:21 AM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > AFAIU, it uses heapam_index_build_range_scan but for writing to index,
> > it doesn't use buffer manager.
>
> Right. It doesn't need to use the buffer manager to write to the
> index, unlike (say) GIN's CREATE INDEX.

Hmm, after more thoughts and testing, it seems to me that parallel
btree index creation uses buffer manager while scanning the table in
parallel, i.e in heapam_index_build_range_scan, which affects
shared_blks_xxx in pg_stat_statements. I've some parallel create index
tests with the current HEAD and with the attached patch. The table has
44248 blocks.

HEAD, no workers:

-[ RECORD 1 ]-------+----------
total_plan_time | 0
total_plan_time | 0
shared_blks_hit | 148
shared_blks_read | 44281
total_read_blks | 44429
shared_blks_dirtied | 44261
shared_blks_written | 24644
wal_records | 71693
wal_num_fpw | 71682
wal_bytes | 566815038

HEAD, 4 workers:

-[ RECORD 1 ]-------+----------
total_plan_time | 0
total_plan_time | 0
shared_blks_hit | 160
shared_blks_read | 8892
total_read_blks | 9052
shared_blks_dirtied | 8871
shared_blks_written | 5342
wal_records | 71693
wal_num_fpw | 71682
wal_bytes | 566815038

The WAL usage statistics are good but the buffer usage statistics seem
not correct.

Patched, no workers:

-[ RECORD 1 ]-------+----------
total_plan_time | 0
total_plan_time | 0
shared_blks_hit | 148
shared_blks_read | 44281
total_read_blks | 44429
shared_blks_dirtied | 44261
shared_blks_written | 24843
wal_records | 71693
wal_num_fpw | 71682
wal_bytes | 566815038

Patched, 4 workers:

-[ RECORD 1 ]-------+----------
total_plan_time | 0
total_plan_time | 0
shared_blks_hit | 172
shared_blks_read | 44282
total_read_blks | 44454
shared_blks_dirtied | 44261
shared_blks_written | 26968
wal_records | 71693
wal_num_fpw | 71682
wal_bytes | 566815038

Buffer usage statistics seem correct. The small differences would be
catalog lookups Peter mentioned.

Regards,

--
Masahiko Sawada http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment Content-Type Size
bufferusage_create_index_v4.patch application/octet-stream 5.5 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2020-04-07 08:07:44 Re: [patch] Fix pg_checksums to allow checking of offline base backup directories
Previous Message Kyotaro Horiguchi 2020-04-07 07:48:01 Re: Don't try fetching future segment of a TLI.