Re: WAL usage calculation patch

From: Julien Rouhaud <rjuju123(at)gmail(dot)com>
To: Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>
Cc: Kirill Bychik <kirill(dot)bychik(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, amit(dot)kapila16(at)gmail(dot)com
Subject: Re: WAL usage calculation patch
Date: 2020-03-29 12:19:44
Message-ID: 20200329121944.GA79261@nol
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Mar 23, 2020 at 11:24:50PM +0900, Fujii Masao wrote:
>
> > Here are the comments for 0001 patch.
> >
> > +            /*
> > +             * Report a full page image constructed for the WAL record
> > +             */
> > +            pgWalUsage.wal_fp_records++;
> >
> > Isn't it better to use "fpw" or "fpi" for the variable name rather than
> > "fp" here? In other places, "fpw" and "fpi" are used for full page
> > writes/image.

Agreed, I went with fpw.

> > ISTM that this counter could be incorrect if XLogInsertRecord() determines to
> > calculate again whether FPI is necessary or not. No? IOW, this issue could
> > happen if XLogInsert() calls  XLogRecordAssemble() multiple times in
> > its do-while loop. Isn't this problematic?

Yes probably. I also see while adding support for EXPLAIN/auto_explain that
the previous approach was incrementing both records and fpw_records, while it
should be only one of those for each record. I fixed this using the approach I
previously mentionned in [1] which seems to work just fine.

> > +    long        wal_bytes;        /* size of wal records produced */
> >
> > Isn't it safer to use uint64 (i.e., XLogRecPtr) as the type of this variable
> > rather than long?

Yes indeed. I switched to uint64, and modified everything accordingly (and
changed pgss to output numeric as there's no other way to handle unsigned int8)

> > +    shm_toc_insert(pcxt->toc, PARALLEL_KEY_WAL_USAGE, bufusage_space);
> >
> > bufusage_space should be walusage_space here?

Good catch, fixed.

> > /*
> >  * Finish parallel execution.  We wait for parallel workers to finish, and
> >  * accumulate their buffer usage.
> >  */
> >
> > There are some comments mentioning buffer usage, in execParallel.c.
> > For example, the top comment for ExecParallelFinish(), as the above.
> > These should be updated.

I went through all the file and quickly checked in other places, and I think I
fixed all required comments.

> Here are the comments for 0002 patch.
>
> + OUT wal_write_bytes int8,
> + OUT wal_write_records int8,
> + OUT wal_write_fp_records int8
>
> Isn't "write" part in the column names confusing because it's WAL
> *generated* (not written) by the statement?

Agreed, I simply dropped the "_write" part everywhere.

> +RETURNS SETOF record
> +AS 'MODULE_PATHNAME', 'pg_stat_statements_1_4'
> +LANGUAGE C STRICT VOLATILE;
>
> PARALLEL SAFE should be specified?

Indeed, fixed.

> +/* contrib/pg_stat_statements/pg_stat_statements--1.7--1.8.sql */
>
> ISTM it's good timing to have also pg_stat_statements--1.8.sql since
> the definition of pg_stat_statements() is changed. Thought?

As mentionned in other pgss thread, I think the general agreement is to never
provide full script anymore, so I didn't changed that.

> +-- CHECKPOINT before WAL tests to ensure test stability
> +CHECKPOINT;
>
> Is this true? I thought you added this because the number of FPI
> should be larger than zero in the subsequent test. No? But there
> seems no such test. I'm not excited about adding the test checking
> the number of FPI because it looks fragile, though...

It should ensure a FPW for each new block touch, but yes that's quite fragile.

Since I fixed the record / FPW record counters, I saw that this was actually
already broken as there was a mix of FPW and non-FPW, so I dropped the
checkpoint and just tested (wal_record + wal_fpw_record) instead.

> +UPDATE pgss_test SET b = '333' WHERE a = 3 \;
> +UPDATE pgss_test SET b = '444' WHERE a = 4 ;
>
> Could you tell me why several queries need to be run to test
> the WAL usage? Isn't running a few query enough for the test purpase?

As far as I can see it's used to test multiple scenario (single command /
multiple commands in or outside explicit transaction). It shouldn't add a lot
of overhead and since some commands are issues with "\;" it's also testing
proper query string isolation when multi-command query string is provided,
which doesn't seem like a bad idea. I didn't changed that but I'm not opposed
to remove some of the updates if needed.

Also, to answer Amit Kapila's comments about WAL records and parallel query, I
added support for both EXPLAIN and auto_explain (tab completion and
documentation are also updated), and using a simple table with an index, with
forced parallelism and no leader participation and concurrent update on the
same table, I could test that WAL usage is working as expected:

rjuju=# explain (analyze, wal, verbose) select * from t1;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------
Gather (cost=0.00..8805.05 rows=100010 width=14) (actual time=8.695..47.592 rows=100010 loops=1)
Output: id, val
Workers Planned: 2
Workers Launched: 2
WAL: records=204 bytes=86198
-> Parallel Seq Scan on public.t1 (cost=0.00..8805.05 rows=50005 width=14) (actual time=0.056..29.112 rows=50005 loops
Output: id, val
WAL: records=204 bytes=86198
Worker 0: actual time=0.060..28.995 rows=49593 loops=1
WAL: records=105 bytes=44222
Worker 1: actual time=0.052..29.230 rows=50417 loops=1
WAL: records=99 bytes=41976
Planning Time: 0.038 ms
Execution Time: 53.957 ms
(14 rows)

and the same query when nothing end up being modified:

rjuju=# explain (analyze, wal, verbose) select * from t1;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------
Gather (cost=0.00..8805.05 rows=100010 width=14) (actual time=9.413..48.187 rows=100010 loops=1)
Output: id, val
Workers Planned: 2
Workers Launched: 2
-> Parallel Seq Scan on public.t1 (cost=0.00..8805.05 rows=50005 width=14) (actual time=0.033..24.697 rows=50005 loops
Output: id, val
Worker 0: actual time=0.028..24.786 rows=50447 loops=1
Worker 1: actual time=0.038..24.609 rows=49563 loops=1
Planning Time: 0.282 ms
Execution Time: 55.643 ms
(10 rows)

So it seems to me that WAL usage infrastructure for parallel query is working
just fine. I added the EXPLAIN/auto_explain in a separate commit just in case.

[1] https://www.postgresql.org/message-id/CAOBaU_aECK1Z7Nn+x=MhvEwrJzK8wyPsPtWAafjqtZN1fYjEmg@mail.gmail.com

Attachment Content-Type Size
v7-0001-Add-infrastructure-to-track-WAL-usage.patch text/plain 15.0 KB
v7-0002-Add-option-to-report-WAL-usage-in-EXPLAIN-and-aut.patch text/plain 11.3 KB
v7-0003-Keep-track-of-WAL-usage-in-pg_stat_statements.patch text/plain 22.0 KB
v7-0004-Expose-WAL-usage-counters-in-verbose-auto-vacuum-.patch text/plain 3.0 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Julien Rouhaud 2020-03-29 12:31:41 Re: WAL usage calculation patch
Previous Message Masahiko Sawada 2020-03-29 11:44:24 Re: pg_stat_statements issue with parallel maintenance (Was Re: WAL usage calculation patch)