Re: WAL usage calculation patch

From: Julien Rouhaud <rjuju123(at)gmail(dot)com>
To: Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>
Cc: Kirill Bychik <kirill(dot)bychik(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Subject: Re: WAL usage calculation patch
Date: 2020-03-27 08:51:59
Message-ID: CAOBaU_bb6gf-33qCsMiZYy6EupqAcT5xga3iM8q-=-Ap=+9w_w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Mar 23, 2020 at 3:24 PM Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com> wrote:
>
> On 2020/03/23 21:01, Fujii Masao wrote:
> >
> >
> > On 2020/03/23 7:32, Kirill Bychik wrote:
> >>>>> I'm attaching a v5 with fp records only for temp tables, so there's no risk of
> >>>>> instability. As I previously said I'm fine with your two patches, so unless
> >>>>> you have objections on the fpi test for temp tables or the documentation
> >>>>> changes, I believe those should be ready for committer.
> >>>>
> >>>> You added the columns into pg_stat_database, but seem to forget to
> >>>> update the document for pg_stat_database.
> >>>
> >>> Ah right, I totally missed that when I tried to clean up the original POC.
> >>>
> >>>> Is it really reasonable to add the columns for vacuum's WAL usage into
> >>>> pg_stat_database? I'm not sure how much the information about
> >>>> the amount of WAL generated by vacuum per database is useful.
> >>>
> >>> The amount per database isn't really useful, but I didn't had a better idea on
> >>> how to expose (auto)vacuum WAL usage until this:
> >>>
> >>>> Isn't it better to make VACUUM VERBOSE and autovacuum log include
> >>>> that information, instead, to see how much each vacuum activity
> >>>> generates the WAL? Sorry if this discussion has already been done
> >>>> upthread.
> >>>
> >>> That's a way better idea! I'm attaching the full patchset with the 3rd patch
> >>> to use this approach instead. There's a bit a duplicate code for computing the
> >>> WalUsage, as I didn't find a better way to avoid that without exposing
> >>> WalUsageAccumDiff().
> >>>
> >>> Autovacuum log sample:
> >>>
> >>> 2020-03-19 15:49:05.708 CET [5843] LOG: automatic vacuum of table "rjuju.public.t1": index scans: 0
> >>> pages: 0 removed, 2213 remain, 0 skipped due to pins, 0 skipped frozen
> >>> tuples: 250000 removed, 250000 remain, 0 are dead but not yet removable, oldest xmin: 502
> >>> buffer usage: 4448 hits, 4 misses, 4 dirtied
> >>> avg read rate: 0.160 MB/s, avg write rate: 0.160 MB/s
> >>> system usage: CPU: user: 0.13 s, system: 0.00 s, elapsed: 0.19 s
> >>> WAL usage: 6643 records, 4 full page records, 1402679 bytes
> >>>
> >>> VACUUM log sample:
> >>>
> >>> # vacuum VERBOSE t1;
> >>> INFO: vacuuming "public.t1"
> >>> INFO: "t1": removed 50000 row versions in 443 pages
> >>> INFO: "t1": found 50000 removable, 0 nonremovable row versions in 443 out of 443 pages
> >>> DETAIL: 0 dead row versions cannot be removed yet, oldest xmin: 512
> >>> There were 50000 unused item identifiers.
> >>> Skipped 0 pages due to buffer pins, 0 frozen pages.
> >>> 0 pages are entirely empty.
> >>> 1332 WAL records, 4 WAL full page records, 306901 WAL bytes
> >>> CPU: user: 0.01 s, system: 0.00 s, elapsed: 0.01 s.
> >>> INFO: "t1": truncated 443 to 0 pages
> >>> DETAIL: CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s
> >>> INFO: vacuuming "pg_toast.pg_toast_16385"
> >>> INFO: index "pg_toast_16385_index" now contains 0 row versions in 1 pages
> >>> DETAIL: 0 index row versions were removed.
> >>> 0 index pages have been deleted, 0 are currently reusable.
> >>> CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
> >>> INFO: "pg_toast_16385": found 0 removable, 0 nonremovable row versions in 0 out of 0 pages
> >>> DETAIL: 0 dead row versions cannot be removed yet, oldest xmin: 513
> >>> There were 0 unused item identifiers.
> >>> Skipped 0 pages due to buffer pins, 0 frozen pages.
> >>> 0 pages are entirely empty.
> >>> 0 WAL records, 0 WAL full page records, 0 WAL bytes
> >>> CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
> >>> VACUUM
> >>>
> >>> Note that the 3rd patch is an addition on top of Kirill's original patch, as
> >>> this is information that would have been greatly helpful to investigate in some
> >>> performance issues I had to investigate recently. I'd be happy to have it land
> >>> into v13, but if that's controversial or too late I'm happy to postpone it to
> >>> v14 if the infrastructure added in Kirill's patches can make it to v13.
> >>
> >> Dear all, can we please focus on getting the core patch committed?
> >> Given the uncertainity regarding autovacuum stats, can we please get
> >> parts 1 and 2 into the codebase, and think about exposing autovacuum
> >> stats later?
> >
> > Here are the comments for 0001 patch.
> >
> > + /*
> > + * Report a full page image constructed for the WAL record
> > + */
> > + pgWalUsage.wal_fp_records++;
> >
> > Isn't it better to use "fpw" or "fpi" for the variable name rather than
> > "fp" here? In other places, "fpw" and "fpi" are used for full page
> > writes/image.
> >
> > ISTM that this counter could be incorrect if XLogInsertRecord() determines to
> > calculate again whether FPI is necessary or not. No? IOW, this issue could
> > happen if XLogInsert() calls XLogRecordAssemble() multiple times in
> > its do-while loop. Isn't this problematic?
> >
> > + long wal_bytes; /* size of wal records produced */
> >
> > Isn't it safer to use uint64 (i.e., XLogRecPtr) as the type of this variable
> > rather than long?
> >
> > + shm_toc_insert(pcxt->toc, PARALLEL_KEY_WAL_USAGE, bufusage_space);
> >
> > bufusage_space should be walusage_space here?
> >
> > /*
> > * Finish parallel execution. We wait for parallel workers to finish, and
> > * accumulate their buffer usage.
> > */
> >
> > There are some comments mentioning buffer usage, in execParallel.c.
> > For example, the top comment for ExecParallelFinish(), as the above.
> > These should be updated.
>
> Here are the comments for 0002 patch.
>
> + OUT wal_write_bytes int8,
> + OUT wal_write_records int8,
> + OUT wal_write_fp_records int8
>
> Isn't "write" part in the column names confusing because it's WAL
> *generated* (not written) by the statement?
>
> +RETURNS SETOF record
> +AS 'MODULE_PATHNAME', 'pg_stat_statements_1_4'
> +LANGUAGE C STRICT VOLATILE;
>
> PARALLEL SAFE should be specified?
>
> +/* contrib/pg_stat_statements/pg_stat_statements--1.7--1.8.sql */
>
> ISTM it's good timing to have also pg_stat_statements--1.8.sql since
> the definition of pg_stat_statements() is changed. Thought?
>
> +-- CHECKPOINT before WAL tests to ensure test stability
> +CHECKPOINT;
>
> Is this true? I thought you added this because the number of FPI
> should be larger than zero in the subsequent test. No? But there
> seems no such test. I'm not excited about adding the test checking
> the number of FPI because it looks fragile, though...
>
> +UPDATE pgss_test SET b = '333' WHERE a = 3 \;
> +UPDATE pgss_test SET b = '444' WHERE a = 4 ;
>
> Could you tell me why several queries need to be run to test
> the WAL usage? Isn't running a few query enough for the test purpase?

FTR I marked the commitfest entry as waiting on author.

Kirill do you think you'll have time to address Fuji-san's review
shortly? The end of the commitfest is approaching quite fast :(

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Fujii Masao 2020-03-27 08:54:47 Re: Some problems of recovery conflict wait events
Previous Message Masahiko Sawada 2020-03-27 07:58:52 Re: error context for vacuum to include block number