Re: WAL usage calculation patch

From: Julien Rouhaud <rjuju123(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Kuntal Ghosh <kuntalghosh(dot)2007(at)gmail(dot)com>, Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>, Kirill Bychik <kirill(dot)bychik(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Subject: Re: WAL usage calculation patch
Date: 2020-04-02 08:30:35
Message-ID: 20200402083035.GB64485@nol
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Apr 02, 2020 at 11:07:29AM +0530, Amit Kapila wrote:
> On Wed, Apr 1, 2020 at 8:00 PM Julien Rouhaud <rjuju123(at)gmail(dot)com> wrote:
> >
> > On Wed, Apr 01, 2020 at 04:29:16PM +0530, Amit Kapila wrote:
> > > 3. Doing some testing with and without parallelism to ensure WAL usage
> > > data is correct would be great and if possible, share the results?
> >
> >
> > I just saw that Dilip did some testing, but just in case here is some
> > additional one
> >
> > - vacuum, after a truncate, loading 1M row and a "UPDATE t1 SET id = id"
> >
> > =# select query, calls, wal_bytes, wal_records, wal_num_fpw from pg_stat_statements where query ilike '%vacuum%';
> > query | calls | wal_bytes | wal_records | wal_num_fpw
> > ------------------------+-------+-----------+-------------+-------------
> > vacuum (parallel 3) t1 | 1 | 20098962 | 34104 | 2
> > vacuum (parallel 0) t1 | 1 | 20098962 | 34104 | 2
> > (2 rows)
> >
> > - create index, overload t1's parallel_workers, using the 1M line just
> > vacuumed:
> >
> > =# alter table t1 set (parallel_workers = 2);
> > ALTER TABLE
> >
> > =# create index t1_parallel_2 on t1(id);
> > CREATE INDEX
> >
> > =# alter table t1 set (parallel_workers = 0);
> > ALTER TABLE
> >
> > =# create index t1_parallel_0 on t1(id);
> > CREATE INDEX
> >
> > =# select query, calls, wal_bytes, wal_records, wal_num_fpw from pg_stat_statements where query ilike '%create index%';
> > query | calls | wal_bytes | wal_records | wal_num_fpw
> > --------------------------------------+-------+-----------+-------------+-------------
> > create index t1_parallel_0 on t1(id) | 1 | 20355540 | 2762 | 2745
> > create index t1_parallel_2 on t1(id) | 1 | 20406811 | 2762 | 2758
> > (2 rows)
> >
> > It all looks good to me.
> >
>
> Here the wal_num_fpw and wal_bytes are different between parallel and
> non-parallel versions. Is it due to checkpoint or something else? We
> can probably rule out checkpoint by increasing checkpoint_timeout and
> other checkpoint related parameters.

I think this is because I did a checkpoint after the VACUUM tests, so the 1st
CREATE INDEX (with parallelism) induced some FPW on the catalog blocks. I
didn't try to investigate more since:

On Thu, Apr 02, 2020 at 11:22:16AM +0530, Amit Kapila wrote:
>
> Also, I forgot to mention that let's not base this on buffer usage
> patch for create index
> (v10-0002-Allow-parallel-index-creation-to-accumulate-buff) because as
> per recent discussion I am not sure about its usefulness. I think we
> can proceed with this patch without
> v10-0002-Allow-parallel-index-creation-to-accumulate-buff as well.

Which is done in attached v11.

> > > 5.
> > > -SELECT query, calls, rows FROM pg_stat_statements ORDER BY query COLLATE "C";
> > > - query | calls | rows
> > > ------------------------------------+-------+------
> > > - SELECT $1::TEXT | 1 | 1
> > > - SELECT PLUS_ONE($1) | 2 | 2
> > > - SELECT PLUS_TWO($1) | 2 | 2
> > > - SELECT pg_stat_statements_reset() | 1 | 1
> > > +SELECT query, calls, rows, wal_bytes, wal_records FROM
> > > pg_stat_statements ORDER BY query COLLATE "C";
> > > + query | calls | rows | wal_bytes | wal_records
> > > +-----------------------------------+-------+------+-----------+-------------
> > > + SELECT $1::TEXT | 1 | 1 | 0 | 0
> > > + SELECT PLUS_ONE($1) | 2 | 2 | 0 | 0
> > > + SELECT PLUS_TWO($1) | 2 | 2 | 0 | 0
> > > + SELECT pg_stat_statements_reset() | 1 | 1 | 0 | 0
> > > (4 rows)
> > >
> > > Again, I am not sure if these modifications make much sense?
> >
> >
> > Those are queries that were previously executed. As those are read-only query,
> > that are pretty much guaranteed to not cause any WAL activity, I don't see how
> > it hurts to test at the same time that that's we indeed record with
> > pg_stat_statements, just to be safe.
> >
>
> On a similar theory, one could have checked bufferusage stats as well.
> The statements are using some expressions so don't see any value in
> check all usage data for such statements.

Dropped.

> Right now, that particular patch is not getting applied (probably due
> to recent commit 17e0328224). Can you rebase it?

Done.

> > > v9-0004-Add-option-to-report-WAL-usage-in-EXPLAIN-and-aut
> > >
> > > 3.
> > > + if (usage->wal_num_fpw > 0)
> > > + appendStringInfo(es->str, " full page records=%ld",
> > > + usage->wal_num_fpw);
> > > + if (usage->wal_bytes > 0)
> > > + appendStringInfo(es->str, " bytes=" UINT64_FORMAT,
> > > + usage->wal_bytes);
> > >
> > > Shall we change to 'full page writes' or 'full page image' instead of
> > > full page records?
> >
> >
> > Indeed, I changed it in the (auto)vacuum output but missed this one. Fixed.
> >
>
> I don't see this change in the patch.

Yes, as Dilip reported I fixuped the wrong commit, sorry about that. This
version should now be ok.

On Thu, Apr 02, 2020 at 12:04:32PM +0530, Dilip Kumar wrote:
> On Wed, Apr 1, 2020 at 8:00 PM Julien Rouhaud <rjuju123(at)gmail(dot)com> wrote:
> >
> > > By tomorrow, I will try to finish reviewing 0005 and 0006.
>
> I have reviewed these patches and I have a few cosmetic comments.
> v10-0005-Keep-track-of-WAL-usage-in-pg_stat_statements
>
> 1.
> + uint64 wal_bytes; /* total amount of wal bytes written */
> + int64 wal_records; /* # of wal records written */
> + int64 wal_num_fpw; /* # of full page wal records written */
>
>
> /s/# of full page wal records written / /* # of WAL full page image produced */

Done, I also consistently s/wal/WAL/.

>
> 2.
> static void pgss_ProcessUtility(PlannedStmt *pstmt, const char *queryString,
> ProcessUtilityContext context, ParamListInfo params,
> QueryEnvironment *queryEnv,
> - DestReceiver *dest, QueryCompletion *qc);
> + DestReceiver *dest, QueryCompletion * qc);
>
> Useless hunk.

Oops, leftover of a pgindent as QueryCompletion isn't in the typedefs yet. I
thought I discarded all the useless hunks but missed this one. Thanks, fixed.

>
> 3.
>
> v10-0006-Expose-WAL-usage-counters-in-verbose-auto-vacuum
>
> @@ -3105,7 +3105,7 @@ show_wal_usage(ExplainState *es, const WalUsage *usage)
> {
> ExplainPropertyInteger("WAL records", NULL,
> usage->wal_records, es);
> - ExplainPropertyInteger("WAL full page records", NULL,
> + ExplainPropertyInteger("WAL full page writes", NULL,
> usage->wal_num_fpw, es);
> Just noticed that in 0004 you have first added "WAL full page
> records", which is later corrected to "WAL full page writes" in 0006.
> I think we better keep this proper in 0004 itself and avoid this hunk
> in 0006, otherwise, it creates confusion while reviewing.

Oh, I didn't realized that I fixuped the wrong commit. Fixed.

I also adapted the documentation that mentioned full page records instead of
full page images, and integrated Justin's comment:

> In 0003:
> + /* Provide WAL update data to the instrumentation */
> Remove "data" ??

so changed to "Report WAL traffic to the instrumentation."

I didn't change the (auto)vacuum output yet (except fixing the s/full page
records/full page writes/ that I previously missed), as it's not clear what the
consensus is yet. I'll take care of that as soon as we reach to a consensus.

Attachment Content-Type Size
v11-0001-Add-infrastructure-to-track-WAL-usage.patch text/plain 22.1 KB
v11-0002-Add-option-to-report-WAL-usage-in-EXPLAIN-and-au.patch text/plain 11.3 KB
v11-0003-Keep-track-of-WAL-usage-in-pg_stat_statements.patch text/plain 12.9 KB
v11-0004-Expose-WAL-usage-counters-in-verbose-auto-vacuum.patch text/plain 2.8 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alvaro Herrera 2020-04-02 08:43:28 Re: Add A Glossary
Previous Message Kyotaro Horiguchi 2020-04-02 08:25:40 EINTR while resizing dsm segment.