Re: WAL usage calculation patch

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Julien Rouhaud <rjuju123(at)gmail(dot)com>
Cc: Kuntal Ghosh <kuntalghosh(dot)2007(at)gmail(dot)com>, Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>, Kirill Bychik <kirill(dot)bychik(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Subject: Re: WAL usage calculation patch
Date: 2020-04-02 09:02:07
Message-ID: CAA4eK1+CDwmKyJeDYHZ4xFftK9TRW5pF+kkPsgGY8UAJqFNJKQ@mail.gmail.com
Lists: pgsql-hackers

On Thu, Apr 2, 2020 at 2:00 PM Julien Rouhaud <rjuju123(at)gmail(dot)com> wrote:
>
> On Thu, Apr 02, 2020 at 11:07:29AM +0530, Amit Kapila wrote:
> > On Wed, Apr 1, 2020 at 8:00 PM Julien Rouhaud <rjuju123(at)gmail(dot)com> wrote:
> > >
> > > On Wed, Apr 01, 2020 at 04:29:16PM +0530, Amit Kapila wrote:
> > > > 3. Doing some testing with and without parallelism to ensure WAL usage
> > > > data is correct would be great and if possible, share the results?
> > >
> > >
> > > I just saw that Dilip did some testing, but just in case, here are some
> > > additional tests:
> > >
> > > - vacuum, after a truncate, loading 1M rows and an "UPDATE t1 SET id = id"
> > >
> > > =# select query, calls, wal_bytes, wal_records, wal_num_fpw from pg_stat_statements where query ilike '%vacuum%';
> > >          query          | calls | wal_bytes | wal_records | wal_num_fpw
> > > ------------------------+-------+-----------+-------------+-------------
> > >  vacuum (parallel 3) t1 |     1 |  20098962 |       34104 |           2
> > >  vacuum (parallel 0) t1 |     1 |  20098962 |       34104 |           2
> > > (2 rows)
> > >
> > > - create index, overriding t1's parallel_workers, using the 1M rows just
> > > vacuumed:
> > >
> > > =# alter table t1 set (parallel_workers = 2);
> > > ALTER TABLE
> > >
> > > =# create index t1_parallel_2 on t1(id);
> > > CREATE INDEX
> > >
> > > =# alter table t1 set (parallel_workers = 0);
> > > ALTER TABLE
> > >
> > > =# create index t1_parallel_0 on t1(id);
> > > CREATE INDEX
> > >
> > > =# select query, calls, wal_bytes, wal_records, wal_num_fpw from pg_stat_statements where query ilike '%create index%';
> > >                 query                 | calls | wal_bytes | wal_records | wal_num_fpw
> > > --------------------------------------+-------+-----------+-------------+-------------
> > >  create index t1_parallel_0 on t1(id) |     1 |  20355540 |        2762 |        2745
> > >  create index t1_parallel_2 on t1(id) |     1 |  20406811 |        2762 |        2758
> > > (2 rows)
> > >
> > > It all looks good to me.
> > >
> >
> > Here the wal_num_fpw and wal_bytes are different between parallel and
> > non-parallel versions. Is it due to checkpoint or something else? We
> > can probably rule out checkpoint by increasing checkpoint_timeout and
> > other checkpoint related parameters.
>
> I think this is because I did a checkpoint after the VACUUM tests, so the 1st
> CREATE INDEX (with parallelism) induced some FPW on the catalog blocks. I
> didn't try to investigate more since:
>

We need to investigate this.
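
As a rough sanity check (plain arithmetic on the figures quoted above,
nothing measured anew): both CREATE INDEX rows report the same
wal_records (2762), so the wal_bytes gap should come from the extra
full-page images alone. Something like the following, runnable in any
psql session, makes that explicit:

```sql
-- Figures copied from the pg_stat_statements output quoted above;
-- the column aliases are only illustrative names.
SELECT 20406811 - 20355540 AS extra_wal_bytes,   -- parallel_2 minus parallel_0
       2758 - 2745         AS extra_fpw,         -- extra full-page writes
       round((20406811 - 20355540)::numeric / (2758 - 2745), 1)
                           AS avg_bytes_per_extra_fpw;
```

That is 51271 extra bytes for 13 extra FPWs, roughly 3.9 kB each, which
is at least plausible for full-page images, but it does not tell us
which pages they were.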

> On Thu, Apr 02, 2020 at 11:22:16AM +0530, Amit Kapila wrote:
> >
> > Also, I forgot to mention: let's not base this on the buffer usage
> > patch for create index
> > (v10-0002-Allow-parallel-index-creation-to-accumulate-buff) because,
> > as per the recent discussion, I am not sure about its usefulness. I
> > think we can proceed with this patch without
> > v10-0002-Allow-parallel-index-creation-to-accumulate-buff as well.
>
>
> Which is done in attached v11.
>

Hmm, I didn't suggest removing the WAL usage tracking from parallel
create index. I only said not to use the infrastructure of the other
patch. The index build bypasses the buffer manager but does write WAL;
see _bt_blwritepage->log_newpage. So we need to accumulate WAL usage
even if we decide not to do anything about BufferUsage, which means we
still need to investigate the above inconsistency in wal_num_fpw and
wal_bytes between the parallel and non-parallel versions.
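
One way the re-test could be arranged so that a checkpoint is unlikely
to land between the two builds is sketched below. This is untested; the
setting values and the t1_warmup index are only assumptions for
illustration, and it assumes t1_parallel_2 and t1_parallel_0 do not
exist yet. The wal_* columns are the ones added by the patch under
discussion.

```sql
-- Push automatic checkpoints far away (values are arbitrary assumptions).
ALTER SYSTEM SET checkpoint_timeout = '1h';
ALTER SYSTEM SET max_wal_size = '20GB';
SELECT pg_reload_conf();

-- Take a checkpoint now, then do one throwaway build so that the first
-- post-checkpoint modification of the catalog pages (and the FPWs it
-- triggers) happens before anything is measured.
CHECKPOINT;
CREATE INDEX t1_warmup ON t1(id);   -- hypothetical warm-up index
DROP INDEX t1_warmup;
SELECT pg_stat_statements_reset();

-- Same statements as in the quoted test.
ALTER TABLE t1 SET (parallel_workers = 2);
CREATE INDEX t1_parallel_2 ON t1(id);
ALTER TABLE t1 SET (parallel_workers = 0);
CREATE INDEX t1_parallel_0 ON t1(id);

SELECT query, calls, wal_bytes, wal_records, wal_num_fpw
  FROM pg_stat_statements
 WHERE query ILIKE '%create index%';
```

If wal_num_fpw and wal_bytes still differ between the two rows after
this, the checkpoint explanation is probably not sufficient. The warm-up
build is not guaranteed to touch exactly the same catalog pages as the
measured builds, so a small residual difference would need a closer
look rather than being conclusive either way.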

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
