Re: pg_stat_statements issue with parallel maintenance (Was Re: WAL usage calculation patch)

From: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
To: Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_stat_statements issue with parallel maintenance (Was Re: WAL usage calculation patch)
Date: 2020-04-02 04:43:31
Message-ID: CAFiTN-tFpy9y5Sv0uLpoT8FH7wgwBe56Tsi7_YON4MBsc1UGzQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Apr 2, 2020 at 9:13 AM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
>
> On Thu, Apr 2, 2020 at 8:34 AM Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
> >
> > On Wed, Apr 1, 2020 at 7:52 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > > Peter, Is this behavior expected?
> > >
> > > Let me summarize the situation so that it would be easier for Peter to
> > > comment. Julien has noticed that parallel vacuum and parallel create
> > > index doesn't seem to report correct values for buffer usage stats.
> > > Sawada-San wrote a patch to fix the problem for both the cases. We
> > > expect that 'total_read_blks' as reported in pg_stat_statements should
> > > give the same value for parallel and non-parallel operations. We see
> > > that is true for parallel vacuum and previously we have the same
> > > observation for the parallel query. Now, for parallel create index
> > > this doesn't seem to be true as test results by Dilip show that. We
> > > have two possibilities here (a) there is some bug in Sawada-San's
> > > patch or (b) this is expected behavior for parallel create index.
> > > What do you think?
> >
> > nbtree CREATE INDEX doesn't even go through the buffer manager.
>
> Thanks for clarifying. So IIUC, it will not go through the buffer
> manager for the index pages, but for the heap pages, it will still go
> through the buffer manager.
>
> > The
> > difference that Dilip showed is probably due to extra catalog accesses
> > in the two parallel workers -- pg_amproc lookups, and the like. Those
> > are rather small differences, overall.
>
> > Can Dilip demonstrate the the "extra" buffer accesses are
> > proportionate to the number of workers launched in some constant,
> > predictable way?
>
> Okay, I will test this.

0-worker
query | total_time | shared_blks_hit |
shared_blks_read | total_read_blks | shared_blks_dirtied |
shared_blks_written
------------------------------+-------------+-----------------+------------------+-----------------+---------------------+---------------------
CREATE INDEX idx1 on test(a) | 1228.895057 | 8947 |
11 | 8971 | 5 |
0

1-worker
query | total_time | shared_blks_hit |
shared_blks_read | total_read_blks | shared_blks_dirtied |
shared_blks_written
------------------------------+-------------+-----------------+------------------+-----------------+---------------------+---------------------
CREATE INDEX idx1 on test(a) | 1006.157231 | 8962 |
12 | 8974 | 5 |
0

2-workers
query | total_time | shared_blks_hit |
shared_blks_read | total_read_blks | shared_blks_dirtied |
shared_blks_written
------------------------------+------------+-----------------+------------------+-----------------+---------------------+---------------------
CREATE INDEX idx1 on test(a) | 949.44663 | 8965 |
12 | 8977 | 5 | 0

3-workers
query | total_time | shared_blks_hit |
shared_blks_read | total_read_blks | shared_blks_dirtied |
shared_blks_written
------------------------------+-------------+-----------------+------------------+-----------------+---------------------+---------------------
CREATE INDEX idx1 on test(a) | 1037.297196 | 8968 |
12 | 8980 | 5 |
0

4-workers
query | total_time | shared_blks_hit |
shared_blks_read | total_read_blks | shared_blks_dirtied |
shared_blks_written
------------------------------+------------+-----------------+------------------+-----------------+---------------------+---------------------
CREATE INDEX idx1 on test(a) | 889.332782 | 8971 |
12 | 8983 | 6 | 0

You are right, it is increasing with some constant factor.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Andrey Lepikhov 2020-04-02 04:50:58 Re: NOT IN subquery optimization
Previous Message Amit Kapila 2020-04-02 04:43:18 User Interface for WAL usage data