Re: Performance degradation of REFRESH MATERIALIZED VIEW

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>, Anastasia Lubennikova <a(dot)lubennikova(at)postgrespro(dot)ru>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Paul Guo <guopa(at)vmware(dot)com>
Subject: Re: Performance degradation of REFRESH MATERIALIZED VIEW
Date: 2021-05-19 02:56:40
Message-ID: CAD21AoBnP_BvJAy-bgweQr1_WJ8wPCfOuyf_Nfbcsmkn9mHruA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, May 19, 2021 at 3:08 AM Andres Freund <andres(at)anarazel(dot)de> wrote:
>
> Hi,
>
> On 2021-05-18 11:20:07 +0900, Masahiko Sawada wrote:
> > Yes. It depends on how much the matview refresh gets slower but I
> > think the problem here is that users always are forced to pay the cost
> > for freezing tuple during refreshing the matview. There is no way to
> > disable it unlike FREEZE option of COPY command.
> >
> > I’ve done benchmarks for matview refresh on my machine (FreeBSD 12.1,
> > AMD Ryzen 5 PRO 3400GE, 24GB RAM) with four codes: HEAD, HEAD +
> > Andres’s patch, one before 39b66a91b, and HEAD without
> > TABLE_INSERT_FROZEN.
> >
> > The workload is to refresh the matview that simply selects 50M tuples
> > (about 1.7 GB). Here are the average execution times of three trials
> > for each code:
> >
> > 1) head: 42.263 sec
> > 2) head w/ Andres’s patch: 40.194 sec
> > 3) before 39b66a91b commit: 38.143 sec
> > 4) head w/o freezing tuples: 32.413 sec
>
> I don't see such a big difference between andres-freeze/non-freeze. Is
> there any chance there's some noise in there? I found that I need to
> disable autovacuum and ensure that there's a checkpoint just before the
> REFRESH to get halfway meaningful numbers, as well as a min/max_wal_size
> ensuring that only recycled WAL is used.

I've ran the same benchmarks with the following parameters:

shared_buffers = 10GB
max_wal_size = 50GB
min_wal_size = 50GB
checkpoint_timeout = 1h
maintenance_work_mem = 1GB
work_mem = 512MB
autovacuum = off

1) head: 42.397 sec
2) head w/ Andres’s patch: 34.857 sec
3) before 39b66a91b commit: 32.556 sec
4) head w/o freezing tuples: 32.752 sec

There is 6% degradation between 2 and 4 but 2 is much better than the
previous tests.

>
>
> > I also observed 5% degradation by comparing 1 and 2 but am not sure
> > where the overhead came from. I agree with Andres’s proposal. It’s a
> > straightforward approach.
>
> What degradation are you referencing here?

Sorry, I meant comparing 2 to 3 and 4.

Regards,

--
Masahiko Sawada
EDB: https://www.enterprisedb.com/

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message osumi.takamichi@fujitsu.com 2021-05-19 02:58:01 RE: Forget close an open relation in ReorderBufferProcessTXN()
Previous Message Kyotaro Horiguchi 2021-05-19 02:50:52 Re: Multiple pg_waldump --rmgr options