Re: Performance degradation of REFRESH MATERIALIZED VIEW

From: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Michael Paquier <michael(at)paquier(dot)xyz>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>, Anastasia Lubennikova <a(dot)lubennikova(at)postgrespro(dot)ru>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Paul Guo <guopa(at)vmware(dot)com>
Subject: Re: Performance degradation of REFRESH MATERIALIZED VIEW
Date: 2021-05-24 10:37:18
Message-ID: 967b93e1-7ab5-fa40-8a2d-916d86e156bc@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 5/24/21 9:53 AM, Masahiko Sawada wrote:
> On Sat, May 22, 2021 at 3:10 AM Tomas Vondra
> <tomas(dot)vondra(at)enterprisedb(dot)com> wrote:
>>
>> On 5/21/21 6:43 PM, Andres Freund wrote:
>>> Hi,
>>>
>> > ...
>> >
>>>> Attached are the flame graphs for all three cases. The change in master is
>>>> pretty clearly visible, but I don't see any clear difference between old and
>>>> patched code :-(
>>>
>>> I'm pretty sure it's the additional WAL records?
>>>
>>
>> Not sure. If I understand what you suggested elsewhere in the thread, it
>> should be fine to modify heap_insert to pass the page recptr to
>> visibilitymap_set, roughly per the attached patch.
>>
>> I'm not sure it's correct, but it does eliminate the Heap2/VISIBILITY
>> records for me (when applied on top of your patch). Funnily enough it
>> does make it a wee bit slower:
>>
>> patch #1: 56941.505
>> patch #2: 58099.788
>>
>> I wonder if this might be due to -fno-omit-frame-pointer, though, as
>> without it I get these timings:
>>
>> 0c7d3bb99: 25540.417
>> master: 31868.236
>> patch #1: 26566.199
>> patch #2: 26487.943
>>
>> So without the frame pointers there's no slowdown, but there's no clear
>> improvement after removal of the WAL records either :-(
>
> Can we verify that the additional WAL records are the cause of this
> difference by making the matview unlogged by manually updating
> relpersistence = 'u'?
>
> Here are the results of benchmarks with unlogged matviews on my environment:
>
> 1) head: 22.927 sec
> 2) head w/ Andres’s patch: 16.629 sec
> 3) before 39b66a91b commit: 15.377 sec
> 4) head w/o freezing tuples: 14.551 sec
>
> And here are the results of logged matviews ICYMI:
>
> 1) head: 42.397 sec
> 2) head w/ Andres’s patch: 34.857 sec
> 3) before 39b66a91b commit: 32.556 sec
> 4) head w/o freezing tuples: 32.752 sec
>
> There seems no difference in the tendency. Which means the additional
> WAL is not the culprit?
>

Yeah, I agree the WAL does not seem to be the culprit here.

The patch I posted skips the WAL logging entirely (verified by
pg_waldump, although I have not mentioned that), and there's no clear
improvement. (FWIW I'm not sure the patch is 100% correct, but it does
eliminate the the extra WAL.)

The patch however does not skip the whole visibilitymap_set, it still
does the initial error checks. I wonder if that might play a role ...

Another option might be changes in the binary layout - 5% change is well
within the range that could be attributed to this, but it feels very
hand-wavy and more like an excuse than real analysis.

> Interestingly, my previously proposed patch[1] was a better
> performance. With the patch, we skip all VM-related work on all
> insertions except for when inserting a tuple into a page for the first
> time.
>
> logged matviews: 31.591 sec
> unlogged matviews: 15.317 sec
>

Hmmm, thanks for reminding us that patch. Why did we reject that
approach in favor of the current one?

I think at this point we have these two options:

1) Revert the freeze patches, either completely or just the heap_insert
part, which is what seems to be causing issues. And try again in PG15,
perhaps using a different approach, allow disabling freezing in refresh,
or something like that.

2) Polish and commit the pinning patch from Andres, which does reduce
the slowdown quite a bit. And either call it a day, or continue with the
investigation / analysis regarding the remaining ~5% (but I personally
have no idea what might be the problem ...).

I'd like to keep the improvement, but I find the 5% regression rather
annoying and hard to defend, considering how much we fight for every
little improvement.

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2021-05-24 10:51:34 Re: Skipping logical replication transactions on subscriber side
Previous Message Fabien COELHO 2021-05-24 10:31:29 rand48 replacement