Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: "shiy(dot)fnst(at)fujitsu(dot)com" <shiy(dot)fnst(at)fujitsu(dot)com>
Cc: Önder Kalacı <onderkalaci(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Marco Slot <marco(dot)slot(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "wangw(dot)fnst(at)fujitsu(dot)com" <wangw(dot)fnst(at)fujitsu(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher
Date: 2023-03-02 09:15:57
Message-ID: CAA4eK1JxocgW+CKUV=bcomTfyANMo__Og+g9ruTxpc=JXLarVQ@mail.gmail.com
Lists: pgsql-hackers

On Thu, Mar 2, 2023 at 1:37 PM shiy(dot)fnst(at)fujitsu(dot)com
<shiy(dot)fnst(at)fujitsu(dot)com> wrote:
>
> On Wed, Mar 1, 2023 9:22 PM Önder Kalacı <onderkalaci(at)gmail(dot)com> wrote:
> >
> > > # Result
> > The time executing update (the average of 3 runs is taken, the unit is
> > milliseconds):
> >
> > Shi Yu, could it be possible for you to re-run the tests with some more runs, and share the average?
> > I suspect your test results come from a very small sample size, and a few runs are
> > skewing the average slightly.
> >
> > In my tests, I shared the total time, which is probably also fine.
> >
>
> Thanks for your reply, I re-tested (based on
> v25_0001_use_index_on_subs_when_pub_rep_ident_full.patch) and took the average
> of 100 runs. The results are as follows. The unit is milliseconds.
>
> case1
> sequential scan: 1348.57
> index scan: 3785.15
>
> case2
> sequential scan: 1350.26
> index scan: 1754.01
>
> case3
> sequential scan: 1350.13
> index scan: 1340.97
>
> There was still some degradation in the first two cases, and there are some gaps
> between our respective results. Some information about my test setup is as follows.
>
> a. Some parameters specified in postgresql.conf.
> shared_buffers = 8GB
> checkpoint_timeout = 30min
> max_wal_size = 20GB
> min_wal_size = 10GB
> autovacuum = off
>
> b. Executed SQL.
> I executed TRUNCATE and INSERT before each UPDATE. I am not sure if you did the
> same, or just executed 50 consecutive UPDATEs. If the latter, there would be
> lots of old tuples, which might have a bigger impact on the sequential scan. I
> tried that case (50 consecutive UPDATEs) and also saw that the overhead is
> smaller than before.
>
>
> Besides, I looked into the regression of this patch with `gprof`. Some results
> are as follows. I think that with a single buffer lock, a sequential scan can
> read multiple tuples (see heapgettup()), while an index scan can fetch only one
> tuple per lock. So in case 1, which has lots of duplicate values and hence more
> tuples to scan, the index scan takes longer.
>
> - results of `gprof`
> case1:
> master
> %      cumulative    self               self     total
> time     seconds    seconds    calls   ms/call  ms/call  name
> 1.37       0.66       0.01    654312      0.00     0.00  LWLockAttemptLock
> 0.00       0.73       0.00    573358      0.00     0.00  LockBuffer
> 0.00       0.73       0.00     10014      0.00     0.06  heap_getnextslot
>
> patched
> %      cumulative    self                 self     total
> time     seconds    seconds      calls   ms/call  ms/call  name
> 9.70       1.27       0.36   50531459      0.00     0.00  LWLockAttemptLock
> 3.23       2.42       0.12  100259200      0.00     0.00  LockBuffer
> 6.20       1.50       0.23   50015101      0.00     0.00  heapam_index_fetch_tuple
> 4.04       2.02       0.15   50015101      0.00     0.00  index_fetch_heap
> 1.35       3.21       0.05      10119      0.00     0.00  index_getnext_slot
>

In the above profile, the number of calls to index_fetch_heap() and
heapam_index_fetch_tuple() explains the regression you are seeing with
the index scan. Because the UPDATE generates dead tuples in the same
transaction, and those dead tuples won't be removed, we get them from
the index and then need to perform index_fetch_heap() to find out
whether each tuple is dead or not. A sequential scan also has to scan
those dead tuples, but it doesn't need to go back and forth between the
index and the heap. I think we could check case 1 once more with a
larger number of tuples (say 20000, 50000, etc.).
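To make that back-and-forth concrete, here is a rough cost model of the two
scan strategies. This is illustrative Python, not PostgreSQL code, and every
figure in it (tuple count, tuples per page, number of duplicates, number of
UPDATEs) is an assumption for the sketch, not Shi Yu's exact setup. The idea:
a sequential scan pays roughly one buffer lock per heap page, covering all
tuples on the page, while the index scan pays at least one heap fetch (and
lock) per index entry, including entries for dead versions left by earlier
UPDATEs in the same run.

```python
# Back-of-the-envelope model (not PostgreSQL code) of why an index scan
# over many duplicates plus accumulated dead tuples takes more buffer
# locks than a sequential scan.  All constants below are assumed figures.

TUPLES = 10_000          # live tuples in the table (assumed)
TUPLES_PER_PAGE = 100    # heap tuples per 8kB page (assumed)
DUPLICATES = 1_000       # rows matching the key being updated (case-1-like)
UPDATES = 50             # consecutive UPDATEs, each leaving dead versions

def seq_scan_locks(dead: int) -> int:
    """One buffer lock per heap page; a single lock covers every tuple
    on the page, dead or alive (cf. heapgettup())."""
    total_tuples = TUPLES + dead
    return -(-total_tuples // TUPLES_PER_PAGE)  # ceiling division

def index_scan_locks(dead: int) -> int:
    """At least one index_fetch_heap() -- hence one buffer lock -- per
    index entry returned, including entries for dead tuple versions
    that still need a visibility check against the heap."""
    return DUPLICATES + dead

dead = 0
seq_total = idx_total = 0
for _ in range(UPDATES):
    seq_total += seq_scan_locks(dead)
    idx_total += index_scan_locks(dead)
    dead += DUPLICATES   # each UPDATE leaves one dead version per matched row

print(f"sequential scan: ~{seq_total} buffer locks")
print(f"index scan:      ~{idx_total} heap fetches/locks")
```

Under these assumptions the index scan's lock count is dominated by the dead
versions it must re-fetch from the heap, which matches the shape of the
LockBuffer/index_fetch_heap call counts in the patched profile above.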

--
With Regards,
Amit Kapila.
