Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: "shiy(dot)fnst(at)fujitsu(dot)com" <shiy(dot)fnst(at)fujitsu(dot)com>, Andres Freund <andres(at)anarazel(dot)de>
Cc: Önder Kalacı <onderkalaci(at)gmail(dot)com>, Marco Slot <marco(dot)slot(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "wangw(dot)fnst(at)fujitsu(dot)com" <wangw(dot)fnst(at)fujitsu(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher
Date: 2023-03-06 11:32:38
Message-ID: CAA4eK1+B+dcieiyafG2yXo0Y-a1Sj6BSSbf9paUR300jdowajg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Mar 2, 2023 at 2:45 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Thu, Mar 2, 2023 at 1:37 PM shiy(dot)fnst(at)fujitsu(dot)com
> <shiy(dot)fnst(at)fujitsu(dot)com> wrote:
> > - results of `gprof`
> > case1:
> > master
> > % cumulative self self total
> > time seconds seconds calls ms/call ms/call name
> > 1.37 0.66 0.01 654312 0.00 0.00 LWLockAttemptLock
> > 0.00 0.73 0.00 573358 0.00 0.00 LockBuffer
> > 0.00 0.73 0.00 10014 0.00 0.06 heap_getnextslot
> >
> > patched
> > % cumulative self self total
> > time seconds seconds calls ms/call ms/call name
> > 9.70 1.27 0.36 50531459 0.00 0.00 LWLockAttemptLock
> > 3.23 2.42 0.12 100259200 0.00 0.00 LockBuffer
> > 6.20 1.50 0.23 50015101 0.00 0.00 heapam_index_fetch_tuple
> > 4.04 2.02 0.15 50015101 0.00 0.00 index_fetch_heap
> > 1.35 3.21 0.05 10119 0.00 0.00 index_getnext_slot
> >
>
> In the above profile number of calls to index_fetch_heap(),
> heapam_index_fetch_tuple() explains the reason for the regression you
> are seeing with the index scan. Because the update will generate dead
> tuples in the same transaction and those dead tuples won't be removed,
> we get those from the index and then need to perform
> index_fetch_heap() to find out whether the tuple is dead or not. Now,
> for sequence scan also we need to scan those dead tuples but there we
> don't need to do back-and-forth between index and heap. I think we can
> once check with more number of tuples (say with 20000, 50000, etc.)
> for case-1.
>

Andres, do you have any thoughts on this? We seem to have figured out
the cause of regression in the case Shi-San has reported and others
also agree with it. We can't think of doing anything better than what
the patch currently is doing, so thought of going with an option to
allow users to disable index scans. The current understanding is that
the patch will be a win in much more cases than the cases where one
can see regression but still having a knob could be useful in those
few cases.

--
With Regards,
Amit Kapila.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Hayato Kuroda (Fujitsu) 2023-03-06 11:56:42 RE: [Proposal] Add foreign-server health checks infrastructure
Previous Message Önder Kalacı 2023-03-06 11:18:39 Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher