Re: Perform streaming logical transactions by background workers and parallel apply

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>, "wangw(dot)fnst(at)fujitsu(dot)com" <wangw(dot)fnst(at)fujitsu(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, "shiy(dot)fnst(at)fujitsu(dot)com" <shiy(dot)fnst(at)fujitsu(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Perform streaming logical transactions by background workers and parallel apply
Date: 2022-12-15 05:21:11
Message-ID: CAD21AoD-tzzWCs5CVuxiHnSDX8RFXBfP_C7m_SkVXhyhLHRR=w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Dec 14, 2022 at 3:48 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Wed, Dec 14, 2022 at 9:50 AM houzj(dot)fnst(at)fujitsu(dot)com
> <houzj(dot)fnst(at)fujitsu(dot)com> wrote:
> >
> > On Tuesday, December 13, 2022 11:25 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> > >
> > > Here are comments on v59 0001, 0002 patches:
> >
> > Thanks for the comments!
> >
> > > +void
> > > +pa_increment_stream_block(ParallelApplyWorkerShared *wshared) {
> > > + while (1)
> > > + {
> > > + SpinLockAcquire(&wshared->mutex);
> > > +
> > > + /*
> > > + * Don't try to increment the count if the parallel
> > > apply worker is
> > > + * taking the stream lock. Otherwise, there would be
> > > a race condition
> > > + * that the parallel apply worker checks there is no
> > > pending streaming
> > > + * block and before it actually starts waiting on a
> > > lock, the leader
> > > + * sends another streaming block and take the stream
> > > lock again. In
> > > + * this case, the parallel apply worker will start
> > > waiting for the next
> > > + * streaming block whereas there is actually a
> > > pending streaming block
> > > + * available.
> > > + */
> > > + if (!wshared->pa_wait_for_stream)
> > > + {
> > > + wshared->pending_stream_count++;
> > > + SpinLockRelease(&wshared->mutex);
> > > + break;
> > > + }
> > > +
> > > + SpinLockRelease(&wshared->mutex);
> > > + }
> > > +}
> > >
> > > I think we should add an assertion to check if we don't hold the stream lock.
> > >
> > > I think that waiting for pa_wait_for_stream to be false in a busy loop is not a
> > > good idea. It's not interruptible and there is not guarantee that we can break
> > > from this loop in a short time. For instance, if PA executes
> > > pa_decr_and_wait_stream_block() a bit earlier than LA executes
> > > pa_increment_stream_block(), LA has to wait for PA to acquire and release the
> > > stream lock in a busy loop. It should not be long in normal cases but the
> > > duration LA needs to wait for PA depends on PA, which could be long. Also
> > > what if PA raises an error in
> > > pa_lock_stream() due to some reasons? I think LA won't be able to detect the
> > > failure.
> > >
> > > I think we should at least make it interruptible and maybe need to add some
> > > sleep. Or perhaps we can use the condition variable for this case.
> >
>
> Or we can leave this while (true) logic altogether for the first
> version and have a comment to explain this race. Anyway, after
> restarting, it will probably be solved. We can always change this part
> of the code later if this really turns out to be problematic.
>

+1. Thank you Hou-san for adding this comment in the latest version (v61) patch!

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message David Rowley 2022-12-15 05:26:39 The drop-index-concurrently-1 isolation test no longer tests what it was meant to
Previous Message Amul Sul 2022-12-15 05:19:45 Re: Error-safe user functions