| From: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> |
|---|---|
| To: | Dilip Kumar <dilipbalaut(at)gmail(dot)com> |
| Cc: | "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
| Subject: | Re: Parallel Apply |
| Date: | 2025-11-24 11:36:53 |
| Message-ID: | CAA4eK1KbSOcU2FER=F_nd0ghSeHdGeT=4U4n=dJTRPyCM7ezBA@mail.gmail.com |
| Lists: | pgsql-hackers |
On Mon, Nov 24, 2025 at 9:56 AM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
>
> On Tue, Sep 16, 2025 at 3:03 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> >
> > On Sat, Sep 6, 2025 at 10:33 AM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
>
> > > I suspect this might not be the most performant default strategy and
> > > could frequently cause a performance dip. In general, we use
> > > parallel apply workers on the assumption that applying changes is
> > > much costlier than reading and sending messages to workers.
> > >
> > > The current strategy involves the leader picking one transaction for
> > > itself after distributing transactions to all apply workers, on the
> > > assumption that each apply task will take some time to complete. When
> > > the leader takes on an apply task, it becomes a bottleneck that limits
> > > full parallelism, because it needs to finish applying previous
> > > messages before accepting any new ones. Consequently, even as workers
> > > gradually become free, they won't receive new tasks because the
> > > leader is busy applying its own transaction.
> > >
> > > This type of strategy might be suitable in scenarios where users
> > > cannot supply more workers due to resource limitations. However, on
> > > high-end machines, it is more efficient to let the leader act solely
> > > as a message transmitter and allow the apply workers to handle all
> > > apply tasks. This could be a configurable parameter, determining
> > > whether the leader also participates in applying changes. I believe
> > > this should not be the default strategy; in fact, the default should
> > > be for the leader to act purely as a transmitter.
> > >
> >
> > I see your point, but consider a scenario where we have two pa workers.
> > pa-1 is waiting for some backend on a unique-key insertion, and pa-2 is
> > waiting for pa-1 to complete its transaction, as pa-2 has to perform
> > some change that depends on pa-1's transaction. So, for a third
> > transaction, the leader can either simply wait for it to be distributed
> > or just apply it itself and move on to the next change. If we follow
> > the former, it is quite possible that the sender fills the network
> > queue and simply times out.
>
> Sorry, I took a while to come back to this. I understand your point and
> agree that it's a valid concern. However, I question whether limiting
> this to a single choice is the optimal solution. The core issue
> involves two distinct roles: work distribution and applying changes.
> Work distribution is exclusively handled by the leader, while any
> worker can apply the changes. This is essentially a single-producer,
> multiple-consumer problem.
>
> While it might seem efficient for the producer (leader) to assist
> consumers (workers) when there's a limited number of consumers, I
> believe this isn't the best design. In such scenarios, it's generally
> better to allow the producer to focus solely on its primary task,
> unless there's a severe shortage of processing power.
>
> If computing resources are constrained, allowing the producer to join
> the consumers in applying changes is acceptable. However, if sufficient
> processing power is available, the producer should ideally be left to
> its own duties. The question then becomes: how do we make this
> decision?
>
> My suggestion is to make this a configurable parameter. Users could
> then decide whether the leader participates in applying changes.
>
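Just to make the knob you are proposing concrete, the decision could be
modelled roughly as below. This is only a sketch with hypothetical names;
nothing here corresponds to actual PostgreSQL code:

/*
 * A rough, self-contained model of such a knob; all names here are
 * hypothetical and do not correspond to actual PostgreSQL code.
 */
#include <stdbool.h>

typedef enum LeaderApplyMode
{
	LEADER_TRANSMIT_ONLY,		/* leader only distributes messages */
	LEADER_MAY_APPLY			/* leader applies when no worker is free */
} LeaderApplyMode;

/* Hypothetical configuration knob, e.g. backed by a GUC. */
static LeaderApplyMode leader_apply_mode = LEADER_TRANSMIT_ONLY;

#define NO_WORKER_FREE (-1)

/*
 * Pick a free worker for the next transaction.  Returns the worker's
 * index, or NO_WORKER_FREE when none is available; in the latter case
 * the leader either waits (transmit-only mode) or, when the knob
 * allows it, applies the transaction itself.
 */
static int
choose_applier(const bool *worker_busy, int nworkers, bool *leader_applies)
{
	int			i;

	*leader_applies = false;

	for (i = 0; i < nworkers; i++)
	{
		if (!worker_busy[i])
			return i;
	}

	if (leader_apply_mode == LEADER_MAY_APPLY)
		*leader_applies = true;

	return NO_WORKER_FREE;
}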
We could do this, but another possibility is that the leader distributes
up to some threshold of pending transactions (say, 5 or 10) to each of
the workers, and only if no worker is still available does it perform
the task by itself. I think this will avoid the system performing poorly
when the existing workers are waiting on each other or on a backend to
finish the current transaction. Having said that, I think this can be
done as a separate optimization patch as well.
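In code, that policy would look roughly like the following; again, this
is only a model with hypothetical names, where a simple counter stands
in for the real per-worker queue of unapplied transactions:

/*
 * Rough model of the threshold idea; names are hypothetical and not
 * taken from PostgreSQL code.
 */
#include <stdbool.h>

#define MAX_PENDING_PER_WORKER 5	/* the "say 5 or 10" threshold */

typedef struct ApplyWorkerSlot
{
	int			pending;		/* transactions queued but not yet applied */
} ApplyWorkerSlot;

/*
 * Try to hand the next transaction to the least-loaded worker whose
 * backlog is below the threshold.  Returns true on success; returns
 * false when every worker already has a full backlog (for instance,
 * because they are all waiting on each other or on a backend), in
 * which case the leader applies the transaction itself.
 */
static bool
assign_transaction(ApplyWorkerSlot *slots, int nworkers)
{
	int			best = -1;
	int			i;

	for (i = 0; i < nworkers; i++)
	{
		if (slots[i].pending < MAX_PENDING_PER_WORKER &&
			(best < 0 || slots[i].pending < slots[best].pending))
			best = i;
	}

	if (best < 0)
		return false;			/* leader becomes the applier */

	slots[best].pending++;		/* enqueue for that worker */
	return true;
}

This way, the leader only falls back to applying a transaction when the
whole pool is saturated, so in the common case it stays free to keep
distributing work.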
--
With Regards,
Amit Kapila.