Re: Perform streaming logical transactions by background workers and parallel apply

From: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, "wangw(dot)fnst(at)fujitsu(dot)com" <wangw(dot)fnst(at)fujitsu(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, "shiy(dot)fnst(at)fujitsu(dot)com" <shiy(dot)fnst(at)fujitsu(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Perform streaming logical transactions by background workers and parallel apply
Date: 2023-01-06 07:29:22
Message-ID: CAFiTN-uhnJkrUsQByBHMK268T-GEx_D8DQ6b2T+aW6RiU75pbQ@mail.gmail.com
Lists: pgsql-hackers

On Fri, Jan 6, 2023 at 12:05 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> Yeah, I also don't think sending extra eight bytes with stream_start
> message is worth it. But it is fine to mention the same in the
> comments.

Right.

> > 2.
> >
> > + * XXX Additionally, we also stop the worker if the leader apply worker
> > + * serialize part of the transaction data due to a send timeout. This is
> > + * because the message could be partially written to the queue and there
> > + * is no way to clean the queue other than resending the message until it
> > + * succeeds. Instead of trying to send the data which anyway would have
> > + * been serialized and then letting the parallel apply worker deal with
> > + * the spurious message, we stop the worker.
> > + */
> > + if (winfo->serialize_changes ||
> > + list_length(ParallelApplyWorkerPool) >
> > + (max_parallel_apply_workers_per_subscription / 2))
> >
> > IMHO this reason (XXX Additionally, we also stop the worker if the
> > leader apply worker serialize part of the transaction data due to a
> > send timeout) for stopping the worker looks a bit hackish to me. It
> > may be a rare case so I am not talking about the performance but the
> > reasoning behind stopping is not good. Ideally we should be able to
> > clean up the message queue and reuse the worker.
> >
>
> TBH, I don't know what is the better way to deal with this with the
> current infrastructure. I thought we can do this as a separate
> enhancement in the future.

Okay.

> > 3.
> > + else if (shmq_res == SHM_MQ_WOULD_BLOCK)
> > + {
> > + /* Replay the changes from the file, if any. */
> > + if (pa_has_spooled_message_pending())
> > + {
> > + pa_spooled_messages();
> > + }
> >
> > I think we do not need this pa_has_spooled_message_pending() function.
> > Because this function is just calling pa_get_fileset_state() which is
> > acquiring mutex and getting filestate then if the filestate is not
> > FS_EMPTY then we call pa_spooled_messages() that will again call
> > pa_get_fileset_state() which will again acquire mutex. I think when
> > the state is FS_SERIALIZE_IN_PROGRESS it will frequently call
> > pa_get_fileset_state() consecutively 2 times, and I think we can
> > easily achieve the same behavior with just one call.
> >
>
> This is just to keep the code easy to follow. As this would be a rare
> case, so thought of giving preference to code clarity.

I think the code will be simpler with just one function, no? Instead of
calling pa_has_spooled_message_pending() in the if condition, what if we
directly call pa_spooled_messages()? That function fetches the file state
anyway, so if the state is FS_EMPTY it can simply return false, and when it
returns false we can execute the code that is currently in the else branch.
We might need to change the name of the function, though.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
