From: | Dilip Kumar <dilipbalaut(at)gmail(dot)com> |
---|---|
To: | "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com> |
Cc: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, "wangw(dot)fnst(at)fujitsu(dot)com" <wangw(dot)fnst(at)fujitsu(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, "shiy(dot)fnst(at)fujitsu(dot)com" <shiy(dot)fnst(at)fujitsu(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Perform streaming logical transactions by background workers and parallel apply |
Date: | 2023-01-06 05:54:25 |
Message-ID: | CAFiTN-sjpOm50q8SfoywuKQAqJAaE9FtjvwHBWHvi+pn39OWQQ@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Fri, Jan 6, 2023 at 9:37 AM houzj(dot)fnst(at)fujitsu(dot)com
<houzj(dot)fnst(at)fujitsu(dot)com> wrote:
>
> On Thursday, January 5, 2023 7:54 PM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
> Thanks, I have started another thread[1]
>
> Attach the parallel apply patch set here again. I didn't change the patch set,
> attach it here just to let the CFbot keep testing it.
I have completed the review and some basic testing and it mostly looks
fine to me. Here is my last set of comments/suggestions.
1.
/*
* Don't start a new parallel worker if user has set skiplsn as it's
* possible that they want to skip the streaming transaction. For
* streaming transactions, we need to serialize the transaction to a file
* so that we can get the last LSN of the transaction to judge whether to
* skip before starting to apply the change.
*/
if (!XLogRecPtrIsInvalid(MySubscription->skiplsn))
return false;
I think this is fine to block parallelism in this case, but it is also
possible to make it less restrictive, basically, only if the first lsn
of the transaction is <= skiplsn, then only it is possible that the
final_lsn might match with skiplsn otherwise that is not possible. And
if we want then we can allow parallelism in that case.
I understand that currently we do not have first_lsn of the
transaction in stream start message but I think that should be easy to
do? Although I am not sure if it is worth it, it's good to make a
note at least.
2.
+ * XXX Additionally, we also stop the worker if the leader apply worker
+ * serialize part of the transaction data due to a send timeout. This is
+ * because the message could be partially written to the queue and there
+ * is no way to clean the queue other than resending the message until it
+ * succeeds. Instead of trying to send the data which anyway would have
+ * been serialized and then letting the parallel apply worker deal with
+ * the spurious message, we stop the worker.
+ */
+ if (winfo->serialize_changes ||
+ list_length(ParallelApplyWorkerPool) >
+ (max_parallel_apply_workers_per_subscription / 2))
IMHO this reason (XXX Additionally, we also stop the worker if the
leader apply worker serialize part of the transaction data due to a
send timeout) for stopping the worker looks a bit hackish to me. It
may be a rare case so I am not talking about the performance but the
reasoning behind stopping is not good. Ideally we should be able to
clean up the message queue and reuse the worker.
3.
+ else if (shmq_res == SHM_MQ_WOULD_BLOCK)
+ {
+ /* Replay the changes from the file, if any. */
+ if (pa_has_spooled_message_pending())
+ {
+ pa_spooled_messages();
+ }
I think we do not need this pa_has_spooled_message_pending() function.
Because this function is just calling pa_get_fileset_state() which is
acquiring mutex and getting filestate then if the filestate is not
FS_EMPTY then we call pa_spooled_messages() that will again call
pa_get_fileset_state() which will again acquire mutex. I think when
the state is FS_SERIALIZE_IN_PROGRESS it will frequently call
pa_get_fileset_state() consecutively 2 times, and I think we can
easily achieve the same behavior with just one call.
4.
+ * leader, or when there there is an error. None of these cases will allow
+ * the code to reach here.
/when there there is an error/when there is an error
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
From | Date | Subject | |
---|---|---|---|
Next Message | vignesh C | 2023-01-06 05:54:47 | Re: Add a new pg_walinspect function to extract FPIs from WAL records |
Previous Message | vignesh C | 2023-01-06 05:50:09 | Re: TAP output format in pg_regress |