RE: Perform streaming logical transactions by background workers and parallel apply

From: "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, "wangw(dot)fnst(at)fujitsu(dot)com" <wangw(dot)fnst(at)fujitsu(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, "shiy(dot)fnst(at)fujitsu(dot)com" <shiy(dot)fnst(at)fujitsu(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: RE: Perform streaming logical transactions by background workers and parallel apply
Date: 2022-12-07 02:58:19
Message-ID: OS0PR01MB57166A377638E57823219E32941A9@OS0PR01MB5716.jpnprd01.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tuesday, December 6, 2022 3:50 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Mon, Dec 5, 2022 at 9:59 AM houzj(dot)fnst(at)fujitsu(dot)com
> <houzj(dot)fnst(at)fujitsu(dot)com> wrote:
> >
> > Attach a new version patch set which fixed a testcase failure on CFbot.
> >
>
> Few comments:
> ============
> 1.
> + /*
> + * Break the loop if the parallel apply worker has finished applying
> + * the transaction. The parallel apply worker should have closed the
> + * file before committing.
> + */
> + if (am_parallel_apply_worker() &&
> + MyParallelShared->xact_state == PARALLEL_TRANS_FINISHED)
> + goto done;
>
> This looks hackish to me because ideally, this API should exit after reading and
> applying all the messages in the spool file. This check is primarily based on the
> knowledge that once we reach some state, the file won't have more data. I
> think it would be better to explicitly ensure the same.

I added a function to ensure that there is no message left after committing
the transaction.

> 2.
> + /*
> + * No need to output the DEBUG message here in the parallel apply
> + * worker as similar messages will be output when handling STREAM_STOP
> + * message.
> + */
> + if (!am_parallel_apply_worker() && nchanges % 1000 == 0)
> elog(DEBUG1, "replayed %d changes from file \"%s\"",
> nchanges, path);
> }
>
> I think this check appeared a bit ugly to me. I think it is okay to get a similar
> DEBUG message at another place (on stream_stop) because
> (a) this is logged every 1000 messages whereas stream_stop can be after many
> more messages, so there doesn't appear to be a direct correlation; (b) due to
> this, we can identify whether it is due to spooled messages or due to direct
> apply; ideally we can use another DEBUG message to differentiate but this
> doesn't appear bad to me.

OK, I removed this check.

> 3. The function names for serialize_stream_start(), serialize_stream_stop(), and
> serialize_stream_abort() don't seem to match the functionality they provide
> because none of these write/serialize changes to the file. Can we rename
> these? Some possible options could be stream_start_internal or
> stream_start_guts.

Renamed to stream_start_internal().

Attach the new version patch set which addressed above comments.
I also attach a new patch to force stream change(provided by Shi-san) and
another one that introduce a GUC stream_serialize_threshold (provided by
Kuroda-san and Shi-san) which can help testing the patch set.

Besides, I fixed a bug where there could still be messages left in memory
queue and the PA has started to apply spooled message.

Best regards,
Hou zj

Attachment Content-Type Size
v57-0007-Add-a-main_worker_pid-to-pg_stat_subscription.patch application/octet-stream 8.7 KB
v57-0002-Serialize-partial-changes-to-a-file-when-the-att.patch application/octet-stream 42.4 KB
v57-0001-Perform-streaming-logical-transactions-by-parall.patch application/octet-stream 194.9 KB
v57-0003-Test-streaming-parallel-option-in-tap-test.patch application/octet-stream 80.1 KB
v57-0004-Add-GUC-stream_serialize_threshold-and-test-seri.patch application/octet-stream 9.5 KB
v57-0005-Allow-streaming-every-change-without-waiting-til.patch application/octet-stream 5.3 KB
v57-0006-Retry-to-apply-streaming-xact-only-in-apply-work.patch application/octet-stream 22.7 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Kyotaro Horiguchi 2022-12-07 02:59:43 Re: Time delayed LR (WAS Re: logical replication restrictions)
Previous Message David G. Johnston 2022-12-07 02:57:30 Re: [DOCS] Stats views and functions not in order?