RE: Perform streaming logical transactions by background workers and parallel apply

From: "kuroda(dot)hayato(at)fujitsu(dot)com" <kuroda(dot)hayato(at)fujitsu(dot)com>
To: "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>
Cc: "wangw(dot)fnst(at)fujitsu(dot)com" <wangw(dot)fnst(at)fujitsu(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, "shiy(dot)fnst(at)fujitsu(dot)com" <shiy(dot)fnst(at)fujitsu(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Subject: RE: Perform streaming logical transactions by background workers and parallel apply
Date: 2022-09-13 12:02:26
Message-ID: TYAPR01MB586674DEBF3947453914E305F5479@TYAPR01MB5866.jpnprd01.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Dear Hou-san,

> I will dig your patch more, but I send partially to keep the activity of the thread.

More minor comments about v28.

===
About 0002

For 015_stream.pl

14. check_parallel_log

```
+# Check the log that the streamed transaction was completed successfully
+# reported by parallel apply worker.
+sub check_parallel_log
+{
+ my ($node_subscriber, $offset, $is_parallel)= @_;
+ my $parallel_message = 'finished processing the transaction finish command';
+
+ if ($is_parallel)
+ {
+ $node_subscriber->wait_for_log(qr/$parallel_message/, $offset);
+ }
+}
```

I think check_parallel_log() should be called only when streaming = 'parallel' and if-statement is not needed

===
For 016_stream_subxact.pl

15. test_streaming

```
+ INSERT INTO test_tab SELECT i, md5(i::text) FROM generate_series( 3, 500) s(i);
```

" 3" should be "3".

===
About 0003

For applyparallelworker.c

16. parallel_apply_relation_check()

```
+ if (rel->parallel_apply_safe == PARALLEL_APPLY_SAFETY_UNKNOWN)
+ logicalrep_rel_mark_parallel_apply(rel);
```

I was not clear when logicalrep_rel_mark_parallel_apply() is called here.
IIUC parallel_apply_relation_check() is called when parallel apply worker handles changes,
but before that relation is opened via logicalrep_rel_open() and parallel_apply_safe is set here.
If it guards some protocol violation, we may use Assert().

===
For create_subscription.sgml

17.
The restriction about foreign key does not seem to be documented.

===
About 0004

For 015_stream.pl

18. check_parallel_log

I heard that the removal has been reverted, but in the patch
check_parallel_log() is removed again... :-(

===
About throughout

I checked the test coverage via `make coverage`. About appluparallelworker.c and worker.c, both function coverage is 100%, and
line coverages are 86.2 % and 94.5 %. Generally it's good.
But I read the report and following parts seems not tested.

In parallel_apply_start_worker():

```
if (tmp_winfo->error_mq_handle == NULL)
{
/*
* Release the worker information and try next one if the parallel
* apply worker exited cleanly.
*/
ParallelApplyWorkersList = foreach_delete_current(ParallelApplyWorkersList, lc);
shm_mq_detach(tmp_winfo->mq_handle);
dsm_detach(tmp_winfo->dsm_seg);
pfree(tmp_winfo);
}
```

In HandleParallelApplyMessage():

```
case 'X': /* Terminate, indicating clean exit */
{
shm_mq_detach(winfo->error_mq_handle);
winfo->error_mq_handle = NULL;
break;
}
```

Does it mean that we do not test the termination of parallel apply worker? If so I think it should be tested.

Best Regards,
Hayato Kuroda
FUJITSU LIMITED

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message kuroda.hayato@fujitsu.com 2022-09-13 12:05:32 RE: Perform streaming logical transactions by background workers and parallel apply
Previous Message Andrey Lepikhov 2022-09-13 11:40:37 Re: [POC] Allow flattening of subquery with a link to upper query