RE: Perform streaming logical transactions by background workers and parallel apply

From: "wangw(dot)fnst(at)fujitsu(dot)com" <wangw(dot)fnst(at)fujitsu(dot)com>
To: Peter Smith <smithpb2250(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, "shiy(dot)fnst(at)fujitsu(dot)com" <shiy(dot)fnst(at)fujitsu(dot)com>, "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: RE: Perform streaming logical transactions by background workers and parallel apply
Date: 2022-07-07 03:44:04
Message-ID: OS3PR01MB62755C6C9A75EB09F7218B589E839@OS3PR01MB6275.jpnprd01.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Jul 1, 2022 at 14:43 PM Peter Smith <smithpb2250(at)gmail(dot)com> wrote:
> Below are some review comments for patches v14-0001, and v14-0002:

Thanks for your comments.

> 1.10 .../replication/logical/applybgworker.c - apply_bgworker_find
>
> + /*
> + * Find entry for requested transaction.
> + */
> + entry = hash_search(ApplyWorkersHash, &xid, HASH_FIND, &found);
> + if (found)
> + {
> + entry->wstate->pstate->status = APPLY_BGWORKER_BUSY;
> + return entry->wstate;
> + }
> + else
> + return NULL;
> +}
>
> IMO it is an unexpected side-effect for the function called "find" to
> be also modifying the thing that it found. IMO this setting BUSY
> should either be done by the caller, or else this function name should
> be renamed to make it obvious that this is doing more than just
> "finding" something.

Since we set the state to BUSY in the function apply_bgworker_start and the
state is not modified (set to FINISHED) until the transaction completes, I
think we do not need to set this state to BUSY again in the function
apply_bgworker_find during applying the transaction.
So I removed it and invoked function Assert.
I also invoked function Assert in function apply_bgworker_start.

> 1.16. src/backend/replication/logical/launcher.c - logicalrep_worker_launch
>
> + bool is_subworker = (subworker_dsm != DSM_HANDLE_INVALID);
> +
> + /* We don't support table sync in subworker */
> + Assert(!(is_subworker && OidIsValid(relid)));
>
> I'm not sure the comment is good. It sounds like it is something that
> might be possible but is just current "not supported". In fact, I
> thought this is really just a sanity check because the combination of
> those params is just plain wrong isn't it? Maybe a better comment is
> just:
> /* Sanity check */

Improved this comment as following:
```
/* Sanity check : we don't support table sync in subworker. */
```

> 1.22 src/backend/replication/logical/worker.c - skip_xact_finish_lsn
>
> /*
> * We enable skipping all data modification changes (INSERT, UPDATE, etc.) for
> * the subscription if the remote transaction's finish LSN matches
> the subskiplsn.
> * Once we start skipping changes, we don't stop it until we skip all
> changes of
> * the transaction even if pg_subscription is updated and
> MySubscription->skiplsn
> - * gets changed or reset during that. Also, in streaming transaction cases, we
> - * don't skip receiving and spooling the changes since we decide whether or not
> + * gets changed or reset during that. Also, in streaming transaction
> cases (streaming = on),
> + * we don't skip receiving and spooling the changes since we decide
> whether or not
> * to skip applying the changes when starting to apply changes. The
> subskiplsn is
> * cleared after successfully skipping the transaction or applying non-empty
> * transaction. The latter prevents the mistakenly specified subskiplsn from
> - * being left.
> + * being left. Note that we cannot skip the streaming transaction in parallel
> + * mode, because we cannot get the finish LSN before applying the changes.
> */
>
> "in parallel mode, because" -> "in 'streaming = parallel' mode, because"

Not sure about this.

> 1.28 src/backend/replication/logical/worker.c - apply_handle_stream_prepare
>
> + if (wstate)
> + {
> + apply_bgworker_send_data(wstate, s->len, s->data);
> +
> + /*
> + * Wait for apply background worker to finish. This is required to
> + * maintain commit order which avoids failures due to transaction
> + * dependencies and deadlocks.
> + */
> + apply_bgworker_wait_for(wstate, APPLY_BGWORKER_FINISHED);
> + apply_bgworker_free(wstate);
>
> I think maybe the comment can be changed slightly, and then it can
> move up one line to the top of this code block (above the 3
> statements). I think it will become more readable.
>
> SUGGESTION
> After sending the data to the apply background worker, wait for that
> worker to finish. This is necessary to maintain commit order which
> avoids failures due to transaction dependencies and deadlocks.

I think it might be better to add a new comment before invoking function
apply_bgworker_send_data. Improve the comments as you suggested.
I improved this point in function apply_handle_stream_prepare,
apply_handle_stream_abort and apply_handle_stream_commit. What do you think
about changing it like this:
```
/* Send STREAM PREPARE message to the apply background worker. */
apply_bgworker_send_data(wstate, s->len, s->data);

/*
* After sending the data to the apply background worker, wait for
* that worker to finish. This is necessary to maintain commit
* order which avoids failures due to transaction dependencies and
* deadlocks.
*/
apply_bgworker_wait_for(wstate, APPLY_BGWORKER_FINISHED);
```

> 1.34 src/backend/replication/logical/worker.c - apply_dispatch
>
> -
> /*
> * Logical replication protocol message dispatcher.
> */
> -static void
> +void
> apply_dispatch(StringInfo s)
>
> Maybe removing the whitespace is not really needed as part of this patch?

Yes, this change is not necessary for this patch.
But since this change does not involve the modification of comments and actual
code, it just adjusts the blank line between the function modified by this
patch and the previous function, so I think it is okay in this patch.

> 2.1 Commit message
>
> Change all TAP tests using the SUBSCRIPTION "streaming" option, so they
> now test both 'on' and 'parallel' values.
>
> "option" -> "parameter"

Sorry I missed this point when I was merging the patches. I merged this change
in v15.

Attach the new patches.
Also improved the patches as suggested in [1], [2] and [3].

[1] - https://www.postgresql.org/message-id/CAA4eK1KgovaRcbSuzzWki1HVso6oLAdZ2aPr1nWxX1x%3DVDBQJg%40mail.gmail.com
[2] - https://www.postgresql.org/message-id/CAHut%2BPtRNAOwFtBp_TnDWdC7UpcTxPJzQnrm%3DNytN7cVBt5zRQ%40mail.gmail.com
[3] - https://www.postgresql.org/message-id/CAHut%2BPvrw%2BtgCEYGxv%2BnKrqg-zbJdYEXee6o4irPAsYoXcuUcw%40mail.gmail.com

Regards,
Wang wei

Attachment Content-Type Size
v15-0001-Perform-streaming-logical-transactions-by-backgr.patch application/octet-stream 105.6 KB
v15-0002-Test-streaming-parallel-option-in-tap-test.patch application/octet-stream 69.1 KB
v15-0003-Add-some-checks-before-using-apply-background-wo.patch application/octet-stream 36.6 KB
v15-0004-Retry-to-apply-streaming-xact-only-in-apply-work.patch application/octet-stream 26.6 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message wangw.fnst@fujitsu.com 2022-07-07 03:45:31 RE: Perform streaming logical transactions by background workers and parallel apply
Previous Message shiy.fnst@fujitsu.com 2022-07-07 03:31:57 RE: Perform streaming logical transactions by background workers and parallel apply