RE: Perform streaming logical transactions by background workers and parallel apply

From: "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>
To: "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>
Cc: "wangw(dot)fnst(at)fujitsu(dot)com" <wangw(dot)fnst(at)fujitsu(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, "shiy(dot)fnst(at)fujitsu(dot)com" <shiy(dot)fnst(at)fujitsu(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Subject: RE: Perform streaming logical transactions by background workers and parallel apply
Date: 2022-11-04 07:45:18
Message-ID: TYAPR01MB586607E3786DC241054DA7F2F53B9@TYAPR01MB5866.jpnprd01.prod.outlook.com
Lists: pgsql-hackers

Dear Hou,

Thank you for updating the patch!
While testing it, I found that the leader apply worker crashes in the following case.
I will dig into the failure further, but I am reporting it here for the record.

1. Change the macros to force changes to be written to a temporary file.

```
-#define CHANGES_THRESHOLD 1000
-#define SHM_SEND_TIMEOUT_MS 10000
+#define CHANGES_THRESHOLD 10
+#define SHM_SEND_TIMEOUT_MS 100
```

2. Set logical_decoding_work_mem to 64kB on the publisher.
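
For example, assuming a superuser session on the publisher, the setting could be applied as sketched below. This is only an illustration and is not taken from the attached script, which may set the parameter differently (e.g. in postgresql.conf).

```
-- Persist the setting in postgresql.auto.conf (requires superuser).
ALTER SYSTEM SET logical_decoding_work_mem = '64kB';
-- Reload the configuration; this parameter does not need a restart.
SELECT pg_reload_conf();
-- Check the value seen by a new session.
SHOW logical_decoding_work_mem;
```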

3. Insert a large amount of data on the publisher.

```
publisher=# \d tbl
                Table "public.tbl"
 Column |  Type   | Collation | Nullable | Default
--------+---------+-----------+----------+---------
 c      | integer |           |          |
Publications:
    "pub"

publisher=# BEGIN;
BEGIN
publisher=*# INSERT INTO tbl SELECT i FROM generate_series(1, 5000000) s(i);
INSERT 0 5000000
publisher=*# COMMIT;
```

-> The leader apply worker (LA) crashes on the subscriber! The backtrace follows.

```
(gdb) bt
#0 0x00007f2663ae4387 in raise () from /lib64/libc.so.6
#1 0x00007f2663ae5a78 in abort () from /lib64/libc.so.6
#2 0x0000000000ad0a95 in ExceptionalCondition (conditionName=0xcabdd0 "mqh->mqh_partial_bytes <= nbytes",
fileName=0xcabc30 "../src/backend/storage/ipc/shm_mq.c", lineNumber=420) at ../src/backend/utils/error/assert.c:66
#3 0x00000000008eaeb7 in shm_mq_sendv (mqh=0x271ebd8, iov=0x7ffc664a2690, iovcnt=1, nowait=false, force_flush=true)
at ../src/backend/storage/ipc/shm_mq.c:420
#4 0x00000000008eac5a in shm_mq_send (mqh=0x271ebd8, nbytes=1, data=0x271f3c0, nowait=false, force_flush=true)
at ../src/backend/storage/ipc/shm_mq.c:338
#5 0x0000000000880e18 in parallel_apply_free_worker (winfo=0x271f270, xid=735, stop_worker=true)
at ../src/backend/replication/logical/applyparallelworker.c:368
#6 0x00000000008a3638 in apply_handle_stream_commit (s=0x7ffc664a2790) at ../src/backend/replication/logical/worker.c:2081
#7 0x00000000008a54da in apply_dispatch (s=0x7ffc664a2790) at ../src/backend/replication/logical/worker.c:3195
#8 0x00000000008a5a76 in LogicalRepApplyLoop (last_received=378674872) at ../src/backend/replication/logical/worker.c:3431
#9 0x00000000008a72ac in start_apply (origin_startpos=0) at ../src/backend/replication/logical/worker.c:4245
#10 0x00000000008a7d77 in ApplyWorkerMain (main_arg=0) at ../src/backend/replication/logical/worker.c:4555
#11 0x000000000084983c in StartBackgroundWorker () at ../src/backend/postmaster/bgworker.c:861
#12 0x0000000000854192 in do_start_bgworker (rw=0x26c0d20) at ../src/backend/postmaster/postmaster.c:5801
#13 0x000000000085457c in maybe_start_bgworkers () at ../src/backend/postmaster/postmaster.c:6025
#14 0x000000000085350b in sigusr1_handler (postgres_signal_arg=10) at ../src/backend/postmaster/postmaster.c:5182
#15 <signal handler called>
#16 0x00007f2663ba3b23 in __select_nocancel () from /lib64/libc.so.6
#17 0x000000000084edbc in ServerLoop () at ../src/backend/postmaster/postmaster.c:1768
#18 0x000000000084e737 in PostmasterMain (argc=3, argv=0x2690f60) at ../src/backend/postmaster/postmaster.c:1476
#19 0x000000000074adfb in main (argc=3, argv=0x2690f60) at ../src/backend/main/main.c:197
```

Please see the attached script, which reproduces the failure in my environment.

Best Regards,
Hayato Kuroda
FUJITSU LIMITED

Attachment Content-Type Size
repro.sh application/octet-stream 1.3 KB
