From: | "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com> |
---|---|
To: | "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com> |
Cc: | "wangw(dot)fnst(at)fujitsu(dot)com" <wangw(dot)fnst(at)fujitsu(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, "shiy(dot)fnst(at)fujitsu(dot)com" <shiy(dot)fnst(at)fujitsu(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> |
Subject: | RE: Perform streaming logical transactions by background workers and parallel apply |
Date: | 2022-11-04 07:45:18 |
Message-ID: | TYAPR01MB586607E3786DC241054DA7F2F53B9@TYAPR01MB5866.jpnprd01.prod.outlook.com |
Lists: | pgsql-hackers |
Dear Hou,
Thank you for updating the patch!
While testing it, I found that the leader apply worker crashes in the following case.
I will dig into the failure further, but I am reporting it here for the record.
1. Change the following macros to force writing to a temporary file.
```
-#define CHANGES_THRESHOLD 1000
-#define SHM_SEND_TIMEOUT_MS 10000
+#define CHANGES_THRESHOLD 10
+#define SHM_SEND_TIMEOUT_MS 100
```
2. Set logical_decoding_work_mem to 64kB on the publisher
3. Insert a large amount of data on the publisher
```
publisher=# \d tbl
                Table "public.tbl"
 Column |  Type   | Collation | Nullable | Default
--------+---------+-----------+----------+---------
 c      | integer |           |          |
Publications:
    "pub"
publisher=# BEGIN;
BEGIN
publisher=*# INSERT INTO tbl SELECT i FROM generate_series(1, 5000000) s(i);
INSERT 0 5000000
publisher=*# COMMIT;
```
-> The leader apply worker (LA) crashes on the subscriber! The backtrace follows.
```
(gdb) bt
#0 0x00007f2663ae4387 in raise () from /lib64/libc.so.6
#1 0x00007f2663ae5a78 in abort () from /lib64/libc.so.6
#2 0x0000000000ad0a95 in ExceptionalCondition (conditionName=0xcabdd0 "mqh->mqh_partial_bytes <= nbytes",
fileName=0xcabc30 "../src/backend/storage/ipc/shm_mq.c", lineNumber=420) at ../src/backend/utils/error/assert.c:66
#3 0x00000000008eaeb7 in shm_mq_sendv (mqh=0x271ebd8, iov=0x7ffc664a2690, iovcnt=1, nowait=false, force_flush=true)
at ../src/backend/storage/ipc/shm_mq.c:420
#4 0x00000000008eac5a in shm_mq_send (mqh=0x271ebd8, nbytes=1, data=0x271f3c0, nowait=false, force_flush=true)
at ../src/backend/storage/ipc/shm_mq.c:338
#5 0x0000000000880e18 in parallel_apply_free_worker (winfo=0x271f270, xid=735, stop_worker=true)
at ../src/backend/replication/logical/applyparallelworker.c:368
#6 0x00000000008a3638 in apply_handle_stream_commit (s=0x7ffc664a2790) at ../src/backend/replication/logical/worker.c:2081
#7 0x00000000008a54da in apply_dispatch (s=0x7ffc664a2790) at ../src/backend/replication/logical/worker.c:3195
#8 0x00000000008a5a76 in LogicalRepApplyLoop (last_received=378674872) at ../src/backend/replication/logical/worker.c:3431
#9 0x00000000008a72ac in start_apply (origin_startpos=0) at ../src/backend/replication/logical/worker.c:4245
#10 0x00000000008a7d77 in ApplyWorkerMain (main_arg=0) at ../src/backend/replication/logical/worker.c:4555
#11 0x000000000084983c in StartBackgroundWorker () at ../src/backend/postmaster/bgworker.c:861
#12 0x0000000000854192 in do_start_bgworker (rw=0x26c0d20) at ../src/backend/postmaster/postmaster.c:5801
#13 0x000000000085457c in maybe_start_bgworkers () at ../src/backend/postmaster/postmaster.c:6025
#14 0x000000000085350b in sigusr1_handler (postgres_signal_arg=10) at ../src/backend/postmaster/postmaster.c:5182
#15 <signal handler called>
#16 0x00007f2663ba3b23 in __select_nocancel () from /lib64/libc.so.6
#17 0x000000000084edbc in ServerLoop () at ../src/backend/postmaster/postmaster.c:1768
#18 0x000000000084e737 in PostmasterMain (argc=3, argv=0x2690f60) at ../src/backend/postmaster/postmaster.c:1476
#19 0x000000000074adfb in main (argc=3, argv=0x2690f60) at ../src/backend/main/main.c:197
```
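One possible reading of the failed assertion, not yet verified against the patch: the comments on shm_mq_send say that after a non-blocking call returns SHM_MQ_WOULD_BLOCK, the caller must retry with the same arguments, because the queue handle remembers how many bytes of that message were already written. Frame #4 shows a 1-byte message being sent from parallel_apply_free_worker, so if an earlier, longer message had been left partially sent (for example after giving up at the lowered SHM_SEND_TIMEOUT_MS), mqh_partial_bytes could exceed nbytes. The toy model below only illustrates that contract. Every name in it (ToySendHandle, toy_send, toy_replay_timeout_then_stop) is hypothetical and is not part of shm_mq or of the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for the per-handle send state; NOT the real shm_mq_handle. */
typedef struct ToySendHandle
{
    size_t partial_bytes;   /* bytes of the in-progress message already queued */
    size_t ring_free;       /* free space left in the toy ring buffer */
} ToySendHandle;

typedef enum
{
    TOY_SUCCESS,            /* whole message queued */
    TOY_WOULD_BLOCK,        /* ring full; caller must retry with the same size */
    TOY_CONTRACT_VIOLATION  /* caller switched messages mid-send */
} ToyResult;

/* Non-blocking send of an nbytes-long message; progress is kept in the handle. */
static ToyResult
toy_send(ToySendHandle *h, size_t nbytes)
{
    size_t remaining;
    size_t chunk;

    assert(h != NULL);

    /*
     * Same condition as the failing Assert, reported as a return code:
     * progress recorded for the previous call cannot exceed the size of
     * the message offered now.
     */
    if (h->partial_bytes > nbytes)
        return TOY_CONTRACT_VIOLATION;

    remaining = nbytes - h->partial_bytes;
    chunk = remaining < h->ring_free ? remaining : h->ring_free;
    h->partial_bytes += chunk;
    h->ring_free -= chunk;

    if (h->partial_bytes < nbytes)
        return TOY_WOULD_BLOCK;

    h->partial_bytes = 0;   /* message complete, reset for the next one */
    return TOY_SUCCESS;
}

/*
 * Assumed sequence: a 64-byte message stalls on a nearly full ring, the
 * sender gives up on it, then sends a 1-byte message on the same handle.
 */
static ToyResult
toy_replay_timeout_then_stop(void)
{
    ToySendHandle h = { .partial_bytes = 0, .ring_free = 8 };
    ToyResult first = toy_send(&h, 64);

    if (first != TOY_WOULD_BLOCK)
        return first;
    return toy_send(&h, 1);
}
```

In this model the second call trips the check because 8 bytes of the abandoned message are still recorded in the handle. Whether the leader apply worker actually reaches shm_mq_send in that state is an assumption that still needs to be confirmed.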
Please see the attached script, which reproduces the failure in my environment.
Best Regards,
Hayato Kuroda
FUJITSU LIMITED
Attachment | Content-Type | Size |
---|---|---|
repro.sh | application/octet-stream | 1.3 KB |