RE: logical replication restrictions

From: "kuroda(dot)hayato(at)fujitsu(dot)com" <kuroda(dot)hayato(at)fujitsu(dot)com>
To: 'Euler Taveira' <euler(at)eulerto(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Marcos Pegoraro <marcos(at)f10(dot)com(dot)br>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Peter Smith <smithpb2250(at)gmail(dot)com>, "osumi(dot)takamichi(at)fujitsu(dot)com" <osumi(dot)takamichi(at)fujitsu(dot)com>, Melih Mutlu <m(dot)melihmutlu(at)gmail(dot)com>
Subject: RE: logical replication restrictions
Date: 2022-09-14 11:10:45
Message-ID: TYAPR01MB586624683ACD766B967F87F2F5469@TYAPR01MB5866.jpnprd01.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Dear Euler,

Thank you for making the patch! I'm also interested in the patch so I want to join the thread.

While testing your patch, I noticed that the 032_apply_delay.pl failed.
PSA logs that generated on my machine. This failure is same as reported by cfbot[1].

It seemed that the apply worker could not exit and starts WaitLatch() again even if the subscription had been disabled.
Followings are cited from attached log.

```
...
2022-09-14 09:44:30.489 UTC [14880] 032_apply_delay.pl LOG: statement: ALTER SUBSCRIPTION tap_sub SET (min_apply_delay = 86460000)
2022-09-14 09:44:30.525 UTC [14777] DEBUG: sending feedback (force 0) to recv 0/1690220, write 0/1690220, flush 0/1690220
2022-09-14 09:44:30.526 UTC [14759] DEBUG: server process (PID 14878) exited with exit code 0
2022-09-14 09:44:30.535 UTC [14777] DEBUG: logical replication apply delay: 86460000 ms
2022-09-14 09:44:30.535 UTC [14777] CONTEXT: processing remote data for replication origin "pg_16393" during "BEGIN" in transaction 734 finished at 0/16902A8
2022-09-14 09:44:30.576 UTC [14759] DEBUG: forked new backend, pid=14884 socket=6
2022-09-14 09:44:30.578 UTC [14759] DEBUG: server process (PID 14880) exited with exit code 0
2022-09-14 09:44:30.583 UTC [14884] 032_apply_delay.pl LOG: statement: ALTER SUBSCRIPTION tap_sub DISABLE
2022-09-14 09:44:30.589 UTC [14777] DEBUG: logical replication apply delay: 86459945 ms
2022-09-14 09:44:30.589 UTC [14777] CONTEXT: processing remote data for replication origin "pg_16393" during "BEGIN" in transaction 734 finished at 0/16902A8
2022-09-14 09:44:30.608 UTC [14759] DEBUG: forked new backend, pid=14886 socket=6
2022-09-14 09:44:30.632 UTC [14886] 032_apply_delay.pl LOG: statement: SELECT count(1) = 0 FROM pg_stat_subscription WHERE subname = 'tap_sub' AND pid IS NOT NULL;
2022-09-14 09:44:30.665 UTC [14759] DEBUG: server process (PID 14884) exited with exit code 0
...
```

I think this may be caused because the delayed worker will not read the modified catalog even if ALTER SUBSCRIPTION ... DISABLED is called.
I also attached the fix patch that can be applied after yours. It seems OK on my env.

[1]: https://cirrus-ci.com/task/4888001967816704

Best Regards,
Hayato Kuroda
FUJITSU LIMITED

Attachment Content-Type Size
FAILED_regress_log_032_apply_delay application/octet-stream 7.2 KB
FAILED_032_apply_delay_publisher.log application/octet-stream 17.4 KB
FAILED_032_apply_delay_subscriber.log application/octet-stream 631.4 KB
0002-Make-apply-workers-to-read-pg_subscription.patch application/octet-stream 1.2 KB
v7-0001-Time-delayed-logical-replication-subscriber.patch application/octet-stream 63.9 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Önder Kalacı 2022-09-14 12:04:00 Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher
Previous Message Alvaro Herrera 2022-09-14 10:42:42 Re: Inconsistencies in error messages