Re: Perform streaming logical transactions by background workers and parallel apply

From: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
To: "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, "wangw(dot)fnst(at)fujitsu(dot)com" <wangw(dot)fnst(at)fujitsu(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, "shiy(dot)fnst(at)fujitsu(dot)com" <shiy(dot)fnst(at)fujitsu(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Perform streaming logical transactions by background workers and parallel apply
Date: 2023-01-05 08:21:53
Message-ID: CAFiTN-sTYk=h75Jn1a7ee+5hOcdQFjKGDvF_0NWQQXmoBv4A+A@mail.gmail.com
Lists: pgsql-hackers

On Thu, Jan 5, 2023 at 9:07 AM houzj(dot)fnst(at)fujitsu(dot)com
<houzj(dot)fnst(at)fujitsu(dot)com> wrote:
>
> On Wednesday, January 4, 2023 9:29 PM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:

> > I think this looks good to me.
>
> Thanks for the comments.
> Attach the new version patch set which changed the comments as suggested.

Thanks for the updated patch. While testing it I saw one strange
behavior that looks like a bug to me. Here are the steps to reproduce:

1. Start 2 servers (config: logical_decoding_work_mem=64kB)
./pg_ctl -D data/ -c -l pub_logs start
./pg_ctl -D data1/ -c -l sub_logs start

2. Publisher:
create table t(a int PRIMARY KEY ,b text);
CREATE OR REPLACE FUNCTION large_val() RETURNS TEXT LANGUAGE SQL AS
'select array_agg(md5(g::text))::text from generate_series(1, 256) g';
create publication test_pub for table t
with(PUBLISH='insert,delete,update,truncate');
alter table t replica identity FULL ;
insert into t values (generate_series(1,2000),large_val()) ON CONFLICT
(a) DO UPDATE SET a=EXCLUDED.a*300;

3. Subscription Server:
create table t(a int,b text);
create subscription test_sub CONNECTION 'host=localhost port=5432
dbname=postgres' PUBLICATION test_pub WITH ( slot_name =
test_slot_sub1,streaming=parallel);

4. Publication Server:
begin ;
savepoint a;
delete from t;
savepoint b;
insert into t values (generate_series(1,5000),large_val()) ON CONFLICT
(a) DO UPDATE SET a=EXCLUDED.a*30000;
-- (while this is executing, restart the publisher within 2-3 secs)

Restart the publication server, while the transaction is still in an
uncommitted state.
./pg_ctl -D data/ -c -l pub_logs stop -mi
./pg_ctl -D data/ -c -l pub_logs start -mi

After this, the parallel apply worker is stuck waiting on the stream
lock forever (even after 10 mins); see below. From the subscriber logs
I can see that one of the parallel apply workers [75677] started but
never finished [no error]. After that I performed more operations [the
same insert], which were applied by a new parallel apply worker that
started and finished within 1 second.

dilipku+ 75660 1 0 13:39 ? 00:00:00 /home/dilipkumar/work/PG/install/bin/postgres -D data
dilipku+ 75661 75660 0 13:39 ? 00:00:00 postgres: checkpointer
dilipku+ 75662 75660 0 13:39 ? 00:00:00 postgres: background writer
dilipku+ 75664 75660 0 13:39 ? 00:00:00 postgres: walwriter
dilipku+ 75665 75660 0 13:39 ? 00:00:00 postgres: autovacuum launcher
dilipku+ 75666 75660 0 13:39 ? 00:00:00 postgres: logical replication launcher
dilipku+ 75675 75595 0 13:39 ? 00:00:00 postgres: logical replication apply worker for subscription 16389
dilipku+ 75676 75660 0 13:39 ? 00:00:00 postgres: walsender dilipkumar postgres ::1(42192) START_REPLICATION
dilipku+ 75677 75595 0 13:39 ? 00:00:00 postgres: logical replication parallel apply worker for subscription 16389 waiting

Subscriber logs:
2023-01-05 13:39:07.261 IST [75595] LOG: background worker "logical replication worker" (PID 75649) exited with exit code 1
2023-01-05 13:39:12.272 IST [75675] LOG: logical replication apply worker for subscription "test_sub" has started
2023-01-05 13:39:12.307 IST [75677] LOG: logical replication parallel apply worker for subscription "test_sub" has started
2023-01-05 13:43:31.003 IST [75596] LOG: checkpoint starting: time
2023-01-05 13:46:32.045 IST [76337] LOG: logical replication parallel apply worker for subscription "test_sub" has started
2023-01-05 13:46:35.214 IST [76337] LOG: logical replication parallel apply worker for subscription "test_sub" has finished
2023-01-05 13:46:50.241 IST [76384] LOG: logical replication parallel apply worker for subscription "test_sub" has started
2023-01-05 13:46:53.676 IST [76384] LOG: logical replication parallel apply worker for subscription "test_sub" has finished
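
To pick the stuck worker out of a longer subscriber log, a small shell
helper along these lines could be used. This is only a sketch: the
function name unfinished_parallel_workers is hypothetical, and it
assumes each log entry sits on a single line, as in the server's own
log file (sub_logs in the steps above), not wrapped as in this mail.

```shell
# Hypothetical helper: print the PIDs of parallel apply workers that
# logged "has started" but never logged "has finished".
# Assumption: one log entry per line, with the PID in [brackets].
unfinished_parallel_workers() {
    awk '
        /parallel apply worker/ && match($0, /\[[0-9]+\]/) {
            # Strip the surrounding brackets to get the bare PID.
            pid = substr($0, RSTART + 1, RLENGTH - 2)
            if ($0 ~ /has started/)  started[pid] = 1
            if ($0 ~ /has finished/) delete started[pid]
        }
        END { for (pid in started) print pid }
    ' "$@" | sort -n
}

# Usage: unfinished_parallel_workers sub_logs
```

Run against the subscriber log entries shown above (one entry per
line), this prints 75677, the worker that started at 13:39:12 and
never logged a finish.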

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
