Re: Perform streaming logical transactions by background workers and parallel apply

From: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc: "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, "wangw(dot)fnst(at)fujitsu(dot)com" <wangw(dot)fnst(at)fujitsu(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, "shiy(dot)fnst(at)fujitsu(dot)com" <shiy(dot)fnst(at)fujitsu(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Perform streaming logical transactions by background workers and parallel apply
Date: 2022-10-18 11:52:39
Message-ID: CAFiTN-vLFiEXP116_13EwLwtdbDCaBJsCWYNcouy=Z6BB5d77Q@mail.gmail.com
Lists: pgsql-hackers

On Thu, Oct 6, 2022 at 1:37 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>
> While looking at the v35 patch, I realized that there are some cases
> where logical replication gets stuck depending on the partitioned
> table structure. For instance, consider the following tables,
> publication, and subscription:
>
> * On publisher
> create table p (c int) partition by list (c);
> create table c1 partition of p for values in (1);
> create table c2 (c int);
> create publication test_pub for table p, c1, c2 with
> (publish_via_partition_root = 'true');
>
> * On subscriber
> create table p (c int) partition by list (c);
> create table c1 partition of p for values In (2);
> create table c2 partition of p for values In (1);
> create subscription test_sub connection 'port=5551 dbname=postgres'
> publication test_pub with (streaming = 'parallel', copy_data =
> 'false');
>
> Note that while both the publisher and the subscriber have tables with
> the same names, the partition structure is different and rows go to a
> different table on the subscriber (e.g., row c=1 will go to the c2
> table on the subscriber). If two concurrent transactions are executed
> as follows, the apply worker (i.e., the leader apply worker) waits for
> a lock on c2 held by its parallel apply worker:
>
> * TX-1
> BEGIN;
> INSERT INTO p SELECT 1 FROM generate_series(1, 10000); --- changes are streamed
>
> * TX-2
> BEGIN;
> TRUNCATE c2; --- wait for a lock on c2
>
> * TX-1
> INSERT INTO p SELECT 1 FROM generate_series(1, 10000);
> COMMIT;
>
> This might not be a common case in practice but it could mean that
> there is a restriction on how partitioned tables should be structured
> on the publisher and the subscriber when using streaming = 'parallel'.
> When this happens, since the logical replication cannot move forward
> the users need to disable parallel-apply mode or increase
> logical_decoding_work_mem. We could describe this limitation in the
> doc but it would be hard for users to detect problematic table
> structure.

Interesting case. I think the root of the problem is the same as in the
case where a column is marked unique on the subscriber but not on the
publisher. In short, two transactions that are independent of each
other on the publisher become dependent on each other on the subscriber
because the table definition differs there. So can't we handle this
case in the same way, by marking this table unsafe for parallel-apply?
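
To make the dependency concrete, here is a rough toy model (Python, not
patch code) of list-partition routing on each node, using the table
definitions from the example above. The helper names (route,
touched_relations) and the statement encoding are made up for
illustration. It only checks whether the two transactions touch a
common leaf relation on each side; it does not model locking or the
apply workers.

```python
# Toy model only: list-partition routing on each node, using the table
# definitions from the example upthread. Nothing here is PostgreSQL code;
# all names below are hypothetical.

# Partition bounds of p: leaf table name -> accepted values of column c.
PUBLISHER_P = {"c1": {1}}               # c2 is a standalone table here
SUBSCRIBER_P = {"c1": {2}, "c2": {1}}   # c2 is a partition of p here


def route(partitions, value):
    """Return the leaf partition of p that accepts `value`."""
    for leaf, values in partitions.items():
        if value in values:
            return leaf
    raise ValueError(f"no partition of p accepts {value}")


def touched_relations(partitions, statements):
    """Collect the leaf relations a transaction touches on one node.

    Each statement is ("insert_into_p", value) or ("truncate", table).
    """
    rels = set()
    for kind, arg in statements:
        if kind == "insert_into_p":
            rels.add(route(partitions, arg))
        elif kind == "truncate":
            rels.add(arg)
        else:
            raise ValueError(f"unknown statement kind: {kind}")
    return rels


tx1 = [("insert_into_p", 1)]   # INSERT INTO p SELECT 1 FROM ...
tx2 = [("truncate", "c2")]     # TRUNCATE c2

# Relations touched by both transactions on each node.
pub_conflict = touched_relations(PUBLISHER_P, tx1) & touched_relations(PUBLISHER_P, tx2)
sub_conflict = touched_relations(SUBSCRIBER_P, tx1) & touched_relations(SUBSCRIBER_P, tx2)

print("publisher overlap:", pub_conflict)
print("subscriber overlap:", sub_conflict)
```

In this model the overlap is empty on the publisher, but on the
subscriber both transactions end up on c2, which is the kind of
subscriber-only dependency I mean.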

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
