From: | "Euler Taveira" <euler(at)eulerto(dot)com> |
---|---|
To: | "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>, "'Shubham Khanna'" <khannashubham1197(at)gmail(dot)com> |
Cc: | "vignesh C" <vignesh21(at)gmail(dot)com>, "kuroda(dot)hayato(at)fujitsu(dot)com" <kuroda(dot)hayato(at)fujitsu(dot)com>, "PostgreSQL Hackers" <pgsql-hackers(at)lists(dot)postgresql(dot)org>, "Peter Smith" <smithpb2250(at)gmail(dot)com> |
Subject: | Re: Add support for specifying tables in pg_createsubscriber. |
Date: | 2025-08-22 15:26:29 |
Message-ID: | 30cc34eb-07a0-4b55-b4fe-6c526886b2c4@app.fastmail.com |
Lists: | pgsql-hackers |
On Fri, Aug 22, 2025, at 6:57 AM, Zhijie Hou (Fujitsu) wrote:
> The documentation appears incorrect and needs revision. The latest version no
> longer depends on the option order; instead, it requires users to provide
> database-qualified table names, such as -t "db1.sch1.tb1". This adjustment
> allows the command to internally categorize tables by their target database.
>
I don't like this design. No other tool takes three-part names. It is also
confusing and redundant to have the database in the --database option and again
in the --table option.
I'm wondering whether allowing an existing publication to be specified would be
a better UI. If you specify --publication and it exists on the primary, use it.
The current behavior is to fail if the publication exists. This changes the
current behavior, but I don't expect anyone to rely on this failure to abort
the execution. Moreover, the error message was added to allow only FOR ALL
TABLES; the proposal is to relax this restriction.
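
The difference between the current and the proposed behavior can be sketched as a small decision function. This is only an illustration in Python, not pg_createsubscriber code: `plan_publication`, `Action`, and the `existing` set (standing in for the publication names found on the primary) are hypothetical names introduced here.

```python
from enum import Enum


class Action(Enum):
    CREATE = "create"  # name is free: the tool creates the publication itself
    REUSE = "reuse"    # proposed: name already exists on the primary, use it as-is


def plan_publication(requested: str, existing: set[str]) -> Action:
    """Decide what to do with the name passed via --publication.

    `existing` is a stand-in for the publications already defined on the
    primary; how that list is obtained is outside this sketch.
    """
    if requested in existing:
        # Current behavior: this case is an error.
        # Proposed behavior: reuse the publication, whatever tables it lists.
        return Action.REUSE
    return Action.CREATE
```

Under the proposal, the table selection lives entirely in the publication definition, so the tool needs no per-table option.
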
> I think we can explore extending the existing --clean option in a separate patch
> to support table cleanup. This option is implemented in a way that allows adding
> further cleanup objects later, so it should be easy to extend it for tables.
> Prior to this extension, it should be noted in the documentation that users are
> required to clean up the tables themselves.
>
I would say that this cleanup feature (starting with cleaning up databases) is
as important as the feature that selects specific objects.
> I agree that supporting row filter and column list is not straightforward, and
> we can consider it separately and do not implement that in the first version.
>
The proposal above would allow it with no additional lines of code.
>>
>> It seems this proposal doesn't serve a general purpose. It is copying a *whole*
>> cluster to use only a subset of tables. Your task with pg_createsubscriber is
>> more expensive than doing a manual logical replication setup. If you have 500
>> tables and want to replicate only 400 tables, it doesn't seem productive to
>> specify 400 -t options.
>
> Specifying multiple -t options should not be problematic, as users have already
> done similar things for "FOR TABLE" publication DDLs. I think it's not hard
> for users to convert a FOR TABLE list to a -t option list.
>
Of course it is. The command line has a size limit (ARG_MAX on POSIX systems),
which caps how many arguments can be passed.
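
Whether a given list of -t options fits depends on the table name lengths and on the platform limit. A rough way to estimate it, as a Python sketch: the helper names `table_args` and `argv_bytes` are made up for this example, and the estimate ignores the environment and per-argument pointer overhead, which also count against the limit.

```python
import os


def table_args(tables: list[str]) -> list[str]:
    """Expand table names into a flat ["-t", name, "-t", name, ...] list."""
    return [part for name in tables for part in ("-t", name)]


def argv_bytes(args: list[str]) -> int:
    """Bytes the arguments occupy: each encoded string plus its NUL terminator."""
    return sum(len(arg.encode()) + 1 for arg in args)


if __name__ == "__main__":
    # Hypothetical table names, only to have something to measure.
    tables = [f"myschema.table_{i:04d}" for i in range(400)]
    needed = argv_bytes(table_args(tables))
    limit = os.sysconf("SC_ARG_MAX")  # POSIX only; the value varies by platform
    print(f"-t options need about {needed} bytes; ARG_MAX is {limit}")
```
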
>> There are some cases, like a small set of big tables, where
>> this feature makes sense. However, I'm wondering if a post script should be
>> used to adjust your setup.
>
> I think it's not very convenient for users to perform this conversion manually.
> I learned at PGConf.dev this year that some users avoid using
> pg_createsubscriber because they are unsure of the standard steps required to
> convert it into subset table replication. Automating this process would be
> beneficial, enabling more users to use pg_createsubscriber and take advantage of
> the rapid initial table synchronization.
>
You missed my point. I'm not talking about manually converting a physical
replica into a logical replica. I'm talking about the plain logical replication
setup (CREATE PUBLICATION, CREATE SUBSCRIPTION). IME this tool is beneficial
for large clusters in which we want to replicate (almost) all tables.
--
Euler Taveira
EDB https://www.enterprisedb.com/