Re: Fixing the docs for ALTER SUBSCRIPTION ... ADD/DROP PUBLICATION

From: Mark Dilger <mark(dot)dilger(at)enterprisedb(dot)com>
To: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Japin Li <japinli(at)hotmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>
Subject: Re: Fixing the docs for ALTER SUBSCRIPTION ... ADD/DROP PUBLICATION
Date: 2021-05-22 16:52:41
Message-ID: F03B678A-2697-4D31-9B64-D93AC4AFC781@enterprisedb.com
Lists: pgsql-hackers

> On May 21, 2021, at 10:39 PM, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com> wrote:
>
> On Sat, May 22, 2021 at 1:49 AM Mark Dilger
> <mark(dot)dilger(at)enterprisedb(dot)com> wrote:
>>
>> -hackers,
>>
>> I think commit 82ed7748b710e3ddce3f7ebc74af80fe4869492f created some confusion that should be cleaned up before release. I'd like some guidance on what the intended behavior is before I submit a patch for this, though:
>>
>> +ALTER SUBSCRIPTION mysubscription SET PUBLICATION nosuchpub WITH (copy_data = false, refresh = false);
>> +ALTER SUBSCRIPTION mysubscription ADD PUBLICATION nosuchpub WITH (copy_data = false, refresh = false);
>> +ALTER SUBSCRIPTION mysubscription DROP PUBLICATION nosuchpub WITH (copy_data = false, refresh = false);
>> +ERROR: unrecognized subscription parameter: "copy_data"
>> +ALTER SUBSCRIPTION mysubscription SET (copy_data = false, refresh = false);
>> +ERROR: unrecognized subscription parameter: "copy_data"
>>
>> First, it's quite odd to say that "copy_data" is unrecognized in the third and fourth ALTER commands when it was recognized just fine in the first two.
>
> For ALTER SUBSCRIPTION ... DROP PUBLICATION, copy_data option is not
> required actually, because it doesn't add new publications. If the
> concern here is "why refresh is allowed but not copy_data", then the
> answer is "with the refresh option the updated publications can be
> refreshed, this avoids users to run REFRESH PUBLICATION after DROP
> PUBLICATION". So, disallowing copy_data makes sense to me.

My concern isn't that the code is doing the wrong thing, but that the docs and the error messages are confusing. This is especially troubling because a single action that combines dropping one publication with refreshing the others is not particularly intuitive.

I agree that disallowing copy_data for DROP PUBLICATION is a reasonable design choice, but I do not agree that this prohibition is intuitive. If I want to copy the data for a set of tables on a remote server, and only copy that data exactly once, I might be looking for an atomic action to do so. The docs are totally unclear on whether this is supported, so I might try:

CREATE SUBSCRIPTION tempsub CONNECTION 'dbname=remotedb' PUBLICATION remotepub WITH (connect = false, enabled = false, slot_name = NONE, create_slot = false);
ALTER SUBSCRIPTION tempsub DROP PUBLICATION remotepub WITH (refresh = true, copy_data = true);

with the intention that the data will be copied right before the publication is dropped. When I get an error that says 'unrecognized subscription parameter: "copy_data"', I'm likely to think I mistyped the parameter name, not that it is disallowed in this setting. If I then decide to just drop the publication (since my experiment didn't work) and try to do so using:

ALTER SUBSCRIPTION tempsub DROP PUBLICATION remotepub WITH (refresh = false, copy_data = false);

I seem to be playing by the rules, since I am explicitly not requesting "copy_data". That's what the "false" means. But again, the command complains that "copy_data" is unrecognized. At this point, I go back to the docs, and they clearly say that "copy_data" is a supported parameter in this command. I'm totally confused.

I think the docs should say that "copy_data" is not allowed for DROP PUBLICATION. I think no error should occur for copy_data = false. For copy_data = true, I think the error message should say that copy_data is disallowed during a DROP PUBLICATION, rather than saying that the parameter is unrecognized.
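
To make the proposed behavior concrete, here is a minimal sketch of the validation rules described above. It is written in Python purely as a model; the real check would live in the backend's C option parsing. The function name check_publication_options, the exception class, and the wording of the new error message are all hypothetical. Only the "unrecognized subscription parameter" text is taken from the existing output.

```python
class SubscriptionOptionError(Exception):
    """Stand-in for the ERROR the server would report (hypothetical)."""


# Option names accepted in WITH (...) for SET/ADD/DROP PUBLICATION,
# as discussed in this thread.
KNOWN_OPTIONS = {"refresh", "copy_data"}


def check_publication_options(action, options):
    """Model the proposed checks for ALTER SUBSCRIPTION ... PUBLICATION.

    action  -- one of "SET", "ADD", "DROP"
    options -- dict mapping option names to booleans
    Returns None if the options are acceptable, raises otherwise.
    """
    for name in options:
        if name not in KNOWN_OPTIONS:
            # A genuinely unknown name keeps the existing message.
            raise SubscriptionOptionError(
                'unrecognized subscription parameter: "%s"' % name)

    # copy_data = false is harmless for DROP PUBLICATION, so it passes.
    # Only an explicit request to copy data is rejected, and the message
    # names the actual restriction (wording is an assumption).
    if action == "DROP" and options.get("copy_data", False):
        raise SubscriptionOptionError(
            "copy_data is not allowed with DROP PUBLICATION")
```

Under this sketch, DROP PUBLICATION ... WITH (refresh = false, copy_data = false) is accepted, while copy_data = true fails with a message saying the option is disallowed here rather than unrecognized.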

> For ALTER SUBSCRIPTION ... SET, allowed options are slot_name,
> synchronous_commit, binary and streaming which are part of
> pg_subscription catalog and will be used by apply/sync workers.
> Whereas copy_data and refresh are not part of pg_subscription catalog
> and are not used by apply/sync workers (directly), but by the backend.
> We have ALTER SUBSCRIPTION .. REFRESH specifically for refresh and
> copy_data options.
>
>> More than that, though, the docs in doc/src/sgml/ref/alter_subscription.sgml refer to this part of the grammar in the first three ALTER commands as a "set_publication_option", not as a "subscription_parameter", a term which is only used in the grammar for other forms of the ALTER command. Per the grammar in the docs, "copy_data" is not a valid set_publication_option, only "refresh" is.
>
> set_publication_option - options are refresh and copy_data (this
> option comes implicitly, please see the note "Additionally, refresh
> options as described under REFRESH PUBLICATION may be specified.",
> under refresh_option we have copy_data)
>
> subscription_parameter - options are slot_name, synchronous_commit,
> binary, and streaming. This is correct.
>
>> Should the first three ALTER commands fail with an error about "copy_data" being an invalid set_publication_option? Should they succeed, in which case the docs should mention that "refresh" is not the only valid set_publication_option?
>
> No that's not correct. As I said above, set_publication_option options
> are both refresh and copy_data.

Well, not really. We're using the phrase "set_publication_option" for all three of SET PUBLICATION, ADD PUBLICATION, and DROP PUBLICATION. Since "copy_data" is not actually accepted by DROP PUBLICATION, we should use that term only for the first two, and have a separate "drop_publication_option" for the third.

>> Something else, perhaps?
>
> Unless I misunderstood any of your concerns, I think the existing docs
> and the code looks correct to me.

Thanks for your response. The docs and error messages still don't look right to me.


Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
