From: | Amit Langote <amitlangote09(at)gmail(dot)com> |
---|---|
To: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> |
Cc: | Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: pg_get_publication_tables() output duplicate relid |
Date: | 2021-11-22 08:15:37 |
Message-ID: | CA+HiwqHYHsRZje7gfE49_ahMhVE=jyrBhU0uOWtu2cub_cFyvw@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Sat, Nov 20, 2021 at 8:31 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> On Fri, Nov 19, 2021 at 10:58 AM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > On Fri, Nov 19, 2021 at 7:19 AM Amit Langote <amitlangote09(at)gmail(dot)com> wrote:
> > > The problematic case is attaching the partition *after* the subscriber
> > > has already marked the root parent as synced and/or ready for
> > > replication. Refreshing the subscription doesn't help it discover the
> > > newly attached partition, because a publish_via_partition_root only
> > > ever tells about the root parent, which would be already synced, so
> > > the subscriber would think there's nothing to copy.
> >
> > Okay, I see this could be a problem but I haven't tried to reproduce it.
> >
> > > > Anyway, if this is a problem
> > > > we need to figure the solution for this separately.
> > >
> > > Sure, we might need to do that after all. Though it might be a good
> > > idea to be sure that we won't need to reconsider the fix we push for
> > > the issue(s) being discussed here and elsewhere, because I suspect
> > > that the solution to the problem I mentioned is likely to involve
> > > tweaking pg_publication_tables view output.
>
> I have thought about this problem and I see two possibilities for a
> solution (a) We could provide a new option say 'truncate' (something
> on lines proposed here [1]) which would truncate the table(s) and
> change its status to 'i' in the pg_subscription_rel, this would allow
> the newly added partition to be synced after refresh. This could lead
> to a large copy in such a case.
Maybe I am missing something about the proposal, though I'd think a
more automatic solution would be better, something that doesn't need
to rely on an unrelated feature.
> (b) We could somehow get and store all
> the partition info from the publisher-side on the subscriber-side
> while initial sync (say in new system table
> pg_subscription_rel_members). Now, after the refresh, if this list
> changes, we can allow to just get the data of that particular
> partition but I guess it would mean that we need to store oids of the
> publisher which might or might not be safe considering oids can
> wraparound before the refresh.
>
> Do you have any other ideas?
I thought that the idea I had earlier mentioned at [1] may be useful,
which I can see is similar to your idea (b). I also suspect that it
can be implemented without needing a separate catalog and storing
publication-side relation OIDs in the subscription-side catalog,
though maybe I haven't thought hard enough.
> One more thing you mentioned is that the initial sync won't work after
> refresh but later changes will be replicated but I noticed that later
> changes also don't get streamed till we restart the subscriber server.
> I am not sure but we might not be invalidating apply workers cache due
> to which it didn't notice the same.
Oh, that sounds odd and, as you appear to say, a separate problem. I'll check.
--
Amit Langote
EDB: http://www.enterprisedb.com
From | Date | Subject | |
---|---|---|---|
Next Message | Andrey Borodin | 2021-11-22 08:23:49 | Re: Slow standby snapshot |
Previous Message | Andy Fan | 2021-11-22 07:43:23 | Re: Sequence's value can be rollback after a crashed recovery. |