Re: bogus: logical replication rows/cols combinations

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
Cc: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Pg Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: bogus: logical replication rows/cols combinations
Date: 2022-05-11 03:33:18
Message-ID: CAA4eK1JC8v8ysNS9L1et0_yyZJOEE1wB+GrvOcar_f5dVadvBg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, May 11, 2022 at 12:35 AM Tomas Vondra
<tomas(dot)vondra(at)enterprisedb(dot)com> wrote:
>
> On 5/9/22 05:45, Amit Kapila wrote:
> > On Sun, May 8, 2022 at 11:41 PM Tomas Vondra
> > <tomas(dot)vondra(at)enterprisedb(dot)com> wrote:
> >>
> >> On 5/7/22 07:36, Amit Kapila wrote:
> >>> On Mon, May 2, 2022 at 6:11 PM Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> wrote:
> >>>>
> >>>> On 2022-May-02, Amit Kapila wrote:
> >>>>
> >>>>
> >>>>> I think it is possible to expose a list of publications for each
> >>>>> walsender as it is stored in each walsenders
> >>>>> LogicalDecodingContext->output_plugin_private. AFAIK, each walsender
> >>>>> can have one such LogicalDecodingContext and we can probably share it
> >>>>> via shared memory?
> >>>>
> >>>> I guess we need to create a DSM each time a walsender opens a
> >>>> connection, at START_REPLICATION time. Then ALTER PUBLICATION needs to
> >>>> connect to all DSMs of all running walsenders and see if they are
> >>>> reading from it. Is that what you have in mind? Alternatively, we
> >>>> could have one DSM per publication with a PID array of all walsenders
> >>>> that are sending it (each walsender needs to add its PID as it starts).
> >>>> The latter might be better.
> >>>>
> >>>
> >>> While thinking about using DSM here, I came across one of your commits
> >>> f2f9fcb303 which seems to indicate that it is not a good idea to rely
> >>> on it but I think you have changed dynamic shared memory to fixed
> >>> shared memory usage because that was more suitable rather than DSM is
> >>> not portable. Because I see a commit bcbd940806 where we have removed
> >>> the 'none' option of dynamic_shared_memory_type. So, I think it should
> >>> be okay to use DSM in this context. What do you think?
> >>>
> >>
> >> Why would any of this be needed?
> >>
> >> ALTER PUBLICATION will invalidate the RelationSyncEntry entries in all
> >> walsenders, no? So AFAICS it should be enough to enforce the limitations
> >> in get_rel_sync_entry,
> >>
> >
> > Yes, that should be sufficient to enforce limitations in
> > get_rel_sync_entry() but it will lead to the following behavior:
> > a. The Alter Publication command will be successful but later in the
> > logs, the error will be logged and the user needs to check it and take
> > appropriate action. Till that time the walsender will be in an error
> > loop which means it will restart and again lead to the same error till
> > the user takes some action.
> > b. As we use historic snapshots, so even after the user takes action
> > say by changing publication, it won't be reflected. So, the option for
> > the user would be to drop their subscription.
> >
> > Am, I missing something? If not, then are we okay with such behavior?
> > If yes, then I think it would be much easier implementation-wise and
> > probably advisable at this point. We can document it so that users are
> > careful and can take necessary action if they get into such a
> > situation. Any way we can improve this in future as you also suggested
> > earlier.
> >
> >> which is necessary anyway because the subscriber
> >> may not be connected when ALTER PUBLICATION gets executed.
> >>
> >
> > If we are not okay with the resultant behavior of detecting this in
> > get_rel_sync_entry(), then we can solve this in some other way as
> > Alvaro has indicated in one of his responses which is to detect that
> > at start replication time probably in the subscriber-side.
> >
>
> IMO that behavior is acceptable.
>

Fair enough, then we should go with a simpler approach to detect it in
pgoutput.c (get_rel_sync_entry).

> We have to do that check anyway, and
> the subscription may start failing after ALTER PUBLICATION for a number
> of other reasons anyway so the user needs/should check the logs.
>

I think ALTER PUBLICATION won't ever lead to failure in walsender.
Sure, users can do something due to which subscriber-side failures can
happen due to constraint failures. Do you have some specific cases in
mind?

> And if needed, we can improve this and start doing the proactive-checks
> during ALTER PUBLICATION too.
>

Agreed.

--
With Regards,
Amit Kapila.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bruce Momjian 2022-05-11 03:41:08 Re: First draft of the PG 15 release notes
Previous Message Amit Kapila 2022-05-11 03:20:21 Re: Column Filtering in Logical Replication