Re: bogus: logical replication rows/cols combinations

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
Cc: Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Pg Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: bogus: logical replication rows/cols combinations
Date: 2022-04-30 01:20:05
Message-ID: CAA4eK1JXDzOu5jTCYM-TcqE2i3iiFJQoWW_qoVGgZ=vMgEHi4w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sat, Apr 30, 2022 at 2:02 AM Tomas Vondra
<tomas(dot)vondra(at)enterprisedb(dot)com> wrote:
>
> On 4/29/22 06:48, Amit Kapila wrote:
> > On Thu, Apr 28, 2022 at 11:00 PM Tomas Vondra
>
> I think such issues due to ALTER of the publication are somewhat
> expected, and I think users will understand they might need to resync
> the subscription or something like that.
>
> A similar example might be just changing the where condition,
>
> create publication p for table t where (a > 10);
>
> and then
>
> alter publication p set table t where (a > 15);
>
> If we replicated any rows with (a > 10) and (a <= 15), we'll just stop
> replicating them. But if we re-create the subscription, we end up with a
> different set of rows on the subscriber, omitting rows with (a <= 15).
>
> In principle we'd need to replicate the ALTER somehow, to delete or
> insert the rows that start/stop matching the row filter. It's a bit
> similar to not replicating DDL, perhaps.
>
> But I think the issue I've described is different, because you don't
> have to change the subscriptions at all and you'll still have the
> problem. I mean, imagine doing this:
>
> -- publisher
> create table t (a int primary key, b int);
> create publication p for table t where (a > 10) with (publish='update');
>
> -- subscriber
> create table t (a int primary key, b int);
> create subscription s connection '...' publication p;
>
> -- publisher
> insert into t select i, i from generate_series(1,20) s(i);
> update t set b = b * 10;
>
> -- subscriber
> --> has no rows in "t"
> --> recreate the subscription
> drop subscription s;
> create subscription s connection '...' publication p;
>
> --> now it has all the rows with (a>10), because tablesync ignores
> publication actions
>
>
> The reason why I find this really annoying is that it makes it almost
> impossible to setup two logical replicas that'd be "consistent", unless
> you create them at the same time (= without any writes in between). And
> it's damn difficult to think about the inconsistencies.
>

I understood your case related to the initial sync and it is with or
without rowfilter.

>
> IMHO this all stems from allowing row filters and restricting pubactions
> at the same time (notice this only used a single publication). So maybe
> the best option would be to disallow combining these two features? That
> would ensure the row filter filter is always applied to all actions in a
> consistent manner, preventing all these issues.
>
> Maybe that's not possible - maybe there are valid use cases that would
> need such combination, and you mentioned replica identity might be an
> issue
>

Yes, that is the reason we can't combine the row filters for all pubactions.

> (and maybe requiring RIF with row filters is not desirable).
>
> So maybe we should at least warn against this in the documentation?
>

Yeah, I find this as the most suitable thing to do to address your
concern. I would like to add this information to the 'Initial
Snapshot' page with some examples (both with and without a row
filter).

> >
> > True, I think to some extent we rely on users to define it sanely
> > otherwise currently also it can easily lead to even replication being
> > stuck. This can happen when the user is trying to operate on the same
> > table and define publication/subscription on multiple nodes for it.
> > See [1] where we trying to deal with such a problem.
> >
> > [1] - https://commitfest.postgresql.org/38/3610/
> >
>
> That seems to deal with a circular replication, i.e. two logical
> replication links - a bit like a multi-master. Not sure how is that
> related to the issue we're discussing here?
>

It is not directly related to what we are discussing here but I was
trying to emphasize the point that users need to define the logical
replication via pub/sub sanely otherwise they might see some weird
behaviors like that.

--
With Regards,
Amit Kapila.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bharath Rupireddy 2022-04-30 05:37:55 Progress report removal of temp files and temp relation files using ereport_startup_progress
Previous Message Tom Lane 2022-04-29 23:26:59 Re: failures in t/031_recovery_conflict.pl on CI