Re: row filtering for logical replication

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Cc: Greg Nancarrow <gregn4422(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Euler Taveira <euler(at)eulerto(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, Rahila Syed <rahilasyed90(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Önder Kalacı <onderkalaci(at)gmail(dot)com>, japin <japinli(at)hotmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, David Steele <david(at)pgmasters(dot)net>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Amit Langote <amitlangote09(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: row filtering for logical replication
Date: 2021-07-19 09:42:14
Message-ID: CAA4eK1+79JVQ7dZqAE+0pc0f-AZn4J_F2X3nLdctGcAtcMp6Ew@mail.gmail.com
Lists: pgsql-hackers

On Sat, Jul 17, 2021 at 3:05 AM Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> wrote:
>
> On 2021-Jul-16, Greg Nancarrow wrote:
>
> > On Fri, Jul 16, 2021 at 3:50 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > >
> > > I am not so sure about different filters for old and new rows but it
> > > makes sense to by default apply the filter to both old and new rows.
> > > Then also provide a way for user to specify if the filter can be
> > > specified to just old or new row.
> >
> > I'm having some doubts and concerns about what is being suggested.
>
> Yeah. I think the idea that some updates fail to reach the replica,
> leaving the downstream database in a different state than it would be if
> those updates had reached it, is unsettling. It makes me wish we raised
> an error at UPDATE time if both rows would not pass the filter test in
> the same way -- that is, if the old row passes the filter, then the new
> row must be a pass as well.
>

Hmm, do you mean that we should raise an error in the walsender while
decoding if the old or new row doesn't match the filter clause? How would
the walsender recover from that error? Even if the user, on seeing the
error, changed the filter clause for the publication, I think the
walsender would still see the old filter due to the historical snapshot
and keep getting the same error. One idea could be to use the current
snapshot to read the publications catalog table, so that the user could
change the filter or do something else to move past this error. The
other options could be:

a. Just log it and move to the next row.
b. Send some info about this to the stats collector, which can be
displayed in a view, and then move ahead.
c. Just skip it like any other row that doesn't match the filter clause.

I am not sure there is any use in sending a row if one of the
old/new rows doesn't match the filter. If the old row doesn't
match but the new one matches the criteria, we will anyway just throw
away such a row on the subscriber instead of applying it. OTOH, if the
old row matches but the new one doesn't, then it probably doesn't fit
the analogy that new rows should behave similarly to inserts. I am of
the opinion that we should do either (a) or (c) when one of the old or
new rows doesn't match the filter clause.
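
To make options (a) and (c) concrete, here is a rough sketch in Python
rather than walsender C code. The names decide_update, row_filter and
log are hypothetical and do not come from any patch. It also assumes
that (a) logs only when exactly one of the two rows fails the filter,
which is one possible reading of the proposal.

```python
from typing import Callable, List, Optional, Tuple

Row = dict
RowFilter = Callable[[Row], bool]


def decide_update(
    old_row: Row,
    new_row: Row,
    row_filter: RowFilter,
    log: Optional[List[str]] = None,
) -> Optional[Tuple[str, Row, Row]]:
    """Return the change to send for an UPDATE, or None to skip it.

    Hypothetical helper: the UPDATE is sent only if both the old and
    the new row pass the filter. Passing a list as `log` models
    option (a); leaving it as None models option (c).
    """
    old_ok = row_filter(old_row)
    new_ok = row_filter(new_row)

    if old_ok and new_ok:
        return ("UPDATE", old_row, new_row)

    # Assumption: only a mismatch between old and new is worth logging;
    # rows that fail the filter on both sides are skipped silently.
    if log is not None and old_ok != new_ok:
        log.append(
            f"skipped UPDATE: old_matches={old_ok}, new_matches={new_ok}"
        )
    return None
```

With a filter such as lambda r: r["a"] > 10, an UPDATE from a = 11 to
a = 5 returns None in both modes; only option (a) leaves a log entry.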

> Maybe a second option is to have replication change any UPDATE into
> either an INSERT or a DELETE, if the old or the new row do not pass the
> filter, respectively. That way, the databases would remain consistent.
>

I guess such things should be handled via conflict resolution on the
subscriber side.
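
For comparison, the transformation suggested in the quoted text (turn
the UPDATE into an INSERT or a DELETE depending on which row fails the
filter) could be sketched roughly as follows. This is only an
illustration in Python; transform_update and row_filter are
hypothetical names, not part of any patch or existing API.

```python
from typing import Callable, Optional, Tuple

Row = dict
RowFilter = Callable[[Row], bool]


def transform_update(
    old_row: Row, new_row: Row, row_filter: RowFilter
) -> Optional[Tuple]:
    """Map an UPDATE to the change a publisher would send (hypothetical).

    old passes, new passes -> UPDATE
    old fails,  new passes -> INSERT of the new row
    old passes, new fails  -> DELETE of the old row
    old fails,  new fails  -> nothing is sent
    """
    old_ok = row_filter(old_row)
    new_ok = row_filter(new_row)

    if old_ok and new_ok:
        return ("UPDATE", old_row, new_row)
    if new_ok:
        # The subscriber never saw the old row, so the new row is new to it.
        return ("INSERT", new_row)
    if old_ok:
        # The row leaves the filtered set, so remove it downstream.
        return ("DELETE", old_row)
    return None
```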

--
With Regards,
Amit Kapila.
