Re: row filtering for logical replication

From: Alexey Zagarin <zagarin(at)gmail(dot)com>
To: a(dot)kondratov(at)postgrespro(dot)ru, Euler Taveira <euler(at)timbira(dot)com(dot)br>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Petr Jelinek <petr(dot)jelinek(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Suzuki Hironobu <hironobu(at)interdb(dot)jp>
Subject: Re: row filtering for logical replication
Date: 2019-09-01 05:25:41
Message-ID: 66546be6-3daf-4918-9466-07bbc22c212c@Spark
Lists: pgsql-hackers

I think I have also found a shortcoming in the setup described by Alexey Kondratov. If both tables (cloud and remote) already contain data at the moment I add the subscription, the whole table is initially copied in both directions. This leads to duplicated data and broken replication, because COPY doesn't take the filtering condition into account. When a publication has filters, the COPY command that is executed when adding a subscription (or altering one to refresh a publication) should also filter the data by the same condition, e.g. COPY (SELECT * FROM ... WHERE ...) TO ...

The current workaround is to always use WITH (copy_data = false) when subscribing or refreshing, and then copy the data manually with the above statement.
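
To make the workaround concrete, here is a rough sketch. The table name items, the publication and subscription names, and the connection string are placeholders made up for illustration; is_cloud is the column from Alexey Kondratov's tests, and the filter shown is only an assumption about what the publication's condition would be.

```sql
-- Hypothetical names: table "items", publication "cloud_pub",
-- subscription "remote_sub"; the connection string is a placeholder.

-- On the subscriber: create the subscription without the initial table copy.
CREATE SUBSCRIPTION remote_sub
    CONNECTION 'host=cloud.example.com dbname=app'
    PUBLICATION cloud_pub
    WITH (copy_data = false);

-- Likewise when refreshing to pick up tables added to the publication.
ALTER SUBSCRIPTION remote_sub
    REFRESH PUBLICATION WITH (copy_data = false);

-- On the publisher: export only the rows that match the publication filter
-- (assumed here to be is_cloud IS TRUE).
COPY (SELECT * FROM items WHERE is_cloud IS TRUE) TO STDOUT;

-- On the subscriber: load the exported rows.
COPY items FROM STDIN;
```

The two COPY statements run on different servers, so the output of the first has to be fed to the second, for example by piping one psql session into another.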

Alexey Zagarin
On 1 Sep 2019 12:11 +0700, Euler Taveira <euler(at)timbira(dot)com(dot)br>, wrote:
> On Tue, 27 Aug 2019 at 18:10, <a(dot)kondratov(at)postgrespro(dot)ru> wrote:
> >
> > Do you have any plans to continue working on this patch and submit
> > it again for the upcoming September commitfest? There are only a few
> > days left. Anyway, I will be glad to review the patch if you do
> > submit it, though I haven't dug deeply into the code yet.
> >
> Sure. See my last email to this thread. I would appreciate it if you could review it.
>
> > Although almost all of the new tests pass, there is a problem with
> > DELETE replication, so 1 out of 10 tests fails. The DELETE isn't
> > replicated if the record was created with is_cloud=TRUE on cloud and
> > replicated to remote; then updated with is_cloud=FALSE on remote and
> > replicated to cloud; then deleted on remote.
> >
> That's because you don't include is_cloud in PK or REPLICA IDENTITY. I
> add a small note in docs.
>
>
> --
> Euler Taveira Timbira -
> http://www.timbira.com.br/
> PostgreSQL: Consulting, Development, 24x7 Support and Training
