Quick Links

Re: row filtering for logical replication

From:	Stephen Frost <sfrost(at)snowman(dot)net>
To:	Petr Jelinek <petr(dot)jelinek(at)2ndquadrant(dot)com>
Cc:	Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Fabrízio de Royes Mello <fabriziomello(at)gmail(dot)com>, Euler Taveira de Oliveira <euler(at)timbira(dot)com(dot)br>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>, hironobu(at)interdb(dot)jp
Subject:	Re: row filtering for logical replication
Date:	2018-12-27 19:19:25
Message-ID:	20181227191925.GJ2528@tamriel.snowman.net
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

Greetings,

* Petr Jelinek (petr(dot)jelinek(at)2ndquadrant(dot)com) wrote:
> On 14/12/2018 16:56, Stephen Frost wrote:
> > * Tomas Vondra (tomas(dot)vondra(at)2ndquadrant(dot)com) wrote:
> >> On 11/23/18 8:03 PM, Stephen Frost wrote:
> >>> * Fabrízio de Royes Mello (fabriziomello(at)gmail(dot)com) wrote:
> >>>> On Fri, Nov 23, 2018 at 4:13 PM Petr Jelinek <petr(dot)jelinek(at)2ndquadrant(dot)com>
> >>>> wrote:
> >>>>>> If carefully documented I see no problem with it... we already have an
> >>>>>> analogous problem with functional indexes.
> >>>>>
> >>>>> The difference is that with functional indexes you can recreate the
> >>>>> missing object and everything is okay again. With logical replication
> >>>>> recreating the object will not help.
> >>>>>
> >>>>
> >>>> In this case with logical replication you should rsync the object. That is
> >>>> the price of misunderstanding / bad use of the new feature.
> >>>>
> >>>> As usual, there are no free beer ;-)
> >>>
> >>> There's also certainly no shortage of other ways to break logical
> >>> replication, including ways that would also be hard to recover from
> >>> today other than doing a full resync.
> >>
> >> Sure, but that seems more like an argument against creating additional
> >> ones (and for preventing those that already exist). I'm not sure this
> >> particular feature is where we should draw the line, though.
> >
> > I was actually going in the other direction- we should allow it because
> > advanced users may know what they're doing better than we do and we
> > shouldn't prevent things just because they might be misused or
> > misunderstood by a user.
>
> That's all good, but we need good escape hatch for when things go south
> and we don't have it and IMHO it's not as easy to have one as you might
> think.

We don't have a great solution but we should be able to at least drop
and recreate the publication or subscription, even today, can't we?
Sure, that means having to recopy everything, but that's what you get if
you break your publication/subscription. If we allow the user to get to
a point where the system can't be fixed then I agree that's a serious
issue, but hopefully that isn't the case.

> >>> What that seems to indicate, to me at least, is that it'd be awful
> >>> nice to have a way to resync the data which doesn't necessairly
> >>> involve transferring all of it over again.
> >>>
> >>> Of course, it'd be nice if we could track those dependencies too,
> >>> but that's yet another thing.
> >>
> >> Yep, that seems like a good idea in general. Both here and for
> >> functional indexes (although I suppose sure is a technical reason why it
> >> wasn't implemented right away for them).
> >
> > We don't track function dependencies in general and I could certainly
> > see cases where you really wouldn't want to do so, at least not in the
> > same way that we track FKs or similar. I do wonder if maybe we didn't
> > track function dependencies because we didn't (yet) have create or
> > replace function and that now we should. We don't track dependencies
> > inside a function either though.
>
> Yeah we can't always have dependencies, it would break some perfectly
> valid usage scenarios. Also it's not exactly clear to me how we'd track
> dependencies of say plpython function...

Well, we could at leasts depend on the functions explicitly listed at
the top level and I don't believe we even do that today. I can't think
of any downside off-hand to that, given that we have create-or-replace
function.

> >>> In short, I'm not sure that I agree with the idea that we shouldn't
> >>> allow this and instead I'd rather we realize it and put the logical
> >>> replication into some kind of an error state that requires a resync.
> >>
> >> That would still mean a need to resync the data to recover, so I'm not
> >> sure it's really an improvement. And I suppose it'd require tracking the
> >> dependencies, because how else would you mark the subscription as
> >> requiring a resync? At which point we could decline the DROP without a
> >> CASCADE, just like we do elsewhere, no?
> >
> > I was actually thinking more along the lines of just simply marking the
> > publication/subscription as being in a 'failed' state when a failure
> > actually happens, and maybe even at that point basically throwing away
> > everything except the shell of the publication/subscription (so the user
> > can see that it failed and come in and properly drop it); I'm thinking
> > about this as perhaps similar to a transaction being aborted.
>
> There are several problems with that. First this happens in historic
> snapshot which can't write and on top of that we are in the middle of
> error processing so we have our hands tied a bit, it's definitely going
> to need bit of creative thinking to do this.

We can't write to things inside the database in a historic snapshot and
we do have to deal with the fact that we're in error processing. What
about writing somewhere that's outside of the regular database system?
Maybe a pg_logical/failed directory? There's all the usual
complications from that around dealing with durable writes (if we need
to worry about that and I'm not sure that we do... if we fail to
persist a write saying "X failed" and we restart.. well, it's gonna fail
again and we write it then), and cleaning things up as needed (but maybe
this is handled as part of the DROP, and we WAL that, so we can re-do
the removal of the failed marker file...), and if we need to think about
what should happen on replicas (is there anything?).

> Second, and that's more soft issue (which is probably harder to solve)
> what do we do with the slot and subscription. There is one failed
> publication, but the subscription may be subscribed to 20 of them, do we
> kill the whole subscription because of single failed publication? If we
> don't do we continue replicating like nothing has happened but with data
> in the failed publication missing (which can be considered data
> loss/corruption from the view of user). If we stop replication, do we
> clean the slot so that we don't keep back wal/catalog xmin forever
> (which could lead to server stopping) or do we keep the slot so that
> user can somehow fix the issue (reconfigure subscription to not care
> about that publication for example) and continue replication without
> further loss?

I would think we'd have to fail the whole publication if there's a
failure for any part of it. Replicating a partial set definitely sounds
wrong to me. Once we stop replication, yes, we should clean the slot
and mark it failed so that we don't keep back WAL and so that we allow
the catalog xmin to move forward so that the failed publication doesn't
run the server out of disk space.

If we really think there's a use-case for keeping the replication slot
and allowing it to cause WAL to spool on the server and keep the catalog
xmin back then I'd suggest we make this behavior configurable- so that
users can choose on a publication if they want a failure to be
considered a 'soft' fail or a 'hard' fail. A 'soft' fail would keep the
slot and keep the WAL and keep the catalog xmin, with the expectation
that the user will either drop the slot themselves or somehow fix it,
while a 'hard' fail would clean everything up except the skeleton of the
slot itself which the user would need to drop.

Thanks!

Stephen

In response to

Re: row filtering for logical replication at 2018-12-15 14:23:39 from Petr Jelinek

Responses

Re: row filtering for logical replication at 2018-12-27 23:36:25 from Petr Jelinek

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Tom Lane	2018-12-27 19:22:11	Re: reducing the footprint of ScanKeyword (was Re: Large writable variables)
Previous Message	Stephen Frost	2018-12-27 19:05:11	Re: row filtering for logical replication