Re: [HACKERS] [BUGS] Bug in Physical Replication Slots (at least 9.5)?

From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
Cc: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, Andres Freund <andres(at)anarazel(dot)de>, nag1010(at)gmail(dot)com, jdnelson(at)dyn(dot)com, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] [BUGS] Bug in Physical Replication Slots (at least 9.5)?
Date: 2018-01-17 12:38:32
Message-ID: 20180117123832.GB2416@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs pgsql-hackers

Michael, Andres,

* Michael Paquier (michael(dot)paquier(at)gmail(dot)com) wrote:
> On Fri, Oct 27, 2017 at 3:13 AM, Michael Paquier
> <michael(dot)paquier(at)gmail(dot)com> wrote:
> > On Thu, Oct 26, 2017 at 3:05 AM, Kyotaro HORIGUCHI
> > <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp> wrote:
> >> The largest obstacle to do that is that walreceiver is not
> >> utterly concerned to record internals. In other words, it doesn't
> >> know what a record is. Teaching that introduces much complexity
> >> and the complexity slows down the walreceiver.
> >>
> >> Addition to that, this "problem" occurs also on replication
> >> without a slot. The latest patch also help the case.
> >
> > That's why replication slots have been introduced to begin with. The
> > WAL receiver gives no guarantee that a segment will be retained or not
> > based on the beginning of a record. That's sad that the WAL receiver
> > does not track properly restart LSN and instead just uses the flush
> > LSN. I am beginning to think that a new message type used to report
> > the restart LSN when a replication slot is in use would just be a
> > better and more stable solution. I haven't looked at the
> > implementation details yet though.
>
> Yeah, I am still on track with this idea. Splitting the flush LSN and
> the restart LSN properly can allow for better handling on clients. For
> now I am moving this patch to the next CF.

Thanks for looking at this patch previously. This patch is still in
Needs Review but it's not clear if that's correct or not, but this seems
to be a bug-fix, so it would be great if we could make some progress on
it (particularly since it was reported over a year ago and goes back to
9.4, according to the thread, from what I can tell).

Any chance one of you can provide some further thoughts on it or make it
clear if we are waiting for a new patch or if the existing one is still
reasonable and just needs to be reviewed (in which case, perhaps one of
you could help with that..)?

Thanks again!

Stephen

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message PG Bug reporting form 2018-01-17 14:26:22 BUG #15013: JNI-JDBC: org.postgresql.util.PSQLException: FEHLER: ungültiges Message-Format
Previous Message Aleksandr Parfenov 2018-01-16 07:52:26 Re: PostgreSQL crashes with SIGSEGV

Browse pgsql-hackers by date

  From Date Subject
Next Message Stephen Frost 2018-01-17 13:00:21 Re: [HACKERS] Another oddity in handling of WCO constraints in postgres_fdw
Previous Message Geoff Winkless 2018-01-17 11:50:54 Re: proposal: alternative psql commands quit and exit