Re: BUG #14830: Missed NOTIFications, PostgreSQL 9.1.24

From: Marko Tiikkaja <marko(at)joh(dot)to>
To: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
Cc: PostgreSQL mailing lists <pgsql-bugs(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: BUG #14830: Missed NOTIFications, PostgreSQL 9.1.24
Date: 2017-10-09 14:52:16
Message-ID: CAL9smLAPgt9vy_dstsdz-LBpa0PgKs8aiucVkMUV0u=SM6=fJA@mail.gmail.com
Lists: pgsql-bugs

On Tue, Oct 3, 2017 at 5:00 AM, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
wrote:

> On Wed, Sep 27, 2017 at 3:29 AM, <marko(at)joh(dot)to> wrote:
> > So.. any ideas? Unfortunately I can't reproduce this in an isolated
> > environment, and in production this seems to be taking some time before
> > it builds up into a proper issue.
>
> Hm. Could it be some side-effect from 2bbe8a68? This has been
> backpatched on all branches and it is part of 9.1.24.

Unsure. I can at least reproduce this with only one session ever listening
on anything.

I've produced a test case[1] for this which roughly matches what the code
does in production. It isn't the prettiest code out there, but basically
it has one session LISTENing on a channel and 24 sessions each sending
messages with their own prefix, numbered in order, so for example:

session 1 sends A_1, A_2, A_3, etc.
session 2 sends B_1, B_2, B_3, ...

and the listener has a map recording what the last received number is for
each prefix, checking that all notifications are received and in the right
order.
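
The listener-side check described above could be sketched roughly like this.
It is only an illustration in Python, not the code from the test case at [1],
which may be structured differently; the names check_notification and
last_seen are hypothetical, and the database side (LISTEN and receiving the
payloads) is left out entirely.

```python
def check_notification(last_seen, payload):
    """Check one notification payload such as "Q_97882353".

    last_seen is a dict mapping prefix -> last number received for it
    (hypothetical name, standing in for the listener's map).
    Returns None if the payload is the expected next number for its
    prefix, otherwise a message describing the gap.
    """
    prefix, _, number = payload.partition("_")
    n = int(number)
    prev = last_seen.get(prefix)
    # Always record the newest number so later checks compare against it.
    last_seen[prefix] = n
    if prev is not None and n != prev + 1:
        return (f"out of order notification {payload}: "
                f"{n} != {prev} + 1 (prefix {prefix})")
    return None


if __name__ == "__main__":
    last_seen = {}
    for payload in ["A_1", "B_1", "A_2", "A_5", "B_2"]:
        problem = check_notification(last_seen, payload)
        if problem:
            print(problem)
```

With the sample payloads above, only A_5 is reported, since A_3 and A_4
never arrived; the first message seen for a prefix is accepted as-is.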

After running it for a few days I start getting logged messages such as:

out of order notification Q_97882353: 97882353 != 97882349 + 1 (prefix Q)
out of order notification F_97947433: 97947433 != 97947429 + 1 (prefix F)
out of order notification F_97947439: 97947439 != 97947436 + 1 (prefix F)

I ran it on both 9.1.24 and 9.6.5, and both exhibit the same behavior:
it takes days to get into this state, but then notifications are missed all
the time. I currently have both systems in this state, so any idea what to
look at to try and debug this further?

.m

[1]: https://github.com/johto/notify-test
