From: | "Maeldron T(dot)" <maeldron(at)gmail(dot)com> |
---|---|
To: | Pg Bugs <pgsql-bugs(at)postgresql(dot)org> |
Subject: | FATAL: terminating walreceiver process due to administrator command |
Date: | 2019-02-01 14:32:29 |
Message-ID: | CAKatfSnQP4gwpGNPxT6Gg-HFL9T6yefYaiSGhx=j5mrgOGV1Rg@mail.gmail.com |
Lists: | pgsql-bugs |
Hello,
Today, I received email notifications from a server telling me the
replication was lagging.
The application monitors the delay. In the past, I received this
notification a few times during high load, but never from this
server.
However, the replication was not lagging. It stopped.
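For readers with similar monitoring: a delay-based check cannot tell "lagging" from "stopped" on its own. The following is a minimal sketch, not the application's actual code, of how a monitor might separate the two. It assumes the caller has already read the `status` column of `pg_stat_wal_receiver` on the standby (passing `None` when the view returns no row) and the result of `pg_last_xact_replay_timestamp()`. The function name, the return labels, and the 30-second threshold are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def classify_standby(
    walreceiver_status: Optional[str],
    last_replay: Optional[datetime],
    now: datetime,
    max_lag: timedelta = timedelta(seconds=30),  # hypothetical threshold
) -> str:
    """Return 'stopped', 'lagging', or 'ok' for a streaming standby."""
    # None means pg_stat_wal_receiver returned no row (no walreceiver
    # process); any state other than 'streaming' is treated the same way.
    if walreceiver_status != "streaming":
        return "stopped"
    # The receiver is streaming, so an old replay timestamp is real lag
    # (or simply an idle master, which this sketch cannot distinguish).
    if last_replay is not None and now - last_replay > max_lag:
        return "lagging"
    return "ok"
```

With this split, a standby whose walreceiver has exited would be reported as stopped rather than as lagging, regardless of how old the last replayed transaction is.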
There was a single line in the log of the standby server:
FATAL: terminating walreceiver process due to administrator command
I did not find anything related in the master’s log. The rest of the log
was all about the slow statements.
As far as I can tell, this has never happened before. The standby server was
running but the replication had stopped. I restarted the server 42
minutes after that line appeared in the log. The replication caught up in 3-5 seconds.
Only I have access to the servers.
I did not stop the replication process. I don’t even know how to do it.
There is no cron task that would do such a thing. Only one application accesses
the database. I wrote it, so I know it didn’t do it either.
I have been running the servers for years with more or less the same
configuration.
As far as I can see, wherever the same line appears in earlier logs, the database
was being shut down as well. This is the only occurrence that stands alone.
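The manual log comparison described above could be automated roughly as follows. This is a sketch under stated assumptions: the log is available as a list of lines, a shutdown is recognised by the substrings listed in `SHUTDOWN_MARKERS`, and "nearby" means within a hypothetical window of 20 lines. The function name and window size are illustrative only.

```python
from typing import List

TERMINATED = "terminating walreceiver process due to administrator command"
# Substrings assumed to indicate a server shutdown in the surrounding log.
SHUTDOWN_MARKERS = (
    "shutdown request",
    "shutting down",
    "database system is shut down",
)


def isolated_terminations(lines: List[str], window: int = 20) -> List[int]:
    """Return indexes of walreceiver terminations with no shutdown message nearby."""
    hits = []
    for i, line in enumerate(lines):
        if TERMINATED not in line:
            continue
        # Look at up to `window` lines before and after the termination.
        nearby = lines[max(0, i - window): i + window + 1]
        if not any(marker in other for other in nearby for marker in SHUTDOWN_MARKERS):
            hits.append(i)
    return hits
```

Any index it returns points at a termination that was not part of an ordinary shutdown, which is the case reported here.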
Recent changes on the servers:
* On 11 January, I upgraded from 10.5 to 10.6_2
* A few days ago, I set up a new server that replicates one table from the
same master. This is a huge table but it’s rarely written. The replication
works. That was also when I started using logical replication. The server where
the replication stopped uses asynchronous streaming replication.
* When I set up the logical replication, I increased the wal_sender_timeout.
I found nothing related in the logs (/var/log/messages, /var/log/all.log,
dmesg). This slave is probably the least loaded server of the group.
FreeBSD xxx 11.2-RELEASE-p8 FreeBSD 11.2-RELEASE-p8 #0: Tue Jan 8 21:35:12
UTC 2019 root(at)amd64-builder(dot)daemonology(dot)net:/usr/obj/usr/src/sys/GENERIC
amd64
/boot/loader.conf:
# PostgreSQL
kern.ipc.semmni=256
kern.ipc.semmns=512
kern.ipc.semmnu=256
Everything else is either FreeBSD default or unrelated.
There is a lot of free memory. I don’t mean usable but free. 3 GB of RAM has
not even been touched since the last boot.
M.