str replication failed, restart fixed it

From: Willy-Bas Loos <willybas(at)gmail(dot)com>
To: "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: str replication failed, restart fixed it
Date: 2014-02-26 09:53:56
Message-ID: CAHnozThWqsfALkwcqziuuvTP=F_7z3fZvR6EtO8589KWZWEDmw@mail.gmail.com
Lists: pgsql-general

Hi,

I had a problem today and I fixed it by restarting PostgreSQL.
That doesn't make sense to me. What could have been going on?

This is the log:
2014-02-26 04:30:45 CET db: ip: us: FATAL: could not send data to WAL stream: SSL error: sslv3 alert unexpected message

cp: cannot stat `/data/postgresql/9.1/main/wal_archive/000000010000006400000062': No such file or directory
2014-02-26 04:30:45 CET db: ip: us: LOG: unexpected pageaddr 64/3FBC6000 in log file 100, segment 98, offset 12345344
cp: cannot stat `/data/postgresql/9.1/main/wal_archive/000000010000006400000062': No such file or directory
2014-02-26 04:30:45 CET db: ip: us: LOG: streaming replication successfully connected to primary
2014-02-26 04:32:09 CET db: ip: us: LOG: startup process (PID 5385) was terminated by signal 7: Bus error
2014-02-26 04:32:09 CET db: ip: us: LOG: terminating any other active server processes

The cluster was "online" according to pg_lsclusters, but it was not
possible to connect to it:
psql: could not connect to server: No such file or directory
Is the server running locally and accepting
connections on Unix domain socket "/var/run/postgresql/.s.PGSQL.5432"?
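
The "No such file or directory" in that psql error suggests the socket file itself was absent, which is a different condition from a socket that exists but has nothing listening behind it. A minimal sketch of how the two cases could be told apart, assuming Python is available on the host; the function name probe_unix_socket is hypothetical, and only the socket path is taken from the error above:

```python
import os
import socket


def probe_unix_socket(path, timeout=2.0):
    """Classify a Unix domain socket path as 'missing', 'refused' or 'accepting'."""
    if not os.path.exists(path):
        # Same condition psql reports as "No such file or directory".
        return "missing"
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        sock.connect(path)
        return "accepting"
    except OSError:
        # The file exists, but the connection could not be established.
        return "refused"
    finally:
        sock.close()


if __name__ == "__main__":
    # Socket path as shown in the psql error message.
    print(probe_unix_socket("/var/run/postgresql/.s.PGSQL.5432"))
```

An "accepting" result only shows that some process is listening on the socket; it does not prove that the server would complete a login.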

uptime tells me this:
postgres@myserver:~$ uptime
10:47:27 up 89 days, 42 min, 1 user, load average: 0.00, 0.00, 0.00

This is PostgreSQL 9.1 on Ubuntu 12.04 on OpenVZ.

The weirdest thing is that restarting the PostgreSQL cluster fixed it.
Does this make any sense to you?

Cheers,

WBL
--
"Quality comes from focus and clarity of purpose" -- Mark Shuttleworth
