
Re: Next steps in debugging database storage problems?

From:	Jacob Bunk Nielsen <jacob(at)bunk(dot)cc>
To:	Jacob Bunk Nielsen <jacob(at)bunk(dot)cc>
Cc:	pgsql-general(at)postgresql(dot)org
Subject:	Re: Next steps in debugging database storage problems?
Date:	2014-08-15 07:23:23
Message-ID:	spamdrop+87sikyb4ic.fsf@atom.bunk.cc
Lists:	pgsql-general

On the 1st of July 2014 Jacob Bunk Nielsen <jacob(at)bunk(dot)cc> wrote:

> We have a PostgreSQL 9.3.4 running in an LXC container on Debian
> Wheezy on a Linux 3.10.43 kernel on a Dell R620 server. Data are
> stored on a XFS file system. We are seeing problems such as:
>
> unexpected data beyond EOF in block 2 of relation base/805208133/1238511128
>
> and
>
> could not read block 5 in file "base/805208348/1259338118": read only
> 0 of 8192 bytes
>
> This seems to occur every few days after the server has been up for
> 30-40 days. If we reboot the server it'll be another 30-40 days before
> we see any problems again.
>
> The server has been running fine on a Dell R710 for a long time, and was
> upgraded to a Dell R620 last year, when the problems started. We have
> tried switching to a different Dell R620, but that did not make a
> difference. We've seen this with kernels 3.2, 3.4 and 3.10.

This time it took 45 days before this happened:

LOG: unexpected EOF on standby connection
ERROR: unexpected data beyond EOF in block 140 of relation base/805208885/805209852
HINT: This has been seen to occur with buggy kernels; consider updating your system.

It always happens with small tables with lots of inserts and deletes.
From previous experience we know that it's now going to happen again in
a few days, so we'll probably try to schedule a reboot to give us
another 30-40 days.
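
For anyone comparing notes, the error line can be broken into the pieces needed to find the affected file. This is only a minimal sketch: the helper name parse_eof_error is made up for illustration, and it assumes the default 8192-byte block size, which matches the "0 of 8192 bytes" message quoted above.

```python
import re

# Assumption: default PostgreSQL block size, consistent with the
# "read only 0 of 8192 bytes" message quoted earlier in this thread.
BLOCK_SIZE = 8192

# Matches the "unexpected data beyond EOF" error and captures the block
# number plus the two path components after "base/".
_PATTERN = re.compile(
    r"unexpected data beyond EOF in block (?P<block>\d+) of relation "
    r"base/(?P<db_oid>\d+)/(?P<relfilenode>\d+)"
)


def parse_eof_error(line):
    """Hypothetical helper: return the parsed fields, or None if no match."""
    match = _PATTERN.search(line)
    if match is None:
        return None
    block = int(match.group("block"))
    return {
        "db_oid": int(match.group("db_oid")),            # database directory
        "relfilenode": int(match.group("relfilenode")),  # file name on disk
        "block": block,
        "byte_offset": block * BLOCK_SIZE,               # where the block starts
    }


info = parse_eof_error(
    "ERROR: unexpected data beyond EOF in block 140 of relation "
    "base/805208885/805209852"
)
print(info)
```

The relfilenode can then be looked up in pg_class (column relfilenode) while connected to the database whose OID in pg_database matches db_oid, which should confirm which table is affected.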

Is anyone else seeing problems with PostgreSQL on XFS filesystems?

Any hints on how to debug what goes wrong here would still be greatly
appreciated.

> We have multiple other PostgreSQL servers running in a similar setup
> without causing any problems, but this server is probably the busiest of
> our PostgreSQL servers.

This is still the case.

Best regards

Jacob

In response to

Next steps in debugging database storage problems? at 2014-07-01 13:35:19 from Jacob Bunk Nielsen

Responses

Re: Next steps in debugging database storage problems? at 2014-08-15 17:00:51 from Terry Schmitt
Re: Next steps in debugging database storage problems? at 2014-12-11 08:31:12 from Jacob Bunk Nielsen
