Re: Postgres crash? could not write to log file: No space left on device

From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: Greg Stark <stark(at)mit(dot)edu>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Jeff Davis <pgsql(at)j-davis(dot)com>, Yuri Levinsky <yuril(at)celltick(dot)com>, pgsql-bugs(at)postgresql(dot)org
Subject: Re: Postgres crash? could not write to log file: No space left on device
Date: 2013-06-26 13:04:00
Message-ID: 20130626130400.GB6660@awork2.anarazel.de
Lists: pgsql-bugs

On 2013-06-26 15:40:08 +0300, Heikki Linnakangas wrote:
> On 26.06.2013 15:21, Andres Freund wrote:
> >On 2013-06-26 13:14:37 +0100, Greg Stark wrote:
> >>On Wed, Jun 26, 2013 at 12:57 AM, Tom Lane<tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> >>> (Though if it is, it's not apparent why such
> >>>failures would only be manifesting on the pg_xlog files and not for
> >>>anything else.)
> >>
> >>Well data files are only ever written to in 8k chunks. Maybe these
> >>errors are only occurring on >8k xlog records such as records with
> >>multiple full page images. I'm not sure how much we write for other
> >>types of files but they won't be written to as frequently as xlog or
> >>data files and might not cause errors that are as noticeable.
> >
> >We only write xlog in XLOG_BLCKSZ units - which is 8kb by default as
> >well...
>
> Actually, XLogWrite() writes multiple pages at once. If all wal_buffers are
> dirty, it can try to write them all in one write() call.

Oh. Misremembered that.

> We've discussed retrying short writes before, and IIRC Tom has argued that
> it shouldn't be necessary when writing to disk. Nevertheless, I think we
> should retry in XLogWrite(). It can write much bigger chunks than most
> write() calls, so there's more room for a short write to happen there if it
> can happen at all. Secondly, it PANICs on failure, so it would be nice to
> try a bit harder to avoid that.

At the very least we should log the number of bytes actually written if
it was a short write, to make it possible to distinguish that case from a
direct ENOSPC response.
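
Roughly what I have in mind, as a standalone sketch rather than a patch.
write_all() is a hypothetical helper, not an existing PostgreSQL function,
and the message text and the choice to map a zero-byte write to ENOSPC are
assumptions for illustration. It retries short writes and, if it still
fails, reports how many bytes made it out:

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper, not an existing PostgreSQL function.
 * Calls write() until all len bytes are out or a real error occurs.
 * Returns 0 on success, -1 on failure (errno set); *written always
 * receives the number of bytes accepted by write() so far.
 */
int
write_all(int fd, const char *buf, size_t len, size_t *written)
{
	size_t		done = 0;
	int			save_errno = 0;

	while (done < len)
	{
		ssize_t		rc = write(fd, buf + done, len - done);

		if (rc < 0)
		{
			if (errno == EINTR)
				continue;		/* interrupted before any byte was transferred */
			save_errno = errno;
			break;
		}
		if (rc == 0)
		{
			/* assumption: treat "no progress" as out of disk space */
			save_errno = ENOSPC;
			break;
		}
		done += (size_t) rc;	/* short write: retry with the remainder */
	}

	*written = done;
	if (done < len)
	{
		/* report how far we got, so a short write is visible in the log */
		fprintf(stderr,
				"could not write to log file: wrote only %zu of %zu bytes: %s\n",
				done, len, strerror(save_errno));
		errno = save_errno;
		return -1;
	}
	return 0;
}
```

With a loop like this, a short write followed by a failing retry shows up
as "wrote only N of M bytes" with N > 0, while an immediate ENOSPC shows
N = 0.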

This might also be caused by the fact that until recently the SIGALRM
handler didn't set SA_RESTART... If a backend decided to write out the
xlog directly it very well might have an active alarm...
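
For reference, the difference is just the sa_flags value passed to
sigaction(). The sketch below is standalone and illustrative only;
install_alarm_handler() and on_alarm() are hypothetical names, not the
actual PostgreSQL signal setup code:

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

/* Hypothetical no-op handler; a real one would set a flag. */
static void
on_alarm(int signo)
{
	(void) signo;
}

/*
 * Hypothetical helper: install a SIGALRM handler, with or without
 * SA_RESTART. Returns whatever sigaction() returns (0 on success).
 */
int
install_alarm_handler(int restart)
{
	struct sigaction act;

	memset(&act, 0, sizeof(act));
	act.sa_handler = on_alarm;
	sigemptyset(&act.sa_mask);
	/* without SA_RESTART, an interrupted system call fails with EINTR */
	act.sa_flags = restart ? SA_RESTART : 0;

	return sigaction(SIGALRM, &act, NULL);
}
```

Note that SA_RESTART only covers a call interrupted before any data was
transferred. Per POSIX, a write() interrupted after writing part of the
buffer returns the partial count either way, so a retry loop would still
be needed for that case.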

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
