Re: Disaster!

From:	Christoph Haller <ch(at)rodos(dot)fzk(dot)de>
To:	pgman(at)candle(dot)pha(dot)pa(dot)us (Bruce Momjian)
Cc:	pgsql-hackers(at)postgresql(dot)org
Subject:	Re: Disaster!
Date:	2004-01-29 16:13:08
Message-ID:	200401291513.QAA11098@rodos
Lists:	pgsql-hackers

>
> Tom Lane wrote:
> > I said:
> > > If there wasn't disk space enough to hold the clog page, the checkpoint
> > > attempt should have failed. So it may be that allowing a short read in
> > > slru.c would be patching the symptom of a bug that is really elsewhere.
> >
> > After more staring at the code, I have a theory. SlruPhysicalWritePage
> > and SlruPhysicalReadPage are coded on the assumption that close() can
> > never return any interesting failure. However, it now occurs to me that
> > there are some filesystem implementations wherein ENOSPC could be
> > returned at close() rather than the preceding write(). (For instance,
> > the HPUX man page for close() states that this never happens on local
> > filesystems but can happen on NFS.) So it'd be possible for
> > SlruPhysicalWritePage to think it had successfully written a page when
> > it hadn't. This would allow a checkpoint to complete :-(
> >
> > Chris, what's your platform exactly, and what kind of filesystem are
> > you storing pg_clog on?
>
> We already have a TODO on fclose():
>
> * Add checks for fclose() failure
>
Tom was referring to close(), not fclose().
I once had an awful time hunting down a memory leak that turned out
to be a typo: close() had been called where fclose() was intended.
So adding checks for both is probably a good idea.
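
A minimal sketch of what checking both calls could look like, assuming a POSIX system. The helper names write_file_checked and write_stream_checked are hypothetical and are not taken from slru.c; treating a zero-byte write as ENOSPC is also an assumption made for this illustration.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper (not PostgreSQL code): write len bytes to path and
 * report a failure from any step, including close().
 * Returns 0 on success, -1 on failure with errno set.
 */
int
write_file_checked(const char *path, const void *buf, size_t len)
{
    int         fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    const char *p = buf;
    size_t      left = len;

    if (fd < 0)
        return -1;

    while (left > 0)
    {
        ssize_t n = write(fd, p, left);

        if (n < 0 && errno == EINTR)
            continue;           /* interrupted before writing; retry */
        if (n <= 0)
        {
            /* Assumption: a zero-byte write means the disk is full. */
            int save_errno = (n == 0) ? ENOSPC : errno;

            close(fd);          /* already failing; keep the first error */
            errno = save_errno;
            return -1;
        }
        p += n;
        left -= (size_t) n;
    }

    /* Some filesystems (NFS, for instance) may report ENOSPC only here. */
    if (close(fd) != 0)
        return -1;

    return 0;
}

/*
 * stdio variant: fclose() flushes buffered data, so it can fail even
 * when every earlier call on the stream appeared to succeed.
 */
int
write_stream_checked(const char *path, const char *text)
{
    FILE *fp = fopen(path, "w");

    if (fp == NULL)
        return -1;

    if (fputs(text, fp) == EOF)
    {
        int save_errno = errno;

        fclose(fp);
        errno = save_errno;
        return -1;
    }

    if (fclose(fp) != 0)
        return -1;

    return 0;
}
```

A caller would treat a -1 from either helper as "the data may not be on disk", whichever step reported the error.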

Regards, Christoph

In response to

Re: Disaster! at 2004-01-26 18:04:12 from Bruce Momjian

Responses

Re: Disaster! at 2004-01-29 16:19:48 from Tom Lane
