Jan Wieck <JanWieck(at)Yahoo(dot)com> writes:
> So far nobody has bothered to make any other proposal for how to cause
> the kernel to actually do some writing at all. A lot of people babble about
> fsync(), fdatasync() and fadvise and whatnot. A week ago I posted the
> proposal for this and got exactly zero response.
As I've said before, I think we need to find a way to stop using sync()
altogether --- we have to move to fsync or O_SYNC and variants. sync
has simply got the wrong API.
Let me give an example: you write a bunch of stuff and then call sync().
Suppose the kernel is unable to write some of those blocks --- it gets
a hard I/O error, or doesn't realize it's out of disk space until the
write is attempted, or whatever. (I think this is what happened to
Chris K-L last night.) Is the sync call going to tell you about the
problem? No, it is not. If you are lucky you will get an error return
from the next operation you try on a file descriptor associated with the
failed blocks. But by that time you've probably already written a
checkpoint record to WAL claiming that those writes were all done
successfully. Finding out about the failures after the checkpoint is
completed is too late --- you're screwed, especially if a crash happens
before you can do anything about it.
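To make the difference concrete, here is a minimal sketch in C of the fsync-based ordering, assuming a POSIX system. The helper names write_and_flush() and checkpoint_if_durable() are hypothetical and are not taken from the PostgreSQL source. The point is only that fsync() has a return value to check before anything claims the writes are done, whereas sync() returns nothing.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: write a buffer to a file and force it to disk.
 * Returns 0 only if every byte was written and fsync() reported success;
 * returns -1 (with errno set) otherwise.
 */
int
write_and_flush(const char *path, const char *buf, size_t len)
{
    int     fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    size_t  done = 0;

    if (fd < 0)
        return -1;

    while (done < len)
    {
        ssize_t n = write(fd, buf + done, len - done);

        if (n < 0)
        {
            int saved = errno;

            if (saved == EINTR)
                continue;       /* interrupted before writing; retry */
            close(fd);
            errno = saved;
            return -1;
        }
        done += (size_t) n;
    }

    /* Unlike sync(), fsync() can report an I/O error or out-of-space. */
    if (fsync(fd) != 0)
    {
        int saved = errno;

        close(fd);
        errno = saved;
        return -1;
    }
    return close(fd);
}

/*
 * Hypothetical checkpoint step: only claim success after the flush
 * has been confirmed.
 */
int
checkpoint_if_durable(const char *path, const char *buf, size_t len)
{
    if (write_and_flush(path, buf, len) != 0)
    {
        fprintf(stderr, "flush failed, not writing checkpoint: %s\n",
                strerror(errno));
        return -1;
    }
    /* A real system would write its checkpoint record here. */
    return 0;
}
```

With sync() in place of the fsync() call there would be nothing to test, so the checkpoint step would run regardless of whether the blocks reached disk.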
> The whole point of the bgwriter is to give response times a better
> variance; I never claimed that it will improve performance.
I want to use it to improve reliability, by getting rid of our
dependence on sync(). The bgwriter can afford to wait for writes
to occur, so it should be able to use fsync or even O_SYNC.
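For the O_SYNC variant, a rough sketch along the same lines, again assuming POSIX and using a hypothetical helper name, write_synchronously(). With O_SYNC each write() does not return until the data has been handed to stable storage, so a failure shows up as an error from write() itself. The price is that the caller waits, which is exactly what a background process can afford.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: write a buffer through a descriptor opened with
 * O_SYNC. Returns 0 on success, -1 (with errno set) on any failure.
 */
int
write_synchronously(const char *path, const char *buf, size_t len)
{
    int     fd = open(path, O_WRONLY | O_CREAT | O_TRUNC | O_SYNC, 0600);
    size_t  done = 0;

    if (fd < 0)
        return -1;

    while (done < len)
    {
        /* Blocks until this chunk is on stable storage, or fails. */
        ssize_t n = write(fd, buf + done, len - done);

        if (n < 0)
        {
            int saved = errno;

            if (saved == EINTR)
                continue;       /* interrupted before writing; retry */
            close(fd);
            errno = saved;
            return -1;
        }
        done += (size_t) n;
    }
    return close(fd);
}
```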
regards, tom lane