Proposal: Implement a new option for COMMIT, for enhancing performance,
providing a MySQL-like trade-off between performance and robustness for
*only* those that want it.
This form of COMMIT will *not* perform XLogFlush(), but will rely on a
special background process to perform regular WAL fsyncs (see later).
COMMIT NOWAIT can co-exist with the normal form of COMMIT and does not
threaten the consistency or robustness of other COMMIT modes. Read that
again and think about it, before we go further, please. Normal COMMIT
still guarantees to flush all of WAL up to the point of the commit,
whether or not the previous commits have requested that.
Mixing COMMIT NOWAIT with other modes does not effect the performance of
other backends - those that specify that mode are faster, those that do
not simply go at the same speed they did before. This is important,
because it allows us to have a fully robust server, yet with certain
critical applications going along much faster. No need for an
all-or-nothing approach at db cluster level.
Unlike fsync = off, WAL is always consistent and the server can be
recovered easily, though with some potential for data loss for
transactions that chose the COMMIT NOWAIT option. Sounds like a hole
there: normal COMMITs that rely on data written by COMMIT NOWAIT
transactions are still safe, because the normal COMMIT is still bound by
the guarantee to go to disk. The buffer manager/WAL interlock is not
effected by this change and remains in place, as it should.
This implements the TODO item:
--Allow buffered WAL writes and fsync
"Instead of guaranteeing recovery of all committed transactions, this
would provide improved performance by delaying WAL writes and fsync so
an abrupt operating system restart might lose a few seconds of committed
transactions but still be consistent. We could perhaps remove the
'fsync' parameter (which results in an an inconsistent database) in
favor of this capability."
Why do we want this?? Because some apps have *lots* of data and many
really don't care whether they lose a few records. Honestly, I've met
people that want this, even after 2 hours of discussion and
understanding. Plus probably lots of MySQLers also.
New commit mode is available by explicit command, or as a default
setting that will be applied to all COMMITs, or both.
The full syntax would be COMMIT [WRITE] NOWAIT [IMMEDIATE], for Oracle
compatibility (why choose incompatibility?). Note that this is not a
transaction start setting like Isolation Level; this happens at end of
transaction. The syntax for END is unchanged, defaulting to normal
behaviour unless overridden.
New userset GUC, commit_wait_default = on (default) | off
We change the meaning of the commit_delay parameter:
- If commit_delay = 0 then commit_wait_default cannot be set off.
- WAL will be flushed every commit_delay milliseconds; if no flush is
required this will do nothing very quickly, so there is little overhead
of no COMMIT NOWAIT commits have been made.
COMMIT NOWAIT in xact.c simply ignores XLogFlush and returns.
Who does the XLogFlush? Well, my recommendation is a totally new
process, WALWriter. But I can see that many of you will say bgwriter
should be the person to do this work. IMHO doing WAL flushes will take
time and thats time that bgwriter really needs to do other things, plus
it can't really guarantee to do flush regularly when its doing
When commit_delay > 0 then the WALwriter will startup, or shutdown if
commit_delay = 0.
WALWriter will XLogFlush every commit_delay milliseconds.
A prototype patch is posted to -patches, which is WORK IN PROGRESS.
The following TODO items remain
1. discuss which process will issue regular XLogFlush(). If agreed,
implement WALWriter process to perform this task. (Yes, the patch isn't
fully implemented, yet).
2. remove fsync parameter
3. Prevent COMMIT NOWAIT when commit_delay = 0
4. Discuss whether commit_delay is OK to usurp; twas just an earlier
suggestion from someone else, can go either way.
The remaining items can be completed very quickly if this proposal is
acceptable. (I wrote this over Christmas, so it turning up now isn't a
rushed proposal and I'm pretty certain it ain't broke).
pgsql-hackers by date
|Next:||From: Andrew Dunstan||Date: 2007-02-26 23:02:58|
|Subject: Re: Seeking Google SoC Mentors|
|Previous:||From: Simon Riggs||Date: 2007-02-26 22:56:45|
|Subject: COMMIT NOWAIT Performance Option (patch)|