Re: Proposed LogWriter Scheme, WAS: Potential Large Performance

From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: PostgreSQL-development <pgsql-hackers(at)postgreSQL(dot)org>
Cc: Curtis Faith <curtis(at)galtair(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: Proposed LogWriter Scheme, WAS: Potential Large Performance
Date: 2002-10-05 12:01:01
Message-ID: 200210051201.g95C12C19377@candle.pha.pa.us
Lists: pgsql-hackers

pgman wrote:
> Curtis Faith wrote:
> > Back-end servers would not issue fsync calls. They would simply block
> > waiting until the LogWriter had written their record to the disk, i.e.
> > until the sync'd block # was greater than the block that contained the
> > XLOG_XACT_COMMIT record. The LogWriter could wake up committed back-
> > ends after its log write returns.
> >
> > The log file would be opened O_DSYNC, O_APPEND every time. The LogWriter
> > would issue writes of the optimal size when enough data was present or
> > of smaller chunks if enough time had elapsed since the last write.
>
> So every backend is going to wait around until its fsync gets done by
> the backend process? How is that a win? This is just another version
> of our GUC parameters:
>
> #commit_delay = 0 # range 0-100000, in microseconds
> #commit_siblings = 5 # range 1-1000
>
> which attempt to delay fsync if other backends are nearing commit.
> Pushing things out to another process isn't a win; figuring out if
> someone else is coming for commit is. Remember, write() is fast, fsync
> is slow.

Let me add to what I just said:

While the above idea doesn't win for normal operation, because each
backend waits for the fsync, and we have no good way of determining if
other backends are nearing commit, a background WAL fsync process would
be nice if we wanted an option between fsync on (wait for fsync before
reporting commit), and fsync off (no crash recovery).
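
To make the commit_delay / commit_siblings heuristic quoted above concrete,
here is a rough Python sketch of the decision a committing backend makes.
The function names, the active_siblings argument, and the injectable
fsync/sleep callables are assumptions for illustration only; this is not
the actual backend code.

```python
import time


def should_delay_commit(active_siblings, commit_delay_us, commit_siblings):
    """Return True if the committing backend should sleep before its fsync.

    active_siblings is assumed to be the number of *other* backends that
    currently have a transaction open.
    """
    return commit_delay_us > 0 and active_siblings >= commit_siblings


def commit(active_siblings, commit_delay_us=0, commit_siblings=5,
           fsync=lambda: None, sleep=time.sleep):
    """Hypothetical commit path: optionally wait, then fsync the log."""
    if should_delay_commit(active_siblings, commit_delay_us, commit_siblings):
        # Give other backends a chance to write their commit records so
        # that a single fsync covers several transactions.
        sleep(commit_delay_us / 1_000_000)
    fsync()
```

The sleep is only a guess that a sibling will reach commit in time, which
is the weakness described above: the backend still waits for the fsync
either way.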

We could have a mode where we did an fsync every X milliseconds, so we
issue a COMMIT to the client, but wait a few milliseconds before
fsync'ing. Many other databases have such a mode, but we don't, and I
always felt it would be valuable. It may allow us to remove the fsync
option in favor of one that has _some_ crash recovery.
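
A minimal sketch of such a mode, in Python rather than backend C, might
look like the following. The class name DeferredSyncLog, its methods, and
the interval_ms parameter are all hypothetical; the sketch only shows the
shape of the idea (commit returns after write(), a background thread
fsyncs every X milliseconds) and is not a proposed implementation.

```python
import os
import threading


class DeferredSyncLog:
    """Hypothetical log where commit() returns before the data is fsync'ed."""

    def __init__(self, path, interval_ms):
        self._fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
        self._interval = interval_ms / 1000.0
        self._lock = threading.Lock()
        self._written = 0  # bytes handed to write()
        self._synced = 0   # bytes known to be covered by a completed fsync
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def commit(self, record):
        """Write the record and return at once; durability comes later."""
        with self._lock:
            view = memoryview(record)
            while len(view) > 0:
                n = os.write(self._fd, view)  # may be a partial write
                view = view[n:]
            self._written += len(record)
            return self._written

    def sync_now(self):
        """fsync the file and record how much of the log is now durable."""
        with self._lock:
            target = self._written
        os.fsync(self._fd)
        with self._lock:
            self._synced = max(self._synced, target)

    @property
    def synced_bytes(self):
        with self._lock:
            return self._synced

    def _run(self):
        # Wake up every interval until close() sets the stop event.
        while not self._stop.wait(self._interval):
            self.sync_now()

    def close(self):
        self._stop.set()
        self._thread.join()
        self.sync_now()  # final flush so nothing written is left unsynced
        os.close(self._fd)
```

With this shape, a crash can lose at most the commits reported during the
last interval, which is the "_some_ crash recovery" trade-off: weaker than
waiting for fsync, but far better than never syncing at all.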

--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073
