Re: Proposed LogWriter Scheme, WAS: Potential Large

From: Hannu Krosing <hannu(at)tm(dot)ee>
To: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
Cc: Curtis Faith <curtis(at)galtair(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Pgsql-Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Proposed LogWriter Scheme, WAS: Potential Large
Date: 2002-10-05 13:44:12
Message-ID: 1033825452.9687.16.camel@taru.tm.ee
Lists: pgsql-hackers

Bruce Momjian wrote on Sat, 05.10.2002 at 13:49:
> Curtis Faith wrote:
> > Back-end servers would not issue fsync calls. They would simply block
> > waiting until the LogWriter had written their record to the disk, i.e.
> > until the sync'd block # was greater than the block that contained the
> > XLOG_XACT_COMMIT record. The LogWriter could wake up committed back-
> > ends after its log write returns.
> >
> > The log file would be opened O_DSYNC, O_APPEND every time. The LogWriter
> > would issue writes of the optimal size when enough data was present or
> > of smaller chunks if enough time had elapsed since the last write.
>
> So every backend is going to wait around until its fsync gets done by
> the backend process? How is that a win? This is just another version
> of our GUC parameters:
>
> #commit_delay = 0 # range 0-100000, in microseconds
> #commit_siblings = 5 # range 1-1000
>
> which attempt to delay fsync if other backends are nearing commit.
> Pushing things out to another process isn't a win; figuring out if
> someone else is coming for commit is.

Exactly. If I understand correctly what Curtis is proposing, you don't
have to figure it out under his scheme - you just issue a WALWait
command and the WAL writing process notifies you when your transaction's
WAL is in safe storage.
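
A minimal sketch of that wait/notify handshake, assuming threads and a
pthread condition variable stand in for backends and whatever
inter-process signalling a real implementation would use. The names
WalFlushState, wal_wait and wal_mark_flushed are hypothetical and not
existing PostgreSQL functions; positions are plain byte offsets in the log.

```c
#include <pthread.h>
#include <stdint.h>

/* Hypothetical shared state: highest WAL position known to be on disk. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  flushed_cond;
    uint64_t        flushed_upto;
} WalFlushState;

void wal_state_init(WalFlushState *s)
{
    pthread_mutex_init(&s->lock, NULL);
    pthread_cond_init(&s->flushed_cond, NULL);
    s->flushed_upto = 0;
}

/* Committer side ("WALWait"): block until the log is safe up to commit_pos. */
void wal_wait(WalFlushState *s, uint64_t commit_pos)
{
    pthread_mutex_lock(&s->lock);
    while (s->flushed_upto < commit_pos)
        pthread_cond_wait(&s->flushed_cond, &s->lock);
    pthread_mutex_unlock(&s->lock);
}

/* Writer side: call after a synchronous log write has returned. */
void wal_mark_flushed(WalFlushState *s, uint64_t new_pos)
{
    pthread_mutex_lock(&s->lock);
    if (new_pos > s->flushed_upto)
        s->flushed_upto = new_pos;
    /* Wake every waiter; each one rechecks its own commit position. */
    pthread_cond_broadcast(&s->flushed_cond);
    pthread_mutex_unlock(&s->lock);
}
```

A committer only compares its own position against flushed_upto, so it
never has to guess whether anyone else is about to commit.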

If the other committer was able to get his WALWait in before the actual
write took place, it will be notified too; if not, it will be notified
about 1/166th sec. later (for a 10K rpm disk), when its write is done on
the next rev of the disk platters.
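
The 1/166th sec. figure is just the time for one platter revolution. A
small helper (the name revolution_ms is made up for illustration) shows
the arithmetic: 10,000 rpm is about 166.7 revolutions per second, so one
revolution takes 60000 / 10000 = 6 ms.

```c
/* Time for one full platter revolution, in milliseconds. */
double revolution_ms(double rpm)
{
    return 60000.0 / rpm;   /* 60,000 ms per minute / revolutions per minute */
}
```

This ignores seek time and assumes the next log write can land on the
very next revolution.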

The writer process should just issue a continuous stream of
aio_write()'s while there are any waiters and keep track of which waiters
are safe to continue - thus no guessing of who's gonna commit.
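
A rough sketch of one round of such a writer using POSIX AIO, under
stated assumptions: the function name wal_write_stream and the
MAX_INFLIGHT limit are hypothetical, and durability still depends on the
file being opened with O_DSYNC (as in Curtis's proposal) or on a separate
sync call. It queues several aio_write() requests back to back, then
reaps them in order and reports how far the log has been written
contiguously, which is the position waiters would be compared against.

```c
#include <aio.h>
#include <errno.h>
#include <string.h>
#include <sys/types.h>

#define MAX_INFLIGHT 8   /* hypothetical cap on queued writes per round */

/* Queue up to MAX_INFLIGHT chunk writes, then wait for them in order.
 * Returns the file offset up to which data was written contiguously,
 * or -1 if chunk is zero. Callers repeat from the returned offset. */
off_t wal_write_stream(int fd, const char *data, size_t len,
                       size_t chunk, off_t start)
{
    struct aiocb cbs[MAX_INFLIGHT];
    int n = 0;
    size_t queued = 0;

    if (chunk == 0)
        return -1;

    /* Submit writes without waiting for earlier ones to finish. */
    while (queued < len && n < MAX_INFLIGHT) {
        size_t this_len = (len - queued < chunk) ? len - queued : chunk;

        memset(&cbs[n], 0, sizeof cbs[n]);
        cbs[n].aio_fildes = fd;
        cbs[n].aio_buf = (void *) (data + queued);
        cbs[n].aio_nbytes = this_len;
        cbs[n].aio_offset = start + (off_t) queued;
        if (aio_write(&cbs[n]) != 0)
            break;
        queued += this_len;
        n++;
    }

    /* Reap every request; the safe position advances only while the
     * completed writes are full and contiguous. */
    off_t done_upto = start;
    int contiguous = 1;
    for (int i = 0; i < n; i++) {
        const struct aiocb *wait_list[1] = { &cbs[i] };

        while (aio_error(&cbs[i]) == EINPROGRESS)
            aio_suspend(wait_list, 1, NULL);

        ssize_t written = aio_return(&cbs[i]);
        if (contiguous && written == (ssize_t) cbs[i].aio_nbytes)
            done_upto += written;
        else
            contiguous = 0;
    }
    return done_upto;
}
```

On older glibc versions this needs to be linked with -lrt.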

If supported by the platform, this should use zero-copy writes - it should
be safe because WAL is append-only.

-----------
Hannu
