Re: Proposed LogWriter Scheme, WAS: Potential Large Performance

From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Curtis Faith <curtis(at)galtair(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Pgsql-Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Proposed LogWriter Scheme, WAS: Potential Large Performance
Date: 2002-10-05 18:26:43
Message-ID: 200210051826.g95IQh605752@candle.pha.pa.us
Lists: pgsql-hackers

Curtis Faith wrote:
> The advantage to aio_write in this scenario is when writes cross track
> boundaries or when the head is in the wrong spot. If we write
> in reasonable blocks with aio_write the write might get to the disk
> before the head passes the location for the write.
>
> Consider a scenario where:
>
> Head is at file offset 10,000.
>
> Log contains blocks 12,000 - 12,500
>
> ..time passes..
>
> Head is now at 12,050
>
> Commit occurs writing block 12,501
>
> In the aio_write case the write would already have been done for blocks
> 12,000 to 12,050 and would be queued up for some additional blocks up to
> potentially 12,500. So the write for the commit could occur without an
> additional rotation delay. We are talking 85 to 200 milliseconds
> delay for this rotation on a single disk. I don't know how often this
> happens in actual practice but it might occur as often as every other
> time.

So, you are saying that we may get back aio confirmation quicker than if
we issued our own write/fsync because the OS was able to slip our flush
to disk in as part of someone else's or a general fsync?

I don't buy that because it is possible our write() gets in as part of
someone else's fsync and our fsync becomes a no-op, meaning there aren't
any dirty buffers for that file. Isn't that also possible?
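
For reference, the traditional path being compared against could be sketched roughly as below. This is only an illustration, not PostgreSQL's actual WAL code; the helper name log_commit_sync and its arguments are made up for the example. If the kernel has already flushed the file's dirty buffers on behalf of some other fsync, the fsync() here has little or nothing left to do.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the PostgreSQL source): write one commit
 * record with plain write() and force it to stable storage with fsync().
 * Returns 0 on success, -1 on error.
 */
int log_commit_sync(int fd, const char *rec, size_t len)
{
    size_t done = 0;

    /* write() may accept fewer bytes than requested, so loop until done. */
    while (done < len) {
        ssize_t n = write(fd, rec + done, len - done);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted by a signal: retry */
            return -1;
        }
        done += (size_t) n;
    }

    /*
     * Returns once the kernel reports the file's data as flushed. If the
     * pages were already written out earlier, no dirty buffers remain
     * and this call should return quickly.
     */
    return fsync(fd);
}
```

A caller would open the log file with open() and call log_commit_sync(fd, record, length) at commit time.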

Also, remember the kernel doesn't know where the platter rotation is
either. Only the SCSI drive can reorder the requests to match this. The
OS can group based on head location, but it doesn't know much about the
platter location, and it doesn't even know where the head is.

Also, does aio return info when the data is in the kernel buffers or
when it is actually on the disk?

Simply, aio allows us to do the write and get notification when it is
complete. I don't see how that helps us, and I don't see any other
advantages to aio. To use aio, we need to find something that _can't_
be solved with more traditional Unix APIs, and I haven't seen that yet.
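
To make the "write and get notification" pattern concrete, here is a minimal sketch using the POSIX aio calls. The helper names aio_wait and log_commit_aio are hypothetical. It assumes, as I read POSIX, that completion of aio_write() on a file opened without O_SYNC or O_DSYNC only means the data has reached the kernel, the same as a returned write(), so a separate aio_fsync() is queued to get the data onto the disk. On some systems the aio functions need an extra library at link time (for example -lrt with older glibc).

```c
#include <aio.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: block until one aio request finishes and return
 * its result (what the equivalent synchronous call would have returned). */
static ssize_t aio_wait(struct aiocb *cb)
{
    const struct aiocb *const list[1] = { cb };
    int err;

    while ((err = aio_error(cb)) == EINPROGRESS) {
        /* Sleep until the request completes or a signal arrives. */
        if (aio_suspend(list, 1, NULL) != 0 && errno != EINTR)
            return -1;
    }
    if (err == -1)
        return -1;              /* aio_error() itself failed */

    ssize_t ret = aio_return(cb);   /* must be called once per request */
    if (err != 0) {
        errno = err;
        return -1;
    }
    return ret;
}

/* Hypothetical helper: queue a commit record at the given offset, then
 * queue a flush, waiting for each. Returns 0 on success, -1 on error. */
int log_commit_aio(int fd, const char *rec, size_t len, off_t offset)
{
    struct aiocb wr;
    memset(&wr, 0, sizeof wr);
    wr.aio_fildes = fd;
    wr.aio_buf = (void *) rec;
    wr.aio_nbytes = len;
    wr.aio_offset = offset;

    if (aio_write(&wr) != 0)
        return -1;
    /* The caller could do other work here before waiting. */
    if (aio_wait(&wr) != (ssize_t) len)
        return -1;

    /* Assumption: write completion alone does not imply durability,
     * so ask for a flush and wait for that too. */
    struct aiocb fs;
    memset(&fs, 0, sizeof fs);
    fs.aio_fildes = fd;
    if (aio_fsync(O_SYNC, &fs) != 0)
        return -1;
    return aio_wait(&fs) == 0 ? 0 : -1;
}
```

Written this way, the sequence ends up doing the same two steps as write() followed by fsync(); the only difference is that the caller is free to do something else between queuing and waiting.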

This aio thing is getting out of hand. It's like we have a hammer, and
everything looks like a nail, or a use for aio.

--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073
