Re: O_DIRECT support for Windows

From: "Takayuki Tsunakawa" <tsunakawa(dot)takay(at)jp(dot)fujitsu(dot)com>
To: "Magnus Hagander" <magnus(at)hagander(dot)net>, "ITAGAKI Takahiro" <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>
Cc: <pgsql-patches(at)postgresql(dot)org>
Subject: Re: O_DIRECT support for Windows
Date: 2007-01-16 01:59:11
Message-ID: 014201c73911$eaedd1b0$19527c0a@OPERAO
Lists: pgsql-hackers pgsql-patches

From: "Magnus Hagander" <magnus(at)hagander(dot)net>
> ITAGAKI Takahiro wrote:
>> Do you mean there are drives that have larger sector size than 8kB?
>> We've already put the xlog buffer along the alignment of
>> ALIGNOF_XLOG_BUFFER (typically 8192 bytes).
>> But if there are such drives, using FILE_FLAG_NO_BUFFERING is
>> harmful!
>
> Yes. I have heard this can happen with certain SAN drives. I haven't
> seen it myself, and I can't seem to find a reference right now :-)
> But I do recall having read about the need to check the sector size
> and specifically align it, because some do have that problem.

I think many people can benefit from Itagaki-san's proposal, and
NO_BUFFERING should be the default. Isn't it very rare for disks with
a sector size larger than 8KB to be used? Providing a way (such as
wal_sync_method) to avoid NO_BUFFERING is sufficient for people in
such rare environments. Or, by determining the sector size with
GetDiskFreeSpace() (the variant that reports bytes per sector), we
could automatically switch off NO_BUFFERING when the sector size is
larger than 8KB.
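
The auto-switch idea above could be sketched in C roughly as follows.
This is only an illustration, not code from the patch: the names
can_use_no_buffering, query_sector_size and XLOG_BUFFER_ALIGNMENT are
hypothetical, and the sketch assumes the sector size is read from the
lpBytesPerSector output of the Win32 call GetDiskFreeSpace(). Whether
that value is trustworthy on SAN volumes is exactly the open question.

```c
#include <assert.h>
#include <stdbool.h>
#ifdef _WIN32
#include <windows.h>
#endif

/* Assumed value, mirroring "typically 8192 bytes" in the quoted mail. */
#define XLOG_BUFFER_ALIGNMENT 8192UL

/*
 * Unbuffered I/O needs buffers aligned to the sector size, so it is
 * only safe when the buffer alignment is a multiple of the sector size.
 * A sector size of 0 means "unknown" and is treated as unsafe.
 */
bool
can_use_no_buffering(unsigned long sector_size, unsigned long buffer_alignment)
{
    assert(buffer_alignment > 0);
    if (sector_size == 0)
        return false;
    return buffer_alignment % sector_size == 0;
}

#ifdef _WIN32
/* Returns bytes per sector for the given root path (e.g. "C:\\"), or 0 on failure. */
unsigned long
query_sector_size(const char *root_path)
{
    DWORD sectors_per_cluster, bytes_per_sector, free_clusters, total_clusters;

    if (!GetDiskFreeSpaceA(root_path, &sectors_per_cluster, &bytes_per_sector,
                           &free_clusters, &total_clusters))
        return 0;
    return (unsigned long) bytes_per_sector;
}
#endif
```

With this check, a 512-byte or 8192-byte sector keeps NO_BUFFERING
enabled, while a 16KB sector or a failed query falls back to buffered
writes.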
#
I wonder whether GetDiskFreeSpace() reports the right sector size
as configured by SAN tools.
I also wonder whether Microsoft even anticipates sector sizes larger
than 4KB, and whether NTFS works with them. The following paragraph
appears in the CreateFile page:

One way to align buffers on integer multiples of the volume sector
size is to use VirtualAlloc to allocate the buffers. It allocates
memory that is aligned on addresses that are integer multiples of the
operating system's memory page size. Because both memory page and
volume sector sizes are powers of 2, this memory is also aligned on
addresses that are integer multiples of a volume sector size. Memory
pages are 4-8 KB in size; sectors are 512 bytes (hard disks) or 2048
bytes (CD), and therefore, volume sectors can never be larger than
memory pages.
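
The power-of-two argument in the quoted paragraph can be spelled out
with a small C sketch. The helper names below are hypothetical and
only illustrate the reasoning: an address aligned to a page size is
also aligned to every smaller power of two, and the guarantee stops
holding as soon as the sector size exceeds the page size.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if n is a nonzero power of two. */
bool
is_power_of_two(uintptr_t n)
{
    return n != 0 && (n & (n - 1)) == 0;
}

/* True if address is a multiple of alignment (alignment must be a power of two). */
bool
is_aligned_to(uintptr_t address, uintptr_t alignment)
{
    assert(is_power_of_two(alignment));
    return (address & (alignment - 1)) == 0;
}

/*
 * Page-aligned memory is guaranteed to be sector-aligned only when both
 * sizes are powers of two and the sector is no larger than the page.
 */
bool
page_alignment_implies_sector_alignment(uintptr_t page_size, uintptr_t sector_size)
{
    return is_power_of_two(page_size)
        && is_power_of_two(sector_size)
        && sector_size <= page_size;
}
```

For a 4KB page, the guarantee holds for 512-byte and 2048-byte sectors
but not for a hypothetical 16KB sector.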
