From: | ITAGAKI Takahiro <itagaki(dot)takahiro(at)lab(dot)ntt(dot)co(dot)jp> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: [PATCHES] O_DIRECT for WAL writes |
Date: | 2005-06-28 07:21:10 |
Message-ID: | 20050628161732.402D.ITAGAKI.TAKAHIRO@lab.ntt.co.jp |
Lists: | pgsql-hackers pgsql-patches |
Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Yeah, this is about what I was afraid of: if you're actually fsyncing
> then you get at best one commit per disk revolution, and the negotiation
> with the OS is down in the noise.
If we disable the write-back cache and use open_sync, the per-page writes
in the WAL module show up as poor results. O_DIRECT behaves much like
O_DSYNC (at least on Linux), so its benefit disappears behind the slow
disk revolution.
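To make the O_DIRECT / O_DSYNC comparison concrete, here is a minimal sketch of how a WAL file might be opened in either mode. It is not taken from xlog.dio.diff; the names open_wal_sync, alloc_wal_pages and DIO_ALIGN, and the 4096-byte alignment, are assumptions for illustration. On Linux, O_DIRECT generally requires aligned buffers, which is why the pages are allocated with posix_memalign.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* exposes O_DIRECT on Linux/glibc */
#endif
#include <errno.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

#ifndef O_DIRECT
#define O_DIRECT 0              /* assumption: where O_DIRECT is missing, skip it */
#endif

#define BLCKSZ    8192          /* assumed WAL page size */
#define DIO_ALIGN 4096          /* assumed alignment for direct I/O buffers */

/*
 * Hypothetical helper: open a WAL file for synchronous writes.
 * With use_direct set, O_DIRECT is added on top of O_DSYNC; if the
 * filesystem rejects it with EINVAL, fall back to plain O_DSYNC.
 */
static int open_wal_sync(const char *path, int use_direct)
{
    int flags = O_WRONLY | O_CREAT | O_DSYNC;

    if (use_direct) {
        int fd = open(path, flags | O_DIRECT, 0600);
        if (fd >= 0 || errno != EINVAL)
            return fd;
        /* O_DIRECT not supported here: retry without it */
    }
    return open(path, flags, 0600);
}

/*
 * Hypothetical helper: allocate npages contiguous WAL pages whose
 * start address is aligned for direct I/O. Returns NULL on failure.
 */
static char *alloc_wal_pages(size_t npages)
{
    void *p = NULL;

    if (posix_memalign(&p, DIO_ALIGN, npages * BLCKSZ) != 0)
        return NULL;
    return p;
}
```

Either way each write still waits for the disk, which is the point above: without the write-back cache, bypassing the OS cache alone does not remove the per-write latency.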
In the current source, WAL is written as:
for (i = 0; i < N; i++) { write(&buffers[i], BLCKSZ); }
Is this intentional? Can we rewrite it as follows?
write(&buffers[0], N * BLCKSZ);
To do that, I wrote a 'gather-write' patch (xlog.gw.diff).
Separately, I'm also sending the fixed direct I/O patch (xlog.dio.diff).
The two patches are independent, so either one or both can be applied.
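The difference between the two write patterns above can be sketched as follows. This is an illustration only, not the code in xlog.gw.diff: write_per_page and write_gathered are hypothetical names, BLCKSZ is assumed to be 8192, and the pages are assumed to be adjacent in memory (a circular buffer would need the write split at the wrap-around point).

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

#define BLCKSZ 8192             /* assumed WAL page size */

/* Write exactly len bytes, retrying on short writes and EINTR. */
static int write_all(int fd, const char *buf, size_t len)
{
    size_t done = 0;

    while (done < len) {
        ssize_t rc = write(fd, buf + done, len - done);
        if (rc < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        done += (size_t) rc;
    }
    return 0;
}

/* Current pattern: one write() call per page, n calls in total. */
static int write_per_page(int fd, const char *buffers, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (write_all(fd, buffers + i * BLCKSZ, BLCKSZ) != 0)
            return -1;
    }
    return 0;
}

/* Proposed pattern: hand all n contiguous pages to a single write(). */
static int write_gathered(int fd, const char *buffers, size_t n)
{
    return write_all(fd, buffers, n * BLCKSZ);
}
```

Both functions put the same bytes in the file. With a synchronous open mode, the per-page version may wait for the disk on every call, while the gathered version normally issues one request for the whole run of pages.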
I tested them on my machine, and the results are as follows. They show that
direct I/O combined with gather-write is the best choice when the write-back
cache is off. Are these two patches worth trying together?
            | writeback | fsync= | fdata | open_ | fsync_ | open_
patch       | cache     | false  | sync  | sync  | direct | direct
------------+-----------+--------+-------+-------+--------+--------
direct io   | off       | 124.2  | 105.7 |  48.3 |  48.3  |  48.2
direct io   | on        | 129.1  | 112.3 | 114.1 | 142.9  | 144.5
gather-write| off       | 124.3  | 108.7 | 105.4 | (N/A)  | (N/A)
both        | off       | 131.5  | 115.5 | 114.4 | 145.4  | 145.2
- 20 runs of pgbench -s 100 -c 50 -t 200
- with tuning (wal_buffers=64, commit_delay=500, checkpoint_segments=8)
- using 2 ATA disks:
  - hda (reiserfs) holds the system and the WAL.
  - hdc (jfs) holds the database files; its write-back cache is always on.
---
ITAGAKI Takahiro
NTT Cyber Space Laboratories
Attachment | Content-Type | Size |
---|---|---|
xlog.dio.diff | application/octet-stream | 4.5 KB |
xlog.gw.diff | application/octet-stream | 7.4 KB |