| From: | Greg Smith <gsmith(at)gregsmith(dot)com> | 
|---|---|
| To: | Ron Mayer <rm_pg(at)cheapcomplexdevices(dot)com> | 
| Cc: | Marco Colombo <pgsql(at)esiway(dot)net>, John R Pierce <pierce(at)hogranch(dot)com>, pgsql-general(at)postgresql(dot)org | 
| Subject: | Re: Maximum transaction rate | 
| Date: | 2009-03-17 23:12:32 | 
| Message-ID: | alpine.GSO.2.01.0903171846590.21265@westnet.com | 
| Lists: | pgsql-general | 
On Tue, 17 Mar 2009, Ron Mayer wrote:
> I wonder if there should be an optional fsync mode
> in postgres that turns fsync() into
>    fchmod (fd, 0644); fchmod (fd, 0664);
> to work around this issue.
The test I haven't had time to run yet is to turn the bug-exposing program 
you were fiddling with into a more accurate representation of WAL 
activity, to see if that chmod still changes the behavior there. I think 
the most dangerous possibility here is if you create a new WAL segment and 
immediately fill it, all in less than a second.  Basically, what 
XLogFileInit does:
-Open with O_RDWR | O_CREAT | O_EXCL
-Write XLogSegSize (16MB) worth of zeros
-fsync
Followed by simulating what XLogWrite would do if you fed it enough data 
to force a segment change:
-Write a new 16MB worth of data
-fsync
If you did all that in under a second, would you still get a filesystem 
flush each time?  From the description of the problem I'm not so sure 
anymore.  I think that's how tight the window would have to be for this 
issue to show up right now: you'd only be exposed if you filled a new WAL 
segment faster than the associated journal commit happened (basically, a 
crash while WAL write volume exceeds 16MB/s and new segments are 
being created).  But from what I've read about ext4 I think that window 
for mayhem might widen on that filesystem--that's what got me reading up 
on this whole subject recently, before this thread even started.
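
A minimal sketch of the test described above, not PostgreSQL code: it creates a file with O_RDWR | O_CREAT | O_EXCL, writes a segment's worth of zeros, calls fsync, then rewinds and overwrites the segment with data and calls fsync again, timing the whole sequence. The names `simulate_segment`, `write_fill`, `flush_fd`, the 8192-byte chunk size and the `toggle_mode` flag are all assumptions made for illustration. The flag applies the quoted fchmod pair before each fsync, which is only one reading of the suggestion. The program reproduces the I/O pattern and reports elapsed time; whether the filesystem actually flushed has to be observed separately.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

#define XLOG_SEG_SIZE ((size_t) 16 * 1024 * 1024) /* 16MB, as in the message */
#define CHUNK 8192                                /* assumed write size */

/* Write `total` bytes of value `byte` at the current offset; 0 on success. */
static int write_fill(int fd, int byte, size_t total)
{
    char buf[CHUNK];
    size_t done = 0;

    memset(buf, byte, sizeof buf);
    while (done < total) {
        size_t want = total - done < sizeof buf ? total - done : sizeof buf;
        ssize_t n = write(fd, buf, want);
        if (n < 0)
            return -1;
        done += (size_t) n;   /* tolerate short writes */
    }
    return 0;
}

/* fsync, optionally preceded by the fchmod pair from the quoted suggestion. */
static int flush_fd(int fd, int toggle_mode)
{
    if (toggle_mode && (fchmod(fd, 0644) != 0 || fchmod(fd, 0664) != 0))
        return -1;
    return fsync(fd);
}

/*
 * Create `path` (must not exist), zero-fill and fsync it, then overwrite
 * it with data and fsync again. Stores the elapsed time in *elapsed_sec.
 * Returns 0 on success, -1 on any failure.
 */
int simulate_segment(const char *path, size_t seg_size, int toggle_mode,
                     double *elapsed_sec)
{
    struct timespec t0, t1;
    int fd, rc = -1;

    assert(path != NULL && elapsed_sec != NULL);
    clock_gettime(CLOCK_MONOTONIC, &t0);

    fd = open(path, O_RDWR | O_CREAT | O_EXCL, 0600);
    if (fd < 0)
        return -1;

    if (write_fill(fd, 0, seg_size) == 0 &&       /* segment-init step */
        flush_fd(fd, toggle_mode) == 0 &&
        lseek(fd, 0, SEEK_SET) == 0 &&
        write_fill(fd, 0xAB, seg_size) == 0 &&    /* segment-fill step */
        flush_fd(fd, toggle_mode) == 0)
        rc = 0;

    clock_gettime(CLOCK_MONOTONIC, &t1);
    *elapsed_sec = (double) (t1.tv_sec - t0.tv_sec) +
                   (double) (t1.tv_nsec - t0.tv_nsec) / 1e9;
    if (close(fd) != 0)
        rc = -1;
    return rc;
}
```

Calling `simulate_segment(path, XLOG_SEG_SIZE, 0, &secs)` and then again on a fresh path with `toggle_mode` set to 1 gives the two cases to compare; a run is only relevant to the window discussed here if `secs` comes out under one second.
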
The other ameliorating factor here is that in order for this to bite you, 
I think you'd need to have another, incorrectly ordered write somewhere 
else that could happen before the delayed write.  Not sure where that 
might be possible in the PostgreSQL WAL implementation yet.
--
* Greg Smith gsmith(at)gregsmith(dot)com http://www.gregsmith.com Baltimore, MD