Re: fallocate / posix_fallocate for new WAL file creation (etc...)

From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: Greg Smith <greg(at)2ndQuadrant(dot)com>
Cc: Jon Nelson <jnelson+pgsql(at)jamponi(dot)net>, Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: fallocate / posix_fallocate for new WAL file creation (etc...)
Date: 2013-07-05 06:50:01
Message-ID: 1373007001.19747.155.camel@jdavis
Lists: pgsql-hackers

On Mon, 2013-07-01 at 00:52 -0400, Greg Smith wrote:
> On 6/30/13 9:28 PM, Jon Nelson wrote:
> > The performance of the latter (new) test sometimes seems to perform
> > worse and sometimes seems to perform better (usually worse) than
> > either of the other two. In all cases, posix_fallocate performs
> > better, but I don't have a sufficiently old kernel to test with.
>
> This updated test program looks reliable now. The numbers are very
> tight when I'd expect them to be, and there's nowhere with the huge
> differences I saw in the earlier test program.
>
> Here's results from a few sets of popular older platforms:

...

> The win for posix_fallocate is there in most cases, but it's pretty hard
> to see in these older systems. That could be OK. As long as the
> difference is no more than noise, and that is the case, this could be
> good enough to commit. If there are significantly better results on the
> new platforms, the old ones need to just not get worse.

I ran my own little test on my workstation[1] with the attached
programs. One does what we do now, another mimics the glibc emulation
you described earlier, and another uses posix_fallocate(). It does an
allocation phase, an fsync, a single rewrite, and then another fsync.
The program runs this 64 times for 64 different 16MB files.
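For readers without the attachments, the sequence described above might look roughly like the sketch below. It is not the attached write1.c/write2.c/write3.c; the names run_one, run_all, and write_zeros, the 8kB block size, and the file naming are all assumptions made for illustration. It only shows the shape of the test: an allocation phase, an fsync, a single rewrite, and another fsync, repeated for 64 files of 16MB.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

#define FILE_SIZE  (16 * 1024 * 1024)  /* 16MB per file, as in the message */
#define NUM_FILES  64                  /* 64 files, as in the message */
#define BLOCK_SIZE 8192                /* assumed write granularity */

/* Write `size` zero bytes starting at offset 0, BLOCK_SIZE at a time. */
static int write_zeros(int fd, off_t size)
{
    static char buf[BLOCK_SIZE];       /* static storage is zero-filled */
    off_t done = 0;

    assert(size > 0);
    if (lseek(fd, 0, SEEK_SET) < 0)
        return -1;
    while (done < size) {
        size_t want = (size - done) < BLOCK_SIZE
                          ? (size_t) (size - done) : BLOCK_SIZE;
        ssize_t n = write(fd, buf, want);

        if (n <= 0)                    /* EINTR is not retried in this sketch */
            return -1;
        done += n;                     /* tolerate short writes */
    }
    return 0;
}

/* One iteration: allocate, fsync, rewrite once, fsync. 0 on success. */
static int run_one(const char *path, off_t size, int use_fallocate)
{
    int fd = open(path, O_CREAT | O_RDWR | O_TRUNC, 0600);
    int rc = -1;

    if (fd < 0)
        return -1;
    if (use_fallocate) {
        /* posix_fallocate returns an error number, not -1/errno */
        if (posix_fallocate(fd, 0, size) != 0)
            goto done;
    } else {
        if (write_zeros(fd, size) != 0)
            goto done;
    }
    if (fsync(fd) != 0)
        goto done;
    if (write_zeros(fd, size) != 0)    /* the single rewrite */
        goto done;
    if (fsync(fd) != 0)
        goto done;
    rc = 0;
done:
    close(fd);
    return rc;
}

/* Repeat for NUM_FILES distinct files under `dir` (hypothetical naming). */
static int run_all(const char *dir, int use_fallocate)
{
    char path[4096];

    for (int i = 0; i < NUM_FILES; i++) {
        snprintf(path, sizeof(path), "%s/test%02d.bin", dir, i);
        if (run_one(path, FILE_SIZE, use_fallocate) != 0)
            return -1;
    }
    return 0;
}
```

Calling run_all(dir, 0) and run_all(dir, 1) under time(1) on the same filesystem would give a comparison in the same spirit, though the numbers will depend on the kernel, filesystem, and storage.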

write1 and write2 (the current approach and the glibc-style emulation)
are almost exactly even at 25.5s. write3 (posix_fallocate()) takes about
14.5s, roughly 43% less time, which is a pretty big improvement.

So, my simple conclusion is that glibc emulation should be about the
same as what we're doing now, so there's no reason to avoid it. That
means, if posix_fallocate() is present, we should use it, because it's
either the same (if emulated in glibc) or significantly faster (if
implemented in the kernel).
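To make that rule concrete, here is a minimal sketch of "use posix_fallocate() if present, otherwise zero-fill". It is not taken from the patch under discussion: reserve_space is a hypothetical helper, and HAVE_POSIX_FALLOCATE stands in for whatever build-time macro a configure check would define.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Reserve `size` bytes for fd, starting at offset 0.
 * Returns 0 on success or an errno-style error number.
 * HAVE_POSIX_FALLOCATE is an assumed, hypothetical build-time macro.
 */
static int reserve_space(int fd, off_t size)
{
    assert(size > 0);
#ifdef HAVE_POSIX_FALLOCATE
    /* Kernel-backed where supported, emulated by glibc otherwise. */
    return posix_fallocate(fd, 0, size);
#else
    /* Fallback: fill the file with zeros in 8kB chunks. */
    static const char zeros[8192];
    off_t off = 0;

    while (off < size) {
        size_t want = (size - off) < (off_t) sizeof(zeros)
                          ? (size_t) (size - off) : sizeof(zeros);
        ssize_t n = pwrite(fd, zeros, want, off);

        if (n < 0)
            return errno;
        if (n == 0)
            return ENOSPC;             /* no progress; treat as out of space */
        off += n;
    }
    return 0;
#endif
}
```

Both branches return an error number rather than -1, which keeps the caller's error handling the same regardless of which path was compiled in.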

If that is your conclusion, as well, it looks like this patch is about
ready for commit. What do you think?

Regards,
Jeff Davis

[1] workstation specs (ubuntu 12.10, ext4):
$ uname -a
Linux jdavis 3.5.0-34-generic #55-Ubuntu SMP Thu Jun 6 20:18:19 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux

Attachment Content-Type Size
write1.c text/x-csrc 576 bytes
write3.c text/x-csrc 552 bytes
write2.c text/x-csrc 591 bytes
