Re: fallocate / posix_fallocate for new WAL file creation (etc...)

From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: Greg Smith <greg(at)2ndQuadrant(dot)com>
Cc: Jon Nelson <jnelson+pgsql(at)jamponi(dot)net>, Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: fallocate / posix_fallocate for new WAL file creation (etc...)
Date: 2013-07-01 16:55:51
Message-ID: 1372697751.19747.51.camel@jdavis
Lists: pgsql-hackers

On Sun, 2013-06-30 at 18:55 -0400, Greg Smith wrote:
> This makes platform level testing a lot easier, thanks. Attached is an
> updated copy of that program with some error checking. If the files it
> creates already existed, the code didn't notice, and a series of write
> errors happened. If you set the test up right it's not a problem, but
> it's better if a bad setup is caught. I wrapped the whole test with a
> shell script, also attached, which ensures the right test sequence and
> checks.

Thank you.

> That's glibc helpfully converting your call to posix_fallocate into
> small writes, because the OS doesn't provide a better way in that
> kernel. It's not hard to imagine this being slower than what the WAL
> code is doing right now. I'm not worried about correctness issues
> anymore, but my gut paranoia about this not working as expected on older
> systems was justified. Everyone who thought I was just whining owes me
> a cookie.

So your theory is that it may be slower because there are twice as many
syscalls (one per 4K page rather than one per 8K page)? Interesting
observation.
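
To put rough numbers on that, here is a back-of-the-envelope sketch. The sizes are assumptions that are not stated in this thread: a 16MB WAL segment, an 8K WAL block for the current zero-fill loop, and a 4K filesystem block for the glibc fallback. The helper name writes_needed is made up for illustration.

```python
# Assumed defaults, not taken from this thread: 16MB WAL segment, 8K WAL block.
WAL_SEGMENT_BYTES = 16 * 1024 * 1024
WAL_BLOCK_BYTES = 8 * 1024
# Assumed filesystem block size stepped through by the glibc fallback.
FS_BLOCK_BYTES = 4 * 1024


def writes_needed(file_bytes: int, chunk_bytes: int) -> int:
    """Number of write calls needed to touch every chunk of the file."""
    return -(-file_bytes // chunk_bytes)  # ceiling division


zero_fill_calls = writes_needed(WAL_SEGMENT_BYTES, WAL_BLOCK_BYTES)
fallback_calls = writes_needed(WAL_SEGMENT_BYTES, FS_BLOCK_BYTES)
print(zero_fill_calls, fallback_calls)  # prints "2048 4096"
```

Under those assumptions the fallback issues twice as many calls per segment. Whether that shows up as a measurable slowdown is exactly what the benchmark would have to answer.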

> This is what I plan to benchmark specifically next.

In the interest of keeping this patch moving forward, do you have an
estimate for when this testing will be complete?

> If the
> posix_fallocate approach is actually slower than what's done now when
> it's not getting kernel acceleration, which is the case on RHEL5 era
> kernels, we might need to make the configure time test more complicated.
> Whether posix_fallocate is defined isn't sensitive enough; on Linux it
> may be the case that this only is usable when fallocate() is also there.

I'd say that if posix_fallocate is slower than the existing code on
pretty much any platform, we shouldn't commit the patch at all. I would
be quite surprised if that was the case, however.
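
For anyone who wants a quick feel for the comparison on their own platform, the following is a minimal sketch, not the test program attached upthread. It assumes a 16MB segment written in 8K blocks followed by an fsync for the existing approach, and uses Python's os.posix_fallocate (Unix only) as a stand-in for the C call. All function names are hypothetical.

```python
import os
import tempfile
import time

SEGMENT_BYTES = 16 * 1024 * 1024  # assumed WAL segment size
BLOCK_BYTES = 8 * 1024            # assumed WAL block size


def create_with_zero_fill(path):
    """Allocate the file by writing zero-filled blocks, then fsync."""
    zeros = bytes(BLOCK_BYTES)
    # O_EXCL makes a leftover file from a previous run fail loudly.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    try:
        for _ in range(SEGMENT_BYTES // BLOCK_BYTES):
            view = memoryview(zeros)
            while view:  # os.write may write fewer bytes than requested
                view = view[os.write(fd, view):]
        os.fsync(fd)
    finally:
        os.close(fd)


def create_with_fallocate(path):
    """Allocate the file with a single posix_fallocate call, then fsync."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    try:
        os.posix_fallocate(fd, 0, SEGMENT_BYTES)
        os.fsync(fd)
    finally:
        os.close(fd)


def time_creation(create, directory):
    """Return (elapsed seconds, resulting file size) for one creation."""
    path = os.path.join(directory, create.__name__)
    start = time.perf_counter()
    create(path)
    elapsed = time.perf_counter() - start
    size = os.path.getsize(path)
    os.unlink(path)
    return elapsed, size


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmpdir:
        for create in (create_with_zero_fill, create_with_fallocate):
            elapsed, size = time_creation(create, tmpdir)
            print(f"{create.__name__}: {size} bytes in {elapsed:.4f}s")
```

A single run like this is noisy. Repeating each case many times on the filesystem that will actually hold the WAL gives a fairer picture, and on a kernel without native fallocate() support the second function exercises the glibc fallback discussed above.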

Regards,
Jeff Davis
