checkpoint writeback via sync_file_range

From:	Robert Haas <robertmhaas(at)gmail(dot)com>
To:	pgsql-hackers(at)postgresql(dot)org, Greg Smith <greg(at)2ndquadrant(dot)com>
Subject:	checkpoint writeback via sync_file_range
Date:	2012-01-11 02:14:31
Message-ID:	CA+TgmoaHu1zuNohoE=cEP0nSc+0wtuRSyEAj_Af2XhxU+ry6-w@mail.gmail.com
Lists:	pgsql-hackers

Greg Smith muttered a while ago about wanting to do something with
sync_file_range to improve checkpoint behavior on Linux. I thought he
was talking about trying to sync only the range of blocks known to be
dirty, which didn't seem like a very exciting idea, but after looking
at the man page for sync_file_range, I think I understand what he was
really going for: sync_file_range allows you to hint the Linux kernel
that you'd like it to clean a certain set of pages. I further recall
from Greg's previous comments that in the scenarios he's seen,
checkpoint I/O spikes are caused not so much by the data written out
by the checkpoint itself as by the other dirty data in the kernel
buffer cache. Based on that, I whipped up the attached patch. If
sync_file_range is available, then before beginning the write phase it
simply iterates through everything that will eventually be fsync'd and
tells the Linux kernel to put it all under write-out.
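
For readers unfamiliar with the call, here is a minimal sketch of the kind of hint described above. It is not taken from the attached patch: the helper names hint_writeback and hint_writeback_all are hypothetical, and the sketch assumes the files to be fsync'd are available as a plain list of paths, which is not how PostgreSQL tracks pending fsync requests. It only shows SYNC_FILE_RANGE_WRITE being used to start write-out of a file's dirty pages without waiting for it to finish.

```c
#define _GNU_SOURCE             /* exposes sync_file_range() in <fcntl.h> */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the patch): ask the Linux kernel to start
 * write-out of every dirty page of one file, without waiting for completion.
 * Returns 0 on success, -1 on failure with errno set.
 */
int
hint_writeback(const char *path)
{
    int fd, rc, saved_errno;

    assert(path != NULL);

    fd = open(path, O_RDWR);
    if (fd < 0)
        return -1;

    /* offset 0 with nbytes 0 means "from the start through end of file" */
    rc = sync_file_range(fd, 0, 0, SYNC_FILE_RANGE_WRITE);

    saved_errno = errno;        /* close() may overwrite errno */
    close(fd);
    errno = saved_errno;
    return rc;
}

/*
 * Hypothetical driver: issue the hint for each file that will later be
 * fsync'd. Returns the number of files for which the hint failed; a failed
 * hint is harmless here because the later fsync still guarantees durability.
 */
int
hint_writeback_all(const char *const *paths, int npaths)
{
    int failures = 0;

    for (int i = 0; i < npaths; i++)
    {
        if (hint_writeback(paths[i]) != 0)
            failures++;
    }
    return failures;
}
```

Note that this is only a hint. It gives no durability guarantee on its own, so the eventual fsync of each file is still required.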

I don't know that I have a suitable place to test this, and I'm not
quite sure what a good test setup would look like either, so while
I've tested that this appears to issue the right kernel calls, I am
not sure whether it actually fixes the problem case. But here's the
patch, anyway.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachment	Content-Type	Size
writeback-v1.patch	application/octet-stream	13.2 KB

Responses

Re: checkpoint writeback via sync_file_range at 2012-01-11 04:38:12 from Greg Smith
Re: checkpoint writeback via sync_file_range at 2012-01-11 12:46:29 from Andres Freund
