Re: synchronous commit vs. hint bits

From:	Robert Haas <robertmhaas(at)gmail(dot)com>
To:	Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
Cc:	Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers(at)postgresql(dot)org, YAMAMOTO Takashi <yamt(at)mwd(dot)biglobe(dot)ne(dot)jp>, simon(at)2ndquadrant(dot)com
Subject:	Re: synchronous commit vs. hint bits
Date:	2011-12-01 16:47:52
Message-ID:	CA+TgmoZZrLz9TaZs0kG74o3S3YU_XJNOW5dkq5gakQZ=4RmiMA@mail.gmail.com
Lists:	pgsql-hackers

On Thu, Dec 1, 2011 at 9:58 AM, Jeff Janes <jeff(dot)janes(at)gmail(dot)com> wrote:
> Waiting until the other one completes is how it currently is
> implemented, but is it necessary from a correctness view? It seems
> like the WALWriteLock only needs to protect the write, and not the
> sync (assuming the sync method allows those to be separate actions),
> and that there could be multiple fsync requests from different
> processes pending at the same time without a correctness problem.
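To make the proposal concrete, here is a minimal sketch, not PostgreSQL code: a mutex stands in for WALWriteLock and covers only the write, while the fsync() runs after the lock is released, so several processes could have sync requests pending at once. The names `wal_write_lock` and `log_append_and_sync` are hypothetical, and the sketch assumes a sync method where write and sync are separate calls.

```c
#include <pthread.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical stand-in for WALWriteLock: it serializes only the write. */
static pthread_mutex_t wal_write_lock = PTHREAD_MUTEX_INITIALIZER;

/*
 * Append len bytes to the log file under the lock, then sync outside it.
 * Returns 0 on success, -1 on error.
 */
static int
log_append_and_sync(int fd, const void *buf, size_t len)
{
    const char *p = buf;
    size_t      left = len;

    pthread_mutex_lock(&wal_write_lock);
    while (left > 0)
    {
        ssize_t n = write(fd, p, left);

        if (n < 0)
        {
            pthread_mutex_unlock(&wal_write_lock);
            return -1;
        }
        p += n;
        left -= (size_t) n;
    }
    pthread_mutex_unlock(&wal_write_lock);

    /* No lock held here: other callers may write or fsync concurrently. */
    return fsync(fd);
}
```

A real implementation would still have to track how far the log is known to be flushed, which this sketch leaves out.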

I've wondered about that, too. At least on Linux, the overhead of a
system call seems to be pretty low - e.g. the ridiculous number of
lseek calls we do on a pgbench -S doesn't seem to create much overhead
until the inode mutex starts to become contended; and that problem
should be fixed in Linux 3.2. But I'm not sure if system calls are
similarly cheap on all platforms, or even if it's true on Linux for
fsync() in particular.

There's another possible approach here, too: instead of waiting to set
hint bits until the commit record hits the disk, we could allow the
hint bits to be set immediately, on the condition that we don't write
the page out until the commit record hits the disk. Bumping the page
LSN would do that, but I think that might be problematic since setting
hint bits isn't WAL-logged. If so, we could possibly fix that by
storing a second LSN for the page out of line, e.g. in the buffer
descriptor. That might be even faster than speeding up the WAL flush.
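A rough sketch of the second-LSN idea, under stated assumptions: this is not PostgreSQL's buffer descriptor, and `BufferDescSketch`, `hint_lsn`, `set_hint_bit`, and `can_write_out` are hypothetical names used only for illustration. The out-of-line LSN records the newest commit record that any hint bit on the page depends on, and the page may be written only once WAL has been flushed past both LSNs.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t Lsn;

/* Hypothetical, simplified buffer descriptor. */
typedef struct
{
    Lsn  page_lsn;  /* LSN in the page header; covers WAL-logged changes */
    Lsn  hint_lsn;  /* out-of-line LSN: newest commit record a hint bit relies on */
    bool dirty;
} BufferDescSketch;

/* Set a hint bit right away, remembering the commit record it depends on. */
static void
set_hint_bit(BufferDescSketch *buf, Lsn commit_lsn)
{
    if (commit_lsn > buf->hint_lsn)
        buf->hint_lsn = commit_lsn;
    buf->dirty = true;
}

/* The page may go to disk only after WAL is flushed past both LSNs. */
static bool
can_write_out(const BufferDescSketch *buf, Lsn flushed_lsn)
{
    return buf->page_lsn <= flushed_lsn && buf->hint_lsn <= flushed_lsn;
}
```

Because `hint_lsn` lives outside the page, the page LSN itself is never bumped by a change that was not WAL-logged.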

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Re: synchronous commit vs. hint bits at 2011-12-01 14:58:14 from Jeff Janes