Re: Group commit, revised

From: Greg Smith <greg(at)2ndQuadrant(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Group commit, revised
Date: 2012-01-29 21:20:07
Message-ID: 4F25B807.5020103@2ndQuadrant.com
Lists: pgsql-hackers

On 01/28/2012 07:48 PM, Jeff Janes wrote:
> Others are going to test this out on high-end systems. I wanted to
> try it out on the other end of the scale. I've used a Pentium 4, 3.2GHz,
> with 2GB of RAM and with a single IDE drive running ext4. ext4 is
> amazingly bad on IDE, giving about 25 fsync's per second (and it lies
> about fdatasync, but apparently not about fsync)

Fantastic. I had to stop for a second to check the date on your message
and make sure it hadn't come from some mail server that's been backed up
on delivery for the last five years. I'm cleaning house toward testing
this out here, and I was going to test on the same system using both fast
and horribly slow drives. Both ends of the scale are important, and they
benefit in very different ways from these changes.

> I haven't inspected that deep fall off at 30 clients for the patch.
> By way of reference, if I turn off synchronous commit, I get
> tps=1245.8 which is 100% CPU limited. This sets a theoretical upper
> bound on what could be achieved by the best possible group committing
> method.

This sort of thing is why I suspect that to completely isolate some
results, we're going to need a moderately high end server--with lots of
cores--combined with an intentionally mismatched slow drive. It's too
easy to get pgbench and/or PostgreSQL to choke on something other than
I/O when using smaller core counts. I don't think I have anything where
the floor is 24 TPS per client though. Hmmm...I think I can connect an
IDE drive to my MythTV box and format it with ext4. Thanks for the test
idea.
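
As a rough illustration of the arithmetic behind that upper bound, here
is a minimal Python sketch of an idealized group commit model. It assumes
that every fsync flushes exactly one commit from each connected client,
which is a simplification made up for this example and not a description
of how the patch behaves. The 25 fsyncs/second and 1245.8 TPS figures are
the ones quoted above; the function name and its parameters are
hypothetical.

```python
def ideal_group_commit_tps(clients, fsyncs_per_sec=25.0, cpu_bound_tps=1245.8):
    """Idealized upper bound on TPS under perfect group commit.

    Assumption (not from the patch): each fsync commits one transaction
    per client, so the I/O-limited rate scales linearly with clients
    until it hits the CPU-limited rate seen with synchronous commit off.
    """
    if clients < 1:
        raise ValueError("clients must be at least 1")
    io_bound_tps = fsyncs_per_sec * clients
    return min(io_bound_tps, cpu_bound_tps)


# Print the modeled bound for a few client counts.
for clients in (1, 10, 30, 50):
    print(clients, ideal_group_commit_tps(clients))
```

Under this model a slow drive stays the bottleneck until roughly 50
clients, which is one way to see why a mismatched slow drive makes the
group commit effect easier to isolate.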

One thing you could try on this system is pgbench's -N option, "Do not
update pgbench_tellers and pgbench_branches". That eliminates a lot of the
contention that might be pulling down your higher core count tests,
while still giving a completely valid test of whether the group commit
mechanism works. I'm not sure whether that will push up the top end
usefully for you, but it's worth a try if you have time to test again.

> If the group_commit patch goes in, would we then rip out commit_delay
> and commit_siblings?

The main reason those are still hanging around at all is to allow
pushing on the latency vs. throughput trade-off on really busy systems.
The use case is that you expect, say, 10 clients to constantly be
committing at a high rate. So if there's just one committing so far,
assume it's the leading edge of a wave and pause a bit for the rest to
come in. I don't think the cases where this is useful behavior--people
both want it and the current mechanism provides it--are very common in
the real world. It can be useful for throughput oriented benchmarks
though, which is why I'd say it hasn't been killed off yet.
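
The decision those two settings control can be sketched in a few lines
of Python. This is only a model of the documented rule (pause for
commit_delay microseconds before flushing WAL when at least
commit_siblings other transactions are active), not the server's actual
code; the function name and the other_active_xacts parameter are
hypothetical.

```python
def commit_pause_us(commit_delay_us, commit_siblings, other_active_xacts):
    """Return how long (in microseconds) a committing backend would pause
    before flushing WAL, under the commit_delay / commit_siblings rule.

    The pause happens only if commit_delay is nonzero and at least
    commit_siblings other transactions are currently active, on the
    theory that some of them are about to commit as well.
    """
    if commit_delay_us > 0 and other_active_xacts >= commit_siblings:
        return commit_delay_us
    return 0


# Busy system: 9 other active transactions, threshold of 5, so pause.
print(commit_pause_us(1000, 5, 9))
# Quiet system: only 2 other active transactions, so flush immediately.
print(commit_pause_us(1000, 5, 2))
```

The pause trades a little latency on each commit for the chance that
several commits share one flush, which is the latency vs. throughput
trade-off described above.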

We'll have to see whether this patch, in whatever final form makes
sense, will usefully replace that sort of thing. I'd certainly be in
favor of replacing commit_delay and commit_siblings with a better option;
it would be nice if we don't eliminate this tuning option in the process
though.

--
Greg Smith 2ndQuadrant US greg(at)2ndQuadrant(dot)com Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com
