Re: our buffer replacement strategy is kind of lame

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Greg Stark <stark(at)mit(dot)edu>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: our buffer replacement strategy is kind of lame
Date: 2012-01-20 14:59:41
Message-ID: CA+TgmoZ3rUYQNz1qa2h4tg36YWSjCnhGxSJVaUz6zpyjubJxfA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Jan 20, 2012 at 9:29 AM, Heikki Linnakangas
<heikki(dot)linnakangas(at)enterprisedb(dot)com> wrote:
> I'd like to see some benchmarks that show a benefit from these patches,
> before committing something like this that complicates the code. These
> patches are fairly small, but nevertheless. Once we have a test case, we can
> argue whether the benefit we're seeing is worth the extra code, or if
> there's some better way to achieve it.

Agreed. I don't really like either of these patches stylistically:
when you start sometimes locking things and sometimes not locking
things, it gets hard to reason about how the code will behave, and you
limit what you can in the future do inside the critical section
because you have to keep in mind that you don't really have full
mutual exclusion. There are places where it may be worth making that
trade-off if we can show a large performance benefit, but I am wary of
quick hacks that may get in the way of more complete solutions we want
to implement later, especially if the performance benefit is marginal.

I'm in the process of trying to pull together benchmark numbers for
the various performance-related patches that are floating around this
CommitFest, but a big problem is that many of them need to be tested
in different ways, which are frequently not explicitly laid out in the
emails in which they are submitted. Some of them are calculated to
improve latency, and others throughput. Some require pgbench run with
a small scale factor, some require pgbench with a large scale factor,
and some need some completely other type of testing altogether. Some
need short tests, others need long tests, still others require testing
with very specific parameters, and some don't really need more testing
so much as they need profiling. Many (like the double-write patch)
need more thought about worst case scenarios, like a sequential scan
on a large, unhinted table. Some only really need testing in one or
two scenarios, but others could affect a broad variety of workloads
and really ought to be tested in many different ways to fully
understand the behavior. Some of them may be interdependent in
complex ways.

My current belief, based on the testing I did before Christmas and my
gut feelings about the remaining patches, is that the patches which
have the best chance of making a major difference on general workloads
are your XLOG insert scaling patch and the group commit patch. Those
are, in fact, virtually the only ones that have had more than zero
test results posted to the list so far, and those test results are
extremely promising. The various checkpoint-related patches also seem
somewhat promising to me, despite the lack (so far) of any test
results. Everything else strikes me as likely to make a difference
that is either very small or only applicable in a fairly circumscribed
set of cases. This preliminary opinion is subject to change as more
evidence comes in, of course. :-)

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2012-01-20 15:01:47 Re: Patch review for logging hooks (CF 2012-01)
Previous Message Marko Kreen 2012-01-20 14:49:45 Re: Speed dblink using alternate libpq tuple storage