Checkpointer split has broken things dramatically (was Re: DELETE vs TRUNCATE explanation)

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Daniel Farina <daniel(at)heroku(dot)com>, Craig Ringer <ringerc(at)ringerc(dot)id(dot)au>, Harold A(dot) Giménez <harold(dot)gimenez(at)gmail(dot)com>
Subject: Checkpointer split has broken things dramatically (was Re: DELETE vs TRUNCATE explanation)
Date: 2012-07-17 22:56:50
Message-ID: 9247.1342565810@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers pgsql-performance

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> On Mon, Jul 16, 2012 at 3:18 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> BTW, while we are on the subject: hasn't this split completely broken
>> the statistics about backend-initiated writes?

> Yes, it seems to have done just that.

So I went to fix this in the obvious way (attached), but while testing
it I found that the number of buffers_backend events reported during
a regression test run barely changed; which surprised the heck out of
me, so I dug deeper. The cause turns out to be extremely scary:
ForwardFsyncRequest isn't getting called at all in the bgwriter process,
because the bgwriter process has a pendingOpsTable. So it just queues
its fsync requests locally, and then never acts on them, since it never
runs any checkpoints anymore.

This implies that nobody has done pull-the-plug testing on either HEAD
or 9.2 since the checkpointer split went in (2011-11-01), because even
a modicum of such testing would surely have shown that we're failing to
fsync a significant fraction of our write traffic.

Furthermore, I would say that any performance testing done since then,
if it wasn't looking at purely read-only scenarios, isn't worth the
electrons it's written on. In particular, any performance gain that
anybody might have attributed to the checkpointer splitup is very
probably hogwash.

This is not giving me a warm feeling about our testing practices.

As far as fixing the bug is concerned, the reason for the foulup
is that mdinit() looks to IsBootstrapProcessingMode() to decide
whether to create a pendingOpsTable. That probably was all right
when it was coded, but what it means today is that *any* process
started via AuxiliaryProcessMain will have one; thus not only do
bgwriters have one, but so do walwriter and walreceiver processes;
which might not represent a bug today but it's pretty scary anyway.
I think we need to fix that so it's more directly dependent on the
auxiliary process type. We can't use flags set by the respective
FooMain() functions, such as am_bg_writer, because mdinit is called
from BaseInit() which happens before reaching those functions.
My suggestion is that bootstrap.c ought to make the process's
AuxProcType value available and then mdinit should consult that to
decide what to do. (Having done that, we might consider getting rid
of the "retail" process-type flags am_bg_writer etc.)

regards, tom lane

Attachment Content-Type Size
bgwriter-stats-fix-incomplete.patch text/x-patch 3.1 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bruce Momjian 2012-07-17 23:39:15 Re: BUG #6733: All Tables Empty After pg_upgrade (PG 9.2.0 beta 2)
Previous Message Bruce Momjian 2012-07-17 22:02:40 Re: Using pg_upgrade on log-shipping standby servers

Browse pgsql-performance by date

  From Date Subject
Next Message Peter Geoghegan 2012-07-17 23:48:50 Re: Checkpointer split has broken things dramatically (was Re: DELETE vs TRUNCATE explanation)
Previous Message Scott Marlowe 2012-07-17 21:00:30 Re: Slow application response on lightly loaded server?