Re: Expose checkpoint start/finish times into SQL.

From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Greg Smith <gsmith(at)gregsmith(dot)com>
Cc: pgsql-patches(at)postgresql(dot)org
Subject: Re: Expose checkpoint start/finish times into SQL.
Date: 2008-04-04 12:37:49
Message-ID: 1207312669.4256.22.camel@ebony.site
Lists: pgsql-patches

On Fri, 2008-04-04 at 02:21 -0400, Greg Smith wrote:

> Database stops checkpointing. WAL files pile up. In the middle of
> backup, system finally dies, and when it starts recovery there's a bad
> record in the WAL files--which there are now thousands of to apply, and
> the bad one is 4 hours of replay in. Believe it or not, it goes downhill
> from there.
>
> It's what kicked off the first step that's the big mystery. The only code
> path I thought of that can block checkpoints like this is when the
> archive_command isn't working anymore, and that wasn't being used. Given
> some of the other corruption found later and the bad memory issues
> discovered, a bit flipping in the "do I need to checkpoint now?" code or
> data seems just as likely as any other explanation.

A few additional comments here:

If you set checkpoint_segments very, very high, checkpoints are
triggered only by checkpoint_timeout, which lets you go without one for
up to 60 minutes. If you did this for performance reasons, presumably
you generate lots of WAL and might end up with 1000s of files in that
time period.

If you set it too high, you hit the disk limits first, and the server
can then crash if the disk holding the pg_xlog directory happens to be
small enough.
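
To put rough numbers on that, here is a minimal sketch. It assumes the
default 16 MB WAL segment size; the function names and the example
figures (2000 segments, a 20 GB pg_xlog partition) are hypothetical and
are not taken from the incident described above.

```python
# Rough arithmetic only: how much disk do piled-up WAL segments consume,
# and how many segments fit before the pg_xlog disk fills up?

WAL_SEGMENT_MB = 16  # default WAL segment size (assumed here)


def wal_backlog_mb(segments: int, segment_mb: int = WAL_SEGMENT_MB) -> int:
    """Disk space in MB used by the given number of WAL segments."""
    return segments * segment_mb


def segments_until_full(free_mb: int, segment_mb: int = WAL_SEGMENT_MB) -> int:
    """Number of whole WAL segments that fit in free_mb of disk space."""
    return free_mb // segment_mb


if __name__ == "__main__":
    # Hypothetical: 2000 segments accumulate with no checkpoint.
    print(wal_backlog_mb(2000), "MB of WAL")
    # Hypothetical: pg_xlog sits on a 20 GB (20480 MB) partition.
    print(segments_until_full(20480), "segments until the disk is full")
```

With those made-up figures, the backlog would outgrow the partition
well before 2000 segments had been written.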

Starvation of the checkpoint start lock has been witnessed previously,
so if you're running 8.2 or earlier that could be a possible
explanation here. What can happen is that a checkpoint is triggered yet
the bgwriter needs to wait to get access to the CheckpointStartLock. I
once witnessed a starvation of 3 minutes while testing a server running
at max velocity with 200 users, in 2006. I assumed that was an outlier,
but it's possible for that to be longer. I wouldn't expect it to be much
longer, though. That was patched in 8.3 as a result.

Anyway, either of those factors, or their combination, plus a small
pg_xlog disk would be sufficient to explain the crash and the WAL file
buildup.

--
Simon Riggs
2ndQuadrant http://www.2ndQuadrant.com
