Re: Checkpoint throttling issues

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>
Subject: Re: Checkpoint throttling issues
Date: 2015-10-22 13:52:25
Message-ID: CA+TgmobZzOkSALE6sHNqO9hrL8Hj=u7VQFx3OpWC8n9zLWBgfg@mail.gmail.com
Lists: pgsql-hackers

On Mon, Oct 19, 2015 at 6:10 AM, Andres Freund <andres(at)anarazel(dot)de> wrote:
> 1) The progress passed to CheckpointWriteDelay() will often be wrong -
> it's calculated as num_written / num_to_write, but num_written is only
> incremented if the buffer hasn't since independently been written
> out. That's bad because it means we'll think we're further and
> further behind if there's independent writeout activity.
>
> Simple enough to fix, we gotta split num_written into num_written
> (for stats purposes) and num_processed (for progress).
>
> This is pretty much a bug, but I'm slightly worried about
> backpatching a fix because it can have a rather noticeable
> behavioural impact.

I think this is an algorithmic improvement, not a bug fix. Actually,
I don't really think any of these things are bugs, properly considered
- they all look pretty intentional to me, even if we no longer agree
with the reasoning. Maybe some of them could be back-patched anyway,
but at any rate I definitely wouldn't backpatch this or #3, because
even though changing this is probably better on average, it's hard
to be sure that it won't be worse for somebody. In the back-branches,
I think stability takes priority over improvements.
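
For reference, the split proposed in the quoted text could be sketched
roughly as below. This is only an illustration and not the actual
BufferSync() code: the struct and the helpers ckpt_visit_buffer() and
ckpt_progress() are hypothetical names. Only num_written, num_processed
and num_to_write come from the discussion above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the counters kept while syncing buffers. */
typedef struct CkptProgress
{
    int num_to_write;   /* buffers flagged for this checkpoint */
    int num_written;    /* buffers the checkpoint itself wrote (stats) */
    int num_processed;  /* buffers visited, written or not (progress) */
} CkptProgress;

/*
 * Account for one flagged buffer.  still_dirty is false when some other
 * process has already written the buffer out since it was flagged.
 */
static void
ckpt_visit_buffer(CkptProgress *p, bool still_dirty)
{
    assert(p != NULL);
    if (still_dirty)
        p->num_written++;   /* counts only writes done by the checkpoint */
    p->num_processed++;     /* always advances, so progress never stalls */
}

/* Fraction of the checkpoint completed, in the range 0.0 to 1.0. */
static double
ckpt_progress(const CkptProgress *p)
{
    assert(p != NULL);
    if (p->num_to_write <= 0)
        return 1.0;         /* nothing to do counts as finished */
    return (double) p->num_processed / p->num_to_write;
}
```

With this shape, a buffer that was written out independently still moves
the progress estimate forward, while the write statistics stay unchanged.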

> I think the sleep time should be computed adaptively based on the
> number of buffers remaining and the remaining time. There's probably
> better formulations, but that seems like an easy enough improvement
> and considerably better than now.

One thing to keep in mind here is that somebody did work a few years
ago to reduce the number of wake-ups per second that PostgreSQL
generates when idle. Now obviously getting the checkpointing behavior
correct is more important, and obviously also the system is not idle
if we're checkpointing, but it's something to keep in mind. I like
the idea of an adaptive sleep time.
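
To make the trade-off concrete, here is a minimal sketch of one possible
adaptive formulation, assuming the sleep is simply the remaining time
divided by the remaining buffers. The function adaptive_sleep_ms() and
the MIN_SLEEP_MS and MAX_SLEEP_MS bounds are hypothetical and their
values are arbitrary. Nothing here is taken from the PostgreSQL source.

```c
#include <assert.h>

/* Hypothetical bounds; the values are placeholders for illustration. */
#define MIN_SLEEP_MS 10     /* shorter naps are skipped to limit wake-ups */
#define MAX_SLEEP_MS 1000   /* cap so the loop still wakes up regularly */

/*
 * Return how long to sleep (in ms) after writing one buffer, given the
 * buffers still to write and the time left in the checkpoint window.
 */
static long
adaptive_sleep_ms(int buffers_remaining, long ms_remaining)
{
    long per_buffer;

    assert(buffers_remaining >= 0);

    /* Nothing left to write, or already out of time: do not sleep. */
    if (buffers_remaining == 0 || ms_remaining <= 0)
        return 0;

    /* Spread the remaining time evenly over the remaining buffers. */
    per_buffer = ms_remaining / buffers_remaining;

    /*
     * A very small budget means many buffers and little time.  Skipping
     * the sleep lets several writes happen back to back instead of
     * waking up once per buffer.
     */
    if (per_buffer < MIN_SLEEP_MS)
        return 0;

    return per_buffer > MAX_SLEEP_MS ? MAX_SLEEP_MS : per_buffer;
}
```

For example, 100 buffers with 30 seconds left gives a 300 ms sleep per
buffer, while 10000 buffers in the same window gives no sleep at all
until the per-buffer budget grows past the minimum.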

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
