Re: Hard limit on WAL space used (because PANIC sucks)

From: Bernd Helmle <mailings(at)oopsware(dot)de>
To: Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Hard limit on WAL space used (because PANIC sucks)
Date: 2013-06-07 12:18:46
Message-ID: F3D04ABBF3FB984BCC98C06D@apophis.local
Lists: pgsql-hackers

--On 6 June 2013 16:25:29 -0700 Josh Berkus <josh(at)agliodbs(dot)com> wrote:

> Archiving
> ---------
>
> In some ways, this is the simplest case. Really, we just need a way to
> know when the available WAL space has become 90% full, and abort
> archiving at that stage. Once we stop attempting to archive, we can
> clean up the unneeded log segments.
>
> What we need is a better way for the DBA to find out that archiving is
> falling behind when it first starts to fall behind. Tailing the log and
> examining the rather cryptic error messages we give out isn't very
> effective.

Slightly OT, but I always wondered whether we could create a function, say

pg_last_xlog_removed()

for example, returning a value that can be used to calculate the distance
to the current WAL position. A growing distance could then be used to
instruct monitoring to throw a warning if a certain threshold is exceeded.
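
A minimal sketch of such a check, to make the idea concrete. It assumes the
proposed pg_last_xlog_removed() would return a position in the same
'hi/lo' hex text form as the existing xlog location functions, and that
the position can be treated as a plain 64-bit number. The helper names and
the threshold are hypothetical and only serve as illustration.

```python
def parse_lsn(lsn_text):
    """Convert an xlog position in 'hi/lo' hex text form to an integer."""
    high, low = lsn_text.strip().split("/")
    # Assumption: the position is a plain 64-bit value (high word, low word).
    return (int(high, 16) << 32) | int(low, 16)


def wal_distance_bytes(current_lsn, last_removed_lsn):
    """Distance between the current position and the last removed one."""
    return parse_lsn(current_lsn) - parse_lsn(last_removed_lsn)


def check_wal_distance(current_lsn, last_removed_lsn, warn_bytes):
    """Return ('WARNING' or 'OK', distance) for a monitoring system."""
    distance = wal_distance_bytes(current_lsn, last_removed_lsn)
    status = "WARNING" if distance > warn_bytes else "OK"
    return status, distance


if __name__ == "__main__":
    # Both positions are made-up sample values; in practice they would come
    # from the server, the second one from the hypothetical new function.
    print(check_wal_distance("0/3000000", "0/1000000", 16 * 1024 * 1024))
```

A monitoring job would call check_wal_distance() periodically and alert
when the status turns to WARNING.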

I've also seen people write monitoring scripts that look into
archive_status, count the .ready files, and give a warning if the count
exceeds an expected maximum value.
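
Such a script might look roughly like the sketch below. The function names
and the threshold are hypothetical; the directory passed in is assumed to
be the archive_status directory of the cluster being monitored.

```python
import os


def count_ready_files(archive_status_dir):
    """Count the segments still waiting to be archived (*.ready files)."""
    return sum(
        1 for name in os.listdir(archive_status_dir) if name.endswith(".ready")
    )


def archive_backlog_status(archive_status_dir, max_ready):
    """Return ('WARNING' or 'OK', count) depending on the backlog size."""
    ready = count_ready_files(archive_status_dir)
    status = "WARNING" if ready > max_ready else "OK"
    return status, ready
```

The expected maximum depends on the workload, so max_ready is left for the
caller to choose.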

I haven't looked at the code very deeply, but I think we already store the
position of the last removed xlog in shared memory; maybe this can be used
somehow. AFAIK, we do cleanup only during checkpoints, so all of this may
come with too much delay...

--
Thanks

Bernd
