BUG #14416: checkpoints never completed

From: jdnelson(at)dyn(dot)com
To: pgsql-bugs(at)postgresql(dot)org
Subject: BUG #14416: checkpoints never completed
Date: 2016-11-07 18:27:07
Message-ID: 20161107182707.1393.91785@wrigleys.postgresql.org
Lists: pgsql-bugs

The following bug has been logged on the website:

Bug reference: 14416
Logged by: Jon Nelson
Email address: jdnelson(at)dyn(dot)com
PostgreSQL version: 9.4.9
Operating system: Linux
Description:

PostgreSQL 9.4.9 on CentOS-7, x86_64, on ext4.

We encountered a problem where checkpoints appeared to queue up and never
finish.

The logs show that checkpoints were happening normally, once per minute
(which is our desired configuration in this case).

01:08:27 the system clock was synchronized. It was behind by about one
hour.
01:23:31 a backup (pg_start_backup) is started
03:40:14 we saw our first “out of disk space” message
03:47:51 we started seeing “checkpoint starting: time” *once per second*,
with no corresponding “checkpoint complete” messages. We are also out of
disk space at this time.
04:00:01 we see a ‘checkpoint complete’ message (despite still being out of
disk space!)

Checkpoints appear normal until 04:07:01, at which point the “checkpoint
starting: time” message occurs once per *second*. Still out of disk space.

05:51:13 we see the first “checkpoint complete”.
05:51:22 (9 seconds later) we see a “checkpoint starting: time” message. No
messages containing "checkpoint" appear after this.

07:43:54 we see an error from pg_stop_backup() saying that a backup is not
in progress. NOTE: we do *not* see a successful pg_stop_backup(), but it’s
possible that it took less than 500ms, which is our
log_min_duration_statement threshold.

23:19:32 our attempts at a manual CHECKPOINT all hang, and the only messages
with ‘checkpoint’ in them are notices of our cancellations.

We chose to shut the instance down. Over 400GB of WAL files had accrued due
to no checkpoint completing. The instance came up without issue and resumed
operations.
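
The pattern described above (repeated “checkpoint starting” messages with no
“checkpoint complete” in between) can be checked for mechanically. What
follows is only a minimal Python sketch, not something taken from this
incident: the function name and the sample lines are hypothetical, and it
assumes the server log contains the phrases "checkpoint starting:" and
"checkpoint complete:" as plain substrings, regardless of log_line_prefix.

```python
def summarize_checkpoints(lines):
    """Count checkpoint start/complete messages in an iterable of log lines.

    Returns (starts, completes, longest_run), where longest_run is the
    largest number of consecutive "checkpoint starting" messages seen
    without a "checkpoint complete" message between them.
    """
    starts = 0
    completes = 0
    run = 0
    longest_run = 0
    for line in lines:
        if "checkpoint starting:" in line:
            starts += 1
            run += 1
            longest_run = max(longest_run, run)
        elif "checkpoint complete:" in line:
            completes += 1
            run = 0  # a completed checkpoint ends the current run of starts
    return starts, completes, longest_run


# Hypothetical sample lines, for illustration only (not from the real log).
sample = [
    "LOG:  checkpoint starting: time",
    "LOG:  checkpoint complete: wrote 10 buffers",
    "LOG:  checkpoint starting: time",
    "LOG:  checkpoint starting: time",
    "LOG:  checkpoint starting: time",
]
print(summarize_checkpoints(sample))
```

In a healthy log longest_run should stay at 1. A value much larger than that
would match the once-per-second “checkpoint starting” behavior reported
here.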
