Re: Checkpoint not retrying failed fsync?

From: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
To: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Checkpoint not retrying failed fsync?
Date: 2018-04-05 23:17:48
Lists: pgsql-hackers

On Fri, Apr 6, 2018 at 10:16 AM, Andrew Gierth
<andrew(at)tao11(dot)riddles(dot)org(dot)uk> wrote:
> Furthermore, checking the trace output from the checkpointer process, it
> is not even attempting an fsync of the failing file; this isn't like the
> Linux fsync issue, I've confirmed that fsync will repeatedly fail on the
> file until the underlying errors stop.

Thank you for confirming that! Now, how does one go about buying
shares in FreeBSD?

> As far as I can tell from reading the code, if a checkpoint fails the
> checkpointer is supposed to keep all the outstanding fsync requests for
> next time. Am I wrong, or is there some failure in the logic to do this?

Yikes. I think this is suspicious:

 * The bitmap manipulations are slightly tricky, because we can call
 * AbsorbFsyncRequests() inside the loop and that could result in
 * bms_add_member() modifying and even re-palloc'ing the bitmapsets.
 * This is okay because we unlink each bitmapset from the hashtable
 * entry before scanning it.  That means that any incoming fsync
 * requests will be processed now if they reach the table before we
 * begin to scan their fork.

Why is it OK to unlink the bitmapset? We still need its contents, in
the case that the fsync fails!
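
To make the concern concrete, here is a minimal, self-contained sketch. It is not the PostgreSQL code: `PendingEntry`, `fake_fsync()`, `sync_unlink_first()` and `sync_restore_on_failure()` are hypothetical names invented for illustration, and a plain bitmask stands in for the Bitmapset. The first function detaches the pending set before scanning, so a failed fsync leaves nothing behind to retry. The second shows one possible way to keep the requests (merge the unprocessed bits back on failure); it is an assumption about a fix, not a description of what the tree does.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a hashtable entry: one bit per segment
 * number that still needs an fsync (not PostgreSQL's Bitmapset). */
typedef struct
{
    unsigned long pending;
} PendingEntry;

/* Hypothetical fsync stand-in: fails for any segment whose bit is
 * set in fail_mask, succeeds otherwise. */
static bool
fake_fsync(int segno, unsigned long fail_mask)
{
    return (fail_mask & (1UL << segno)) == 0;
}

/* Pattern the quoted comment describes: unlink the set from the entry,
 * then scan the detached copy.  If an fsync fails we bail out, and the
 * detached requests are simply forgotten. */
static bool
sync_unlink_first(PendingEntry *entry, unsigned long fail_mask)
{
    unsigned long requests = entry->pending;

    entry->pending = 0;         /* "unlink" the set from the entry */
    for (int segno = 0; segno < 32; segno++)
    {
        if ((requests & (1UL << segno)) == 0)
            continue;
        if (!fake_fsync(segno, fail_mask))
            return false;       /* entry->pending is already empty */
    }
    return true;
}

/* One possible alternative (an assumption, not the actual fix): on
 * failure, merge the failed and not-yet-scanned requests back into the
 * entry, keeping anything that was absorbed in the meantime. */
static bool
sync_restore_on_failure(PendingEntry *entry, unsigned long fail_mask)
{
    unsigned long requests = entry->pending;

    entry->pending = 0;
    for (int segno = 0; segno < 32; segno++)
    {
        unsigned long bit = 1UL << segno;

        if ((requests & bit) == 0)
            continue;
        if (!fake_fsync(segno, fail_mask))
        {
            entry->pending |= requests;     /* keep for the next checkpoint */
            return false;
        }
        requests &= ~bit;       /* this segment is safely on disk */
    }
    return true;
}
```

With segments 0 and 2 pending and segment 2 failing, the first variant returns false with an empty entry, while the second returns false with segment 2 still recorded for a later retry.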

Thomas Munro
