Re: Checkpoint not retrying failed fsync?

From: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>
To: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Checkpoint not retrying failed fsync?
Date: 2018-04-05 23:34:30
Message-ID: 87tvspi652.fsf@news-spur.riddles.org.uk
Lists: pgsql-hackers

>>>>> "Thomas" == Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com> writes:

>> As far as I can tell from reading the code, if a checkpoint fails the
>> checkpointer is supposed to keep all the outstanding fsync requests for
>> next time. Am I wrong, or is there some failure in the logic to do this?

Thomas> Yikes. I think this is suspicious:

Yes, tracing through a checkpoint shows that this is clearly wrong.

Thomas> Why is it OK to unlink the bitmapset? We still need its
Thomas> contents, in the case that the fsync fails!

Right.

But I don't think just copying the value is sufficient; if a new bit was
set while we were processing the old ones, how would we know which to
clear? We couldn't just clear all the bits afterwards because then we
might lose a request.
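
To make the problem concrete, here is a minimal sketch of one possible approach. It is not the actual md.c code and not a proposed patch. It assumes a single uint64_t standing in for the Bitmapset, and every name in it (PendingRequests, remember_request, process_requests, demo_sync) is hypothetical. The idea is to clear each bit individually just before its fsync and put it back only if that fsync fails, so a bit set while the pass is running is never wiped by a blanket clear.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a set of pending fsync requests:
 * bit n set means "segment n still needs an fsync".
 * This is NOT PostgreSQL's Bitmapset, only an illustration. */
typedef struct PendingRequests
{
    uint64_t bits;
} PendingRequests;

/* Record a request; may be called while a sync pass is in progress. */
static void
remember_request(PendingRequests *p, unsigned segno)
{
    assert(segno < 64);
    p->bits |= UINT64_C(1) << segno;
}

/* Hypothetical fsync callback: returns true on success. */
typedef bool (*sync_fn) (unsigned segno, void *arg);

/*
 * Process the requests that were pending when the pass began.
 * Each bit is cleared just before its own fsync and restored if that
 * fsync fails. Bits set during the pass are left alone, so neither a
 * failed request nor a newly arrived one is lost.
 * Returns true only if every attempted fsync succeeded.
 */
static bool
process_requests(PendingRequests *p, sync_fn do_sync, void *arg)
{
    uint64_t snapshot = p->bits;    /* requests visible at start of pass */
    bool     all_ok = true;

    for (unsigned segno = 0; segno < 64; segno++)
    {
        uint64_t mask = UINT64_C(1) << segno;

        if ((snapshot & mask) == 0)
            continue;               /* not pending when the pass began */

        p->bits &= ~mask;           /* clear first ... */
        if (!do_sync(segno, arg))
        {
            p->bits |= mask;        /* ... and restore on failure */
            all_ok = false;
        }
    }
    return all_ok;
}

/* Demo callback (hypothetical): fails for one segment and simulates a
 * new request arriving in the middle of the pass. */
typedef struct DemoSync
{
    PendingRequests *pending;
    int fail_segno;     /* segment whose fsync fails, or -1 for none */
    int arrive_segno;   /* request injected on first call, or -1 */
} DemoSync;

static bool
demo_sync(unsigned segno, void *arg)
{
    DemoSync *d = (DemoSync *) arg;

    if (d->arrive_segno >= 0)
    {
        remember_request(d->pending, (unsigned) d->arrive_segno);
        d->arrive_segno = -1;       /* inject only once */
    }
    return (int) segno != d->fail_segno;
}
```

With segments 1 and 3 pending, a failure on segment 3, and a request for segment 5 arriving mid-pass, the set ends up holding 3 and 5 for the next attempt. Clearing everything afterwards would have dropped both.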

--
Andrew (irc:RhodiumToad)
