Re: Postgres, fsync, and OSs (specifically linux)

From: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
To: Craig Ringer <craig(at)2ndquadrant(dot)com>
Cc: Asim R P <apraveen(at)pivotal(dot)io>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Postgres, fsync, and OSs (specifically linux)
Date: 2018-09-28 09:37:29
Message-ID: CAEepm=0jipmB3NUy8T8Y98V=Aon8muCdEDvefnXLgaCRFgBGNw@mail.gmail.com
Lists: pgsql-hackers

On Thu, Aug 30, 2018 at 2:44 PM Craig Ringer <craig(at)2ndquadrant(dot)com> wrote:
> On 15 August 2018 at 07:32, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com> wrote:
>> I will soon post some more fix-up patches that add EXEC_BACKEND
>> support, Windows support, and a counting scheme to fix the timing
>> issue that I mentioned in my first review. I will probably squash it
>> all down to a tidy patch-set after that.

I went down a bit of a rabbit hole with the Windows support for
Andres's patch set. I have something that works as far as I can tell,
but my Windows environment consists of throwing things at Appveyor and
seeing what sticks, so I'm hoping that someone with a real Windows
system and knowledge will be able to comment.

New patches in this WIP patch set:

0012: Fix for EXEC_BACKEND.

0013: Windows. This involved teaching latch.c to deal with Windows
asynchronous IO events, since you can't wait for pipe readiness via
WSAEventSelect. Pipes and sockets exist in different dimensions on
Windows, and there are no "Unix" domain sockets (well, there are but
they aren't usable yet[1]). An alternative would be to use TCP
sockets for this, and then the code would look more like the Unix
code, but that seems a bit strange. Note that the Windows version
doesn't actually hand off file handles like the Unix code (it could
fairly easily, but there is no reason to think that would actually be
useful on that platform). I may be way off here...

The 0013 patch also fixes a mistake in the 0010 patch: it is not
appropriate to call CFI() while waiting to notify the checkpointer of
a dirty segment, because then ^C could cause the following checkpoint
not to flush dirty data. SendFsyncRequest() is essentially blocking,
except that it uses non-blocking IO so that it can multiplex postmaster
death detection.
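
To illustrate the "essentially blocking, but built on non-blocking IO" idea,
here is a minimal POSIX sketch. It is not the code from the patch: the names
send_request_blocking(), set_nonblocking() and death_fd are hypothetical, and
a readable death_fd merely stands in for postmaster death detection. The loop
deliberately has no interrupt check, so the caller cannot abandon a request
half-way.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: put an fd into non-blocking mode. */
static int
set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL);

    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/*
 * Hypothetical stand-in for a blocking send over a non-blocking pipe.
 * Writes the whole buffer, sleeping in poll() whenever the pipe is full.
 * Returns 0 once everything is written, or -1 if death_fd becomes readable
 * (our stand-in for postmaster death) or a hard error occurs.
 */
static int
send_request_blocking(int pipe_fd, int death_fd, const void *buf, size_t len)
{
    const char *p = buf;

    assert(buf != NULL);

    while (len > 0)
    {
        struct pollfd fds[2];
        ssize_t n = write(pipe_fd, p, len);

        if (n > 0)
        {
            p += n;
            len -= (size_t) n;
            continue;
        }
        if (n < 0 && errno == EINTR)
            continue;           /* retry; no interrupt processing here */
        if (n < 0 && errno != EAGAIN && errno != EWOULDBLOCK)
            return -1;          /* hard error */

        /* Pipe is full: wait for space or for the death signal. */
        fds[0].fd = pipe_fd;
        fds[0].events = POLLOUT;
        fds[0].revents = 0;
        fds[1].fd = death_fd;
        fds[1].events = POLLIN;
        fds[1].revents = 0;

        if (poll(fds, 2, -1) < 0)
        {
            if (errno == EINTR)
                continue;
            return -1;
        }
        if (fds[1].revents & (POLLIN | POLLHUP | POLLERR))
            return -1;          /* parent went away; give up */
    }
    return 0;
}
```

The only ways out of the loop are "request fully sent" or "parent died",
which is the property that matters for the CFI() fix described above.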

0014: Fix the ordering race condition mentioned upthread[2]. All
files are assigned an increasing sequence number after [re]opening (i.e.
before their first write), so that the checkpointer process can track
the fd that must have the oldest Linux f_wb_err that could be relevant
for writes done by PostgreSQL.
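
As a rough illustration of the counting scheme, the sketch below hands out a
sequence number at open time and lets the receiving side retain whichever
descriptor carries the lowest number. All names here (TrackedFile, OldestFd,
track_open(), remember_oldest()) are hypothetical and not taken from the
patch, and the plain static counter is an assumption made for brevity; a
counter used by several processes would need to live in shared memory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* A file descriptor tagged with the order in which it was opened. */
typedef struct TrackedFile
{
    int      fd;
    uint64_t seq;
} TrackedFile;

/* The descriptor currently retained for one segment. */
typedef struct OldestFd
{
    bool        valid;
    TrackedFile file;
} OldestFd;

/* Assumption: process-local counter; the real thing would be shared. */
static uint64_t next_open_seq = 1;

/* Tag a freshly [re]opened fd before any write is issued through it. */
static TrackedFile
track_open(int fd)
{
    TrackedFile f;

    f.fd = fd;
    f.seq = next_open_seq++;
    return f;
}

/*
 * Keep the candidate only if it was opened earlier than the one already
 * retained.  Returns true if the slot now holds the candidate.  A real
 * caller would close whichever descriptor ends up not being retained.
 */
static bool
remember_oldest(OldestFd *slot, TrackedFile candidate)
{
    assert(slot != NULL);

    if (!slot->valid || candidate.seq < slot->file.seq)
    {
        slot->file = candidate;
        slot->valid = true;
        return true;
    }
    return false;
}
```

The reasoning behind preferring the lowest number is that the earliest-opened
descriptor was opened before any of the writes in question, so its view of
write-back errors covers all of them.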

The other patches in this tarball are all as posted already, but are
now rebased and assembled in one place. Also pushed to
https://github.com/macdice/postgres/tree/fsyncgate .

Thoughts?

[1] https://blogs.msdn.microsoft.com/commandline/2017/12/19/af_unix-comes-to-windows/
[2] https://www.postgresql.org/message-id/CAEepm%3D04ZCG_8N3m61kXZP-7Ecr02HUNNG-QsAhwyFLim4su2g%40mail.gmail.com

--
Thomas Munro
http://www.enterprisedb.com

Attachment Content-Type Size
fsyncgate-v3.tgz application/x-gzip 30.7 KB
