Refactoring the checkpointer's fsync request queue

From: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
To: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Shawn Debnath <sdn(at)amazon(dot)com>
Subject: Refactoring the checkpointer's fsync request queue
Date: 2018-10-15 11:02:17
Message-ID: CAEepm=2gTANm=e3ARnJT=n0h8hf88wqmaZxk0JYkxw+b21fNrw@mail.gmail.com
Lists: pgsql-hackers

Hello hackers,

Currently, md.c and checkpointer.c interact in a way that breaks
smgr.c's modularity. That doesn't matter much if md.c is the only
storage manager implementation, but there are now two proposals
to provide new kinds of block storage accessed via the buffer manager:
UNDO and SLRU.

Here is a patch that rips the fsync stuff out of md.c, generalises it
and puts it into a new translation unit smgrsync.c. It can deal with
fsync()ing any files you want at checkpoint time, as long as they can
be described by a SmgrFileTag (a struct type we can extend as needed).
A pathname would work too, but I wanted something small and fixed in
size. It's just a tag that can be converted to a path in case the file
needs to be reopened (e.g. on Windows), but otherwise is used as a hash
table key to merge requests.
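
To make the idea concrete, here is a minimal sketch of a small fixed-size
tag that serves both as a merge key and as something that can be turned
back into a path. Everything in it is hypothetical: DemoFileTag, its
fields, the path layout and the function names are invented for
illustration and are not taken from the patch. A linear scan over a small
array stands in for the hash table.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical fixed-size tag; the fields are illustrative only. */
typedef struct DemoFileTag
{
    uint32_t kind;   /* which storage manager owns the file */
    uint32_t db;     /* database identifier */
    uint32_t rel;    /* relation (or log) identifier */
    uint32_t segno;  /* segment number within the relation */
} DemoFileTag;

#define DEMO_MAX_PENDING 64

/* Stand-in for the hash table of pending fsync requests. */
typedef struct DemoPendingTable
{
    DemoFileTag tags[DEMO_MAX_PENDING];
    int         ntags;
} DemoPendingTable;

/* Compare field by field so struct padding can never matter. */
static int
demo_tag_equal(const DemoFileTag *a, const DemoFileTag *b)
{
    return a->kind == b->kind && a->db == b->db &&
           a->rel == b->rel && a->segno == b->segno;
}

/*
 * Remember a request.  Returns 1 if the tag was newly added, 0 if it was
 * merged with an identical pending request, -1 if the table is full.
 */
static int
demo_remember_request(DemoPendingTable *t, const DemoFileTag *tag)
{
    for (int i = 0; i < t->ntags; i++)
    {
        if (demo_tag_equal(&t->tags[i], tag))
            return 0;
    }
    if (t->ntags >= DEMO_MAX_PENDING)
        return -1;
    t->tags[t->ntags++] = *tag;
    return 1;
}

/*
 * Convert a tag to a path so the file could be reopened later.  The
 * "base/<db>/<rel>[.<segno>]" layout is an assumption made for this demo.
 */
static int
demo_tag_to_path(const DemoFileTag *tag, char *buf, size_t buflen)
{
    assert(buf != NULL && buflen > 0);
    if (tag->segno == 0)
        return snprintf(buf, buflen, "base/%u/%u",
                        (unsigned) tag->db, (unsigned) tag->rel);
    return snprintf(buf, buflen, "base/%u/%u.%u",
                    (unsigned) tag->db, (unsigned) tag->rel,
                    (unsigned) tag->segno);
}
```

Submitting the same tag twice leaves a single pending entry, which is the
merging behaviour described above, and the path is only built when a file
actually has to be reopened.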

There is one major fly in the ointment: fsyncgate[1]. Originally I
planned to propose a patch on top of that one, but it's difficult --
both patches move a lot of the same stuff around. Personally, I don't
think it would be a very good idea to back-patch that anyway. It'd be
riskier than the problem it aims to solve, in terms of bugs and
hard-to-foresee portability problems IMHO. I think we should consider
back-patching some variant of Craig Ringer's PANIC patch, and consider
this redesigned approach for future releases.

So, please find attached the WIP patch that I would like to propose
for PostgreSQL 12, under a separate Commitfest entry. It incorporates
the fsyncgate work by Andres Freund (original file descriptor transfer
POC) and me (many bug fixes and improvements), and the refactoring
work as described above.

It can be compiled in two modes: with the macro
CHECKPOINTER_TRANSFER_FILES defined, it sends fds to the checkpointer,
but if you comment out that macro definition for testing, or build on
Windows, it reverts to a mode that reopens files in the checkpointer.
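
As a rough illustration of the two build modes, the sketch below switches
at compile time between syncing a descriptor that arrived with the request
and reopening the file by path. Only the macro name comes from this
message; DemoSyncRequest and demo_process_request are hypothetical, and
the sketch assumes a POSIX system. It does not show how descriptors are
actually transferred between processes.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Comment this out to exercise the reopen-by-path mode instead. */
#define CHECKPOINTER_TRANSFER_FILES

/* Hypothetical request as seen by the checkpointer in this demo. */
typedef struct DemoSyncRequest
{
    const char *path;  /* where the file can be reopened if needed */
    int         fd;    /* descriptor handed over by a backend, or -1 */
} DemoSyncRequest;

/* Sync one request; returns 0 on success, -1 on failure. */
static int
demo_process_request(const DemoSyncRequest *req)
{
#ifdef CHECKPOINTER_TRANSFER_FILES
    /* Transfer mode: the descriptor came along with the request. */
    assert(req->fd >= 0);
    return fsync(req->fd);
#else
    /* Reopen mode: open the file by path, sync it, and close it again. */
    int fd = open(req->path, O_RDWR);
    int rc;

    if (fd < 0)
        return -1;
    rc = fsync(fd);
    close(fd);
    return rc;
#endif
}
```

Keeping both branches behind one function means callers do not need to
know which mode was compiled in.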

I'm hoping to find a Windows-savvy collaborator to help finish the
Windows support. Right now it passes make check on AppVeyor, but it
needs to be reviewed and tested on a real system with a small
shared_buffers (installcheck, pgbench, other attempts to break it).
Other than that, there are a couple of remaining XXX notes for small
known details, but I wanted to post this version now.

[1] https://postgr.es/m/20180427222842.in2e4mibx45zdth5%40alap3.anarazel.de

--
Thomas Munro
http://www.enterprisedb.com

Attachment Content-Type Size
0001-Refactor-the-checkpointer-request-queue.patch application/octet-stream 112.8 KB
