Re: Two fsync related performance issues?

From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: Paul Guo <pguo(at)pivotal(dot)io>
Cc: PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Two fsync related performance issues?
Date: 2020-09-03 00:09:29
Message-ID: CA+hUKGLKC1OeDKie6ikqKZSZbkZbiQhhSsDZL8upHLJvtgaXyg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, May 12, 2020 at 12:43 PM Paul Guo <pguo(at)pivotal(dot)io> wrote:
> 2. CheckPointTwoPhase()
>
> This may be a small issue.
>
> See the code below,
>
> for (i = 0; i < TwoPhaseState->numPrepXacts; i++)
> RecreateTwoPhaseFile(gxact->xid, buf, len);
>
> RecreateTwoPhaseFile() writes a state file for a prepared transaction and does fsync. It might be good to do fsync for all files once after writing them, given the kernel is able to do asynchronous flush when writing those file contents. If the TwoPhaseState->numPrepXacts is large we could do batching to avoid the fd resource limit. I did not test them yet but this should be able to speed up checkpoint/restartpoint a bit.

Hi Paul,

I hadn't previously focused on this second part of your email. I
think the fsync() call in RecreateTwoPhaseFile() might be a candidate
for processing by the checkpoint code through the new facilities in
sync.c, which effectively does something like what you describe. Take
a look at the code I'm proposing for slru.c in
https://commitfest.postgresql.org/29/2669/ and also in md.c. You'd
need a way to describe the path of these files in a FileTag struct, so
you can defer the work; it will invoke your callback to sync the file
as part of the next checkpoint (or panic if it can't). You also need
to make sure to tell it to forget the sync request before you unlink
the file. Although it still has to call fsync one-by-one on the files
at checkpoint time, by deferring it until then there is a good chance
the kernel has already done the work so you don't have to go off-CPU
at all. I hope that means we'd often never have to perform the fsync
at all, because the file is usually gone before we reach the
checkpoint. Do I have that right?

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Jeff Davis 2020-09-03 00:18:23 Re: Disk-based hash aggregate's cost model
Previous Message Tatsuro Yamada 2020-09-02 23:45:17 Re: list of extended statistics on psql