Re: Two fsync related performance issues?

From: Craig Ringer <craig(at)2ndquadrant(dot)com>
To: Paul Guo <pguo(at)pivotal(dot)io>
Cc: PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Two fsync related performance issues?
Date: 2020-05-26 12:30:49
Message-ID: CAMsr+YH0qzia9yDGsDgsCM8du_HZtiP8utpZnhCHBJ_RrA+ZSw@mail.gmail.com
Lists: pgsql-hackers

On Tue, 12 May 2020, 08:42 Paul Guo, <pguo(at)pivotal(dot)io> wrote:

> Hello hackers,
>
> 1. StartupXLOG() fsyncs the whole data directory early in crash
> recovery. I'm wondering if we could skip some directories (at least
> pg_log/ and the table directories), since WAL etc. could ensure
> consistency. Here is the related code.
>
>     if (ControlFile->state != DB_SHUTDOWNED &&
>         ControlFile->state != DB_SHUTDOWNED_IN_RECOVERY)
>     {
>         RemoveTempXlogFiles();
>         SyncDataDirectory();
>     }
>

This would actually be a good candidate for a thread pool. Dispatch sync
requests and don't wait. Come back later when they're done.

I'm unsure whether that's feasible at all, though, given that pretty much
none of the Pg APIs are thread safe: no palloc, no elog/ereport, etc. The
process-based alternative doesn't look workable either, since I don't think
we're ready to run bgworkers or use shm_mq etc. at that stage.
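
To make the "dispatch and come back later" idea concrete, here is a minimal sketch using plain POSIX threads. It is not PostgreSQL code: SyncBatch, sync_batch_start() and sync_batch_finish() are hypothetical names, and the workers deliberately touch only libc (open/fsync/close) so that nothing thread-unsafe such as palloc or elog is called off the main thread. Errors are only counted, on the assumption that the caller reports them after joining.

```c
#include <assert.h>
#include <fcntl.h>
#include <pthread.h>
#include <unistd.h>

#define MAX_SYNC_WORKERS 8      /* hypothetical upper bound on pool size */

/* Hypothetical work queue: an array of paths plus a shared cursor. */
typedef struct SyncBatch
{
    const char    **paths;      /* files to flush */
    int             npaths;
    int             next;       /* next unclaimed index, guarded by lock */
    int             nfailed;    /* paths that could not be synced */
    int             nworkers;   /* threads actually started */
    pthread_mutex_t lock;
    pthread_t       workers[MAX_SYNC_WORKERS];
} SyncBatch;

/* Worker: claim paths one at a time and fsync them, using libc only. */
static void *
sync_worker(void *arg)
{
    SyncBatch  *batch = arg;

    for (;;)
    {
        int         idx, fd, failed;

        pthread_mutex_lock(&batch->lock);
        idx = (batch->next < batch->npaths) ? batch->next++ : -1;
        pthread_mutex_unlock(&batch->lock);
        if (idx < 0)
            break;              /* queue drained */

        /* O_RDONLY is enough for fsync on Linux; other platforms may differ. */
        fd = open(batch->paths[idx], O_RDONLY);
        failed = (fd < 0 || fsync(fd) != 0);
        if (fd >= 0)
            close(fd);

        if (failed)
        {
            pthread_mutex_lock(&batch->lock);
            batch->nfailed++;
            pthread_mutex_unlock(&batch->lock);
        }
    }
    return NULL;
}

/* Dispatch the sync requests and return without waiting. 0 on success. */
static int
sync_batch_start(SyncBatch *batch, const char **paths, int npaths, int nworkers)
{
    int         i;

    assert(nworkers > 0 && nworkers <= MAX_SYNC_WORKERS);
    batch->paths = paths;
    batch->npaths = npaths;
    batch->next = 0;
    batch->nfailed = 0;
    batch->nworkers = 0;
    pthread_mutex_init(&batch->lock, NULL);

    for (i = 0; i < nworkers; i++)
    {
        if (pthread_create(&batch->workers[i], NULL, sync_worker, batch) != 0)
            break;
        batch->nworkers++;
    }
    return (batch->nworkers > 0) ? 0 : -1;
}

/* Come back later: wait for the workers, return the number of failures. */
static int
sync_batch_finish(SyncBatch *batch)
{
    int         i;

    for (i = 0; i < batch->nworkers; i++)
        pthread_join(batch->workers[i], NULL);
    pthread_mutex_destroy(&batch->lock);
    return batch->nfailed;
}
```

The caller would run sync_batch_start(), carry on with other startup work, and call sync_batch_finish() before it relies on the data being durable. Walking the directory tree and reporting errors would stay on the main thread, where the usual Pg facilities are safe to use.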

Of course if OSes would provide asynchronous IO interfaces that weren't
utterly vile and broken, we wouldn't have to worry...

>
> RecreateTwoPhaseFile() writes a state file for a prepared transaction and
> fsyncs it. It might be better to fsync all the files once after writing
> them, given that the kernel can flush asynchronously while those file
> contents are being written. If TwoPhaseState->numPrepXacts is large we
> could batch the work to stay under the fd resource limit. I have not
> tested this yet, but it should speed up checkpoint/restartpoint a bit.
>
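
The write-everything-then-fsync pattern proposed above could look roughly like the sketch below. It is an illustration only, not the actual RecreateTwoPhaseFile() logic (no CRC, no error reporting), and write_files_batched() and MAX_BATCH_FDS are made-up names. Each batch keeps at most batch_size descriptors open, which is the assumed way of staying under the fd limit.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

#define MAX_BATCH_FDS 64        /* hypothetical cap on simultaneously open fds */

/*
 * Write nfiles files in batches: first write every file in the batch,
 * then fsync and close them all. Returns 0 if everything succeeded,
 * -1 if any open/write/fsync/close failed.
 */
static int
write_files_batched(const char **paths, const char **contents,
                    int nfiles, int batch_size)
{
    int         fds[MAX_BATCH_FDS];
    int         start;
    int         rc = 0;

    assert(nfiles >= 0);
    assert(batch_size > 0 && batch_size <= MAX_BATCH_FDS);

    for (start = 0; start < nfiles; start += batch_size)
    {
        int         count = nfiles - start;
        int         i;

        if (count > batch_size)
            count = batch_size;

        /* Pass 1: write without syncing, so the kernel can start flushing. */
        for (i = 0; i < count; i++)
        {
            const char *data = contents[start + i];
            size_t      len = strlen(data);

            fds[i] = open(paths[start + i], O_CREAT | O_TRUNC | O_WRONLY, 0600);
            /* A short write is treated as a failure in this sketch. */
            if (fds[i] < 0 || write(fds[i], data, len) != (ssize_t) len)
                rc = -1;
        }

        /* Pass 2: fsync each file; much of the data may already be on disk. */
        for (i = 0; i < count; i++)
        {
            if (fds[i] < 0)
                continue;
            if (fsync(fds[i]) != 0)
                rc = -1;
            if (close(fds[i]) != 0)
                rc = -1;
        }
    }
    return rc;
}
```

Whether this actually beats fsyncing each file right after writing it depends on how much writeback the kernel manages to do between the two passes, which is the part that still needs measuring.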

I seem to recall some hints we can set on a FD or mmapped range that
encourage dirty buffers to be written without blocking us, too. I'll have
to look them up...
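
One candidate for such a hint, as an assumption about what is meant here rather than a confirmed answer, is Linux's sync_file_range() with SYNC_FILE_RANGE_WRITE, which starts writeback of dirty pages in a range without waiting for it to finish. Where that is unavailable, posix_fadvise() with POSIX_FADV_DONTNEED is a weaker substitute, and msync() with MS_ASYNC plays a similar role for mmapped ranges. The helper name hint_writeback() below is hypothetical. None of these calls gives a durability guarantee, so a real fsync() is still needed afterwards.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* exposes sync_file_range() on glibc */
#endif
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Ask the kernel to start writing back dirty pages of fd in
 * [offset, offset + nbytes) without waiting for completion.
 * nbytes == 0 means "to end of file". Returns 0 on success, -1 on error.
 * This is only a hint: it does NOT replace fsync() for durability.
 */
static int
hint_writeback(int fd, off_t offset, off_t nbytes)
{
    assert(fd >= 0);

#if defined(SYNC_FILE_RANGE_WRITE)
    /* Linux: initiate writeback, do not wait for it. */
    return sync_file_range(fd, offset, nbytes, SYNC_FILE_RANGE_WRITE);
#elif defined(POSIX_FADV_DONTNEED)
    /* Fallback: posix_fadvise returns an error number, not -1/errno. */
    return (posix_fadvise(fd, offset, nbytes, POSIX_FADV_DONTNEED) == 0) ? 0 : -1;
#else
    /* No hint available on this platform; nothing to do. */
    (void) offset;
    (void) nbytes;
    return 0;
#endif
}
```

A caller would issue hint_writeback(fd, 0, 0) right after writing each file and fsync() everything later, by which point the fsync should have little left to flush.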

