Re: fdatasync performance problem with large number of DB files

From: Paul Guo <guopa(at)vmware(dot)com>
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Michael Brown <michael(dot)brown(at)discourse(dot)org>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: fdatasync performance problem with large number of DB files
Date: 2021-03-15 14:30:13
Message-ID: 823DAB20-5A5E-4F7B-BF8C-D61FA574DEEA@vmware.com
Lists: pgsql-hackers

> On 2021/3/15, 7:34 AM, "Thomas Munro" <thomas(dot)munro(at)gmail(dot)com> wrote:

>> On Mon, Mar 15, 2021 at 11:52 AM Thomas Munro <thomas(dot)munro(at)gmail(dot)com> wrote:
>> Time being of the essence, here is the patch I posted last year, this
>> time with a GUC and some docs. You can set sync_after_crash to
>> "fsync" (default) or "syncfs" if you have it.

> Cfbot told me to add HAVE_SYNCFS to Solution.pm, and I fixed a couple of typos.

By the way, there is a common case where we could skip the fsync: a standby generated by pg_rewind/pg_basebackup that has already been fsync-ed.
The state of such a standby is surely not DB_SHUTDOWNED/DB_SHUTDOWNED_IN_RECOVERY, so the
pgdata directory is fsync-ed again during startup when starting those pg instances. We could ask users not to fsync
during pg_rewind and pg_basebackup, but we probably want to fsync just some files in pg_rewind (see [1]), so would it be better
to let the startup process skip the unnecessary fsync? As for the solution, using a GUC or writing something into a file like
backup_label(?) does not seem to be a good idea, since:
1. With a GUC, we still expect an fsync after real crash recovery, so we would need to reset the GUC afterwards and also specify pgoptions in the pg_ctl command.
2. If pg_rewind/pg_basebackup wrote some hint information into a file like backup_label(?), people might
copy the pgdata directory, and then we would still need the fsync.
The only simple solution I can think of is to let the user touch a file as a hint to startup before starting the pg instance.
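
To make the touch-a-file idea concrete, here is a rough sketch of what the startup-side check might look like. The marker file name skip_startup_fsync and the helper consume_skip_fsync_hint() are hypothetical names chosen for illustration; neither exists in PostgreSQL or in the patch under discussion. The hint is removed as it is read, so it applies to one startup only and a later real crash still gets the full fsync.

```c
#include <stdbool.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical marker file name; not an existing PostgreSQL file. */
#define SKIP_FSYNC_HINT_FILE "skip_startup_fsync"

/*
 * Return true if the user-created hint file exists in datadir.
 * The file is removed as part of the check, so the hint is one-shot.
 * Any error (missing file, path too long, permission problem) returns
 * false, which means "do the fsync as usual".
 */
bool
consume_skip_fsync_hint(const char *datadir)
{
    char path[1024];
    int  n = snprintf(path, sizeof(path), "%s/%s", datadir, SKIP_FSYNC_HINT_FILE);

    if (n < 0 || (size_t) n >= sizeof(path))
        return false;           /* path did not fit: stay on the safe side */

    /*
     * unlink() both tests for existence and consumes the hint. A real
     * implementation would also have to make this removal durable before
     * relying on it; that part is omitted here.
     */
    return unlink(path) == 0;
}
```

The user would run something like "touch $PGDATA/skip_startup_fsync" right after pg_rewind/pg_basebackup and before pg_ctl start, and the startup process would skip the data directory fsync only when this function returns true.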

[1] https://www.postgresql.org/message-id/flat/25CFBDF2-5551-4CC3-ADEB-434B6B1BAD16%40vmware.com#734e7dc77f0760a3a64e808476ecc592
