Re: Two fsync related performance issues?

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Paul Guo <pguo(at)pivotal(dot)io>
Cc: PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Two fsync related performance issues?
Date: 2020-05-19 12:50:42
Message-ID: CA+Tgmoaqnmau3LCsOxMvffrqTtQ=w3A+3KewrJ_ebEDekXaYDg@mail.gmail.com
Lists: pgsql-hackers

On Mon, May 11, 2020 at 8:43 PM Paul Guo <pguo(at)pivotal(dot)io> wrote:
> I have this concern since I saw an issue in a real product environment that the startup process needs 10+ seconds to start wal replay after relaunch due to elog(PANIC) (it was seen on postgres based product Greenplum but it is a common issue in postgres also). I highly suspect the delay was mostly due to this. Also it is noticed that on public clouds fsync is much slower than that on local storage so the slowness should be more severe on cloud. If we at least disable fsync on the table directories we could skip a lot of file fsync - this may save a lot of seconds during crash recovery.

I've seen this problem be way worse than that. Running fsync() on all
the files and performing the unlogged table cleanup steps can together
take minutes or, I think, even tens of minutes. What I think sucks
most in this area is that we don't even emit any log messages if the
process takes a long time, so the user has no idea why things are
apparently hanging. I think we really ought to try to figure out some
way to give the user a periodic progress indication when this kind of
thing is underway, so that they at least have some idea what's
happening.
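
To make the progress-indication idea concrete, here is a rough sketch
in plain POSIX C. It is not PostgreSQL code: the function names, the
message text, and the use of fprintf() instead of the server's logging
machinery are all assumptions made for illustration. It fsyncs a list
of files and writes a progress line whenever at least "interval"
seconds have passed since the previous one.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdio.h>
#include <time.h>
#include <unistd.h>

/* Current reading of the monotonic clock, in seconds. */
double now_secs(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double) ts.tv_sec + (double) ts.tv_nsec / 1e9;
}

/*
 * Hypothetical helper: fsync every path in the list, writing a progress
 * line to "log" whenever at least "interval" seconds have passed since
 * the previous report.  Returns the number of files synced successfully.
 */
size_t sync_files_with_progress(const char *const *paths, size_t npaths,
                                double interval, FILE *log)
{
    size_t synced = 0;
    double start = now_secs();
    double last_report = start;

    for (size_t i = 0; i < npaths; i++)
    {
        int fd = open(paths[i], O_RDONLY);

        if (fd >= 0)
        {
            if (fsync(fd) == 0)
                synced++;
            close(fd);
        }

        /* Report only when the interval has elapsed, not per file. */
        double now = now_secs();

        if (now - last_report >= interval)
        {
            fprintf(log,
                    "syncing data directory: %zu of %zu files, %.1f s elapsed\n",
                    i + 1, npaths, now - start);
            last_report = now;
        }
    }
    return synced;
}
```

Checking the clock once per file keeps the overhead small next to the
fsync() call itself, and reporting by elapsed time rather than by file
count means a slow disk still produces regular output.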

As Tom says, I don't think there's any realistic way that we can
disable it altogether, but maybe there's some way we could make it
quicker, like some kind of parallelism, or by overlapping it with
other things. It seems to me that we have to complete the fsync pass
before we can safely checkpoint, but I don't know that it needs to be
done any sooner than that... not sure though.
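
As one possible shape for the parallelism idea, the sketch below
splits a file list across a few child processes, each of which fsyncs
its share. This is only an illustration under stated assumptions: the
names sync_one() and parallel_sync() are hypothetical, it uses plain
fork() rather than anything in the server, and it says nothing about
whether the approach is actually faster on a given storage stack.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* fsync one file; returns 0 on success, -1 on failure. */
int sync_one(const char *path)
{
    int fd = open(path, O_RDONLY);

    if (fd < 0)
        return -1;

    int rc = fsync(fd);

    close(fd);
    return rc;
}

/*
 * Hypothetical helper: split the path list across "nworkers" child
 * processes.  Worker w handles paths w, w + nworkers, w + 2 * nworkers,
 * and so on.  Returns the number of workers that hit at least one
 * failure or could not be started.
 */
int parallel_sync(const char *const *paths, size_t npaths, int nworkers)
{
    int failed_workers = 0;
    int started = 0;

    for (int w = 0; w < nworkers; w++)
    {
        pid_t pid = fork();

        if (pid == 0)
        {
            /* Child: sync this worker's slice, then exit with a status. */
            int failed = 0;

            for (size_t i = (size_t) w; i < npaths; i += (size_t) nworkers)
                if (sync_one(paths[i]) != 0)
                    failed = 1;
            _exit(failed);
        }
        if (pid > 0)
            started++;
        else
            failed_workers++;   /* fork() itself failed */
    }

    /* Parent: the pass is complete only once every worker has exited. */
    for (int k = 0; k < started; k++)
    {
        int status;

        if (wait(&status) < 0 || !WIFEXITED(status) || WEXITSTATUS(status) != 0)
            failed_workers++;
    }
    return failed_workers;
}
```

The parent does not return until all workers are reaped, which matches
the constraint above: whatever runs concurrently, the whole pass has
to be finished before it is safe to checkpoint.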

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
