Re: file cloning in pg_upgrade and CREATE DATABASE

From: Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: file cloning in pg_upgrade and CREATE DATABASE
Date: 2018-03-30 20:15:39
Message-ID: c9c905e4-dae6-6c77-2ad4-b4dc497b6b86@2ndquadrant.com
Lists: pgsql-hackers

On 3/26/18 02:15, Michael Paquier wrote:
> f8c183a has introduced the optimization that your patch is removing,
> which was discussed on this thread:
> https://www.postgresql.org/message-id/flat/4B78906A.7020309%40mark.mielke.cc

Note that that thread is from 2010 and talks about creation of a
database from the standard template being too slow on spinning rust,
because we fsync too often. I think we have moved well past that
problem size.

I have run some more tests on both macOS and Linux with ext4, and my
results are that the bigger the flush distance, the better. Before we
made the adjustments for APFS, we had a flush distance of 64kB; now it's
1MB, or 32MB on macOS. In my tests, I see 256MB as the best across both
platforms, and not flushing early at all is only minimally worse.
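
To make "flush distance" concrete, here is a minimal sketch of a copy loop
that forces data out every flush_distance bytes. It is not PostgreSQL's
actual copy code: the function name, the 64kB I/O buffer, and the use of
fsync() as a portable stand-in for a platform-specific flush hint are all
assumptions made for illustration.

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

#define COPY_BUF_SIZE (64 * 1024)	/* illustrative I/O buffer size */

/*
 * Hypothetical helper: copy src_path to dst_path, calling fsync() on the
 * destination each time at least flush_distance bytes have been written
 * since the previous flush.  flush_distance == 0 means "never flush
 * early".  A final fsync() always runs before returning.
 *
 * Returns the number of bytes copied, or -1 on error.  *early_flushes is
 * set to the number of flushes issued before the final one.
 */
long long
copy_with_flush_distance(const char *src_path, const char *dst_path,
						 size_t flush_distance, int *early_flushes)
{
	char		buf[COPY_BUF_SIZE];
	long long	total = 0;
	size_t		unflushed = 0;
	int			ok = 1;
	int			src;
	int			dst;

	*early_flushes = 0;

	src = open(src_path, O_RDONLY);
	if (src < 0)
		return -1;
	dst = open(dst_path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
	if (dst < 0)
	{
		close(src);
		return -1;
	}

	for (;;)
	{
		ssize_t		nread = read(src, buf, sizeof(buf));
		ssize_t		off = 0;

		if (nread == 0)
			break;				/* end of file */
		if (nread < 0)
		{
			if (errno == EINTR)
				continue;
			ok = 0;
			break;
		}

		/* write() may be partial, so loop until the chunk is out */
		while (off < nread)
		{
			ssize_t		nwritten = write(dst, buf + off,
										 (size_t) (nread - off));

			if (nwritten < 0)
			{
				if (errno == EINTR)
					continue;
				ok = 0;
				break;
			}
			off += nwritten;
		}
		if (!ok)
			break;

		total += nread;
		unflushed += (size_t) nread;

		/* early flush once the configured distance has accumulated */
		if (flush_distance > 0 && unflushed >= flush_distance)
		{
			if (fsync(dst) != 0)
			{
				ok = 0;
				break;
			}
			unflushed = 0;
			(*early_flushes)++;
		}
	}

	/* final flush so the copy is durable regardless of flush_distance */
	if (ok && fsync(dst) != 0)
		ok = 0;

	close(src);
	if (close(dst) != 0)
		ok = 0;

	return ok ? total : -1;
}
```

Timing this with different flush_distance values (for example 64kB, 1MB,
32MB, 256MB, and 0) on the same input file is one way to repeat the kind
of comparison described above. The absolute numbers will depend on the
file system and hardware.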

You can measure this to death, and this obviously doesn't apply equally
on all systems and configurations, but clearly some of the old
assumptions from 8 years ago are no longer applicable.

> I am not much into the internals of copy_file_range, but isn't there a
> risk to have a large range of blocks copied to discard potentially
> useful blocks from the OS cache? That's what this patch makes me worry
> about. Performance is good, but on a system where the OS cache is
> heavily used for a set of hot blocks this could cause performance side
> effects that I think we cannot neglect.

How would we go about assessing that? It's possible, but if
copy_file_range() really blows away all your in-use cache, that would be
surprising.

> Another thing is that 71d6d07 allowed a couple of database commands to
> be more sensitive to interruptions. With large databases used as a base
> template it seems to me that this would cause the interruptions to be
> less responsive.

The maximum file size that we copy is 1GB and that nowadays takes maybe
10 seconds. I think that would be an acceptable response time.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
