Re: [Patch] Make block and file size for WAL and relations defined at cluster creation

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Remi Colinet <remi(dot)colinet(at)gmail(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [Patch] Make block and file size for WAL and relations defined at cluster creation
Date: 2018-01-05 12:42:20
Message-ID: CA+TgmoZjKDKJcpu2nPDKe=sprjNng6T9+Ukhdf_aoC+gFLuzXQ@mail.gmail.com
Lists: pgsql-hackers

On Thu, Jan 4, 2018 at 5:15 PM, Remi Colinet <remi(dot)colinet(at)gmail(dot)com> wrote:
> Block size does not boil down only to performance.
>
> For instance, having a larger block size allows:
> - to avoid toasting tuples. Rows with sizes larger than the default block
> size can justify larger block sizes.
> - to reduce fragmentation in relations.

Well, I think those are things that you do to improve performance.
So, ultimately, I would argue that it does come down to performance.
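
To put a number on the toasting point, here is a minimal sketch of how
the size at which a tuple becomes a candidate for toasting scales with
block size. It assumes the TOAST_TUPLE_THRESHOLD formula from the
PostgreSQL source (four tuples per page, a 24-byte page header, 4-byte
line pointers) and 8-byte maximum alignment. The function names are
made up for illustration.

```python
# Sketch only: approximate TOAST threshold as a function of block size.
# All constants below are assumptions taken from a stock 64-bit build.
PAGE_HEADER_BYTES = 24      # assumed SizeOfPageHeaderData
ITEM_ID_BYTES = 4           # assumed sizeof(ItemIdData)
TOAST_TUPLES_PER_PAGE = 4   # assumed default
MAXALIGN = 8                # assumed maximum alignment in bytes


def align_up(n, a=MAXALIGN):
    """Round n up to the next multiple of a."""
    return (n + a - 1) // a * a


def align_down(n, a=MAXALIGN):
    """Round n down to the previous multiple of a."""
    return n // a * a


def toast_threshold(block_size):
    """Largest tuple size (bytes) that still lets four tuples share a page."""
    overhead = align_up(PAGE_HEADER_BYTES + TOAST_TUPLES_PER_PAGE * ITEM_ID_BYTES)
    return align_down((block_size - overhead) // TOAST_TUPLES_PER_PAGE)


if __name__ == "__main__":
    for kb in (4, 8, 16, 32):
        print(f"{kb:2d}kB block: threshold {toast_threshold(kb * 1024)} bytes")
```

Under these assumptions the default 8kB block gives 2032 bytes, and the
threshold grows roughly linearly with block size.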

> If changing the block size at initdb is useless, then why allow developers
> to set the block size at compile time?

In my view, right now, changing BLCKSZ is only marginally supported.
It's there so that you can experiment, but it's not really something
we expect users to do. I think we have no buildfarm coverage of
different block sizes. I am not sure that we consistently fix
regression test failures with other block sizes, even when someone
reports them. If there's a bug in some index AM that only manifests
with some non-default block size, we might not know about it. I think
that if we make this an initdb-time option, we're committing to fix
all of those issues: add tests, fix bugs, and of course document how
to set the parameter properly (which means we have to know how it
should be set, which means we have to know what the effects of
changing it are on different systems and workloads). You may or may
not be willing to do some of that work, but I suspect there's a good
chance that it will require effort from other people as well -- e.g.
if we turn up a bug in BRIN, are you going to dive into that and fix
it, or are you going to hope Alvaro does something about it? He'll
probably have to review and commit your patch, at the least.

Of course, it's possible there are no such problems and everything
will just work.

I looked around a little for previous tests that had been run in this
area and found these:

https://blog.pgaddict.com/posts/postgresql-on-ssd-4kb-or-8kB-pages
https://www.cybertec-postgresql.com/en/postgresql-block-sizes-getting-started/
http://blog.coelho.net/database/2014/08/08/postgresql-page-size-for-SSD.html

All of those seem to agree that smaller block sizes can help
performance, sometimes significantly, and larger block sizes hurt
performance, which is sort of surprising to me since that also means
that your database will get bigger: at a 4kB page size, you have
to store at least twice as many page headers as you would with an 8kB
page size. Some of them also mention reasons why you might want a
larger block size. I believe I recall a mailing-list discussion some
years back about how index pages might need some kind of page-internal
indexing for efficiency with large block sizes, because a simple
binary search might touch too many cache lines. It seems like if we
want to have good performance at a variety of block sizes -- rather
than just having it technically work -- we might need to do a fair
amount of investigation of what factors account for good and bad
performance at a variety of settings and consider whether there are
design changes that might mitigate some of the problems.
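
The page-header arithmetic above can be sketched as follows. This is a
rough estimate only: it assumes a fixed 24-byte page header and ignores
line pointers, special space, fill factor, and tuple alignment. The
function name is hypothetical.

```python
# Sketch only: page-header overhead for a fixed payload at several
# block sizes. Assumes a 24-byte page header and nothing else per page.
PAGE_HEADER_BYTES = 24  # assumed fixed header size


def header_overhead(data_bytes, block_size):
    """Return (pages, header_bytes) needed to hold data_bytes of payload."""
    usable = block_size - PAGE_HEADER_BYTES
    pages = -(-data_bytes // usable)  # ceiling division
    return pages, pages * PAGE_HEADER_BYTES


if __name__ == "__main__":
    one_gib = 1 << 30
    for kb in (4, 8, 16, 32):
        pages, hdr = header_overhead(one_gib, kb * 1024)
        print(f"{kb:2d}kB pages: {pages} pages, {hdr} bytes of headers")
```

Because each smaller page also loses the same 24 bytes of usable space,
halving the page size slightly more than doubles the number of headers
for the same payload.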

I think that if you're interested in making non-default block sizes
more supported in PostgreSQL, some good first steps would be:

- run make check-world at all the supposedly-supported block sizes and
see if it passes.

- set up some buildfarm critters that run with various non-default
block sizes on various hardware and software platforms. Ideally we
should have various combinations of 32-bit and 64-bit; Linux, Windows,
and other; and the whole range of block sizes but especially the more
extreme ones.

- run performance tests with a variety of workloads, not just pgbench,
at various block sizes and on various hardware, and post or blog about
the results.
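
As a starting point for the first step, here is a minimal sketch that
builds (but does not run) the commands for testing each block size. It
assumes a source tree built with ./configure, whose --with-blocksize
option takes the size in kB, and a clean tree for each size. The list
of sizes and the helper name are assumptions for illustration.

```python
# Sketch only: produce a per-block-size command plan for make check-world.
# Nothing is executed here; the caller decides how to run the commands.
BLOCK_SIZES_KB = (1, 2, 4, 8, 16, 32)  # assumed set of configurable sizes


def check_world_plan(sizes_kb=BLOCK_SIZES_KB):
    """Return a list of (block size in kB, [command, ...]) pairs."""
    plan = []
    for kb in sizes_kb:
        commands = [
            ["./configure", f"--with-blocksize={kb}"],
            ["make"],
            ["make", "check-world"],
        ]
        plan.append((kb, commands))
    return plan


if __name__ == "__main__":
    for kb, commands in check_world_plan():
        print(f"# block size {kb}kB")
        for cmd in commands:
            print(" ".join(cmd))
```

Each command list could be handed to subprocess.run from a fresh
checkout or build directory, recording which sizes pass and which fail.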

If it's clear that non-default block sizes (a) work and (b) are good,
then at least IMHO it's quite likely that we would want this patch.
Maybe those things are already clear to you, but they're not
completely clear to me.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
