Re: [PING] fallocate() causes btrfs to never compress postgresql files

From: Jakub Wartak <jakub(dot)wartak(at)enterprisedb(dot)com>
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, Dimitrios Apostolou <jimis(at)gmx(dot)net>, Magnus Hagander <magnus(at)hagander(dot)net>, Tomas Vondra <tomas(at)vondra(dot)me>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Andres Freund <andres(at)anarazel(dot)de>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>
Subject: Re: [PING] fallocate() causes btrfs to never compress postgresql files
Date: 2025-12-15 07:50:00
Message-ID: CAKZiRmzdvqtuZmrpUi-399gzko6xihDgCkJyTv--Sk4pgQ4x+g@mail.gmail.com
Lists: pgsql-hackers

On Mon, Dec 15, 2025 at 7:00 AM Thomas Munro <thomas(dot)munro(at)gmail(dot)com> wrote:
>
> Here's a new version with some cleanup and documentation. I tried to
> pare it down to the minimum change for the back-branches, keeping
> unnecessary changes for master. In the process, I also thought a bit
> about how to de-confuse matters on Windows, where the function we
> call as ftruncate() behaves differently in a crucial respect. See
> attached.
>
> I'm proposing to back-patch 0001. 0002 and 0003 are proposals for master only.
>

Hi Thomas,

Thanks for working on this. I have reviewed the patches and played with
them a little, and they are in very good shape, so +1 from my side. Just
a couple of minor things:

1. 0001: I would add another Discussion link to the commit message as well
(https://www.postgresql.org/message-id/flat/CADofcAV8xu3hCNHq7-7x56KrP9rD6%3DA04%3DqjTr3nETh-gptF8w%40mail.gmail.com
- the XFS thread).
2. I've tested the patches lightly and they pass my local build and tests.
Just a non-actionable observation: I'm not sure how useful v2-0002 (the
new file_extend_method_threshold) is going to be in real life. To me it
sounds like it could be named debug_file_extend*, although that would
break the convention of using the plain file_extend prefix.
3. I haven't tested 0003 as it is for Windows. We could probably add it
to cfbot, so that it tells us something more there.

> See below for replies to separate messages from Jakub and Bruce.
>
> On Thu, Oct 30, 2025 at 11:14 PM Jakub Wartak
> <jakub(dot)wartak(at)enterprisedb(dot)com> wrote:
> > +1 to these GUCs as this would also help the nearby thread with XFS
> > mysteries which are not fully solved [1]. Since the latest message in
> > that discussion, I'm aware of at least one additional report of XFS
> > failing at fallocate() with free space too, but without any details
> > from the OS support vendor why that happened, so this $patch could
> > also be used to work around that problem.
>
> Yeah, that seems quite important, and the new report in pgsql-bugs
> #19348 sounds like another case.

Right, I think we've got another report internally too since we last
talked, but the contact went silent after being redirected to the OS
vendor (after some recommended workarounds did not work for them,
although those worked for others).
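
For anyone following along without the patch open, here is a rough sketch
of the two extension strategies being discussed. It is not taken from the
patch: extend_file() and the extend_method enum are made-up names for
illustration, and it only assumes the standard posix_fallocate() and
ftruncate() calls.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical names for illustration; not PostgreSQL identifiers. */
typedef enum
{
    EXTEND_FALLOCATE,   /* ask the file system to reserve the new range */
    EXTEND_FTRUNCATE    /* only change the file length */
} extend_method;

/*
 * Grow the file behind fd to new_size bytes using the chosen method.
 * Returns 0 on success, or an errno-style error number on failure.
 * The file is never shrunk.
 */
static int
extend_file(int fd, off_t new_size, extend_method method)
{
    struct stat st;

    assert(new_size >= 0);

    if (fstat(fd, &st) != 0)
        return errno;
    if (new_size <= st.st_size)
        return 0;               /* already large enough */

    if (method == EXTEND_FALLOCATE)
    {
        /*
         * posix_fallocate() returns the error number directly and does
         * not set errno.
         */
        return posix_fallocate(fd, st.st_size, new_size - st.st_size);
    }

    /*
     * ftruncate() only moves end-of-file; the added range reads back as
     * zeros and is typically left unallocated until it is written.
     */
    return ftruncate(fd, new_size) == 0 ? 0 : errno;
}
```

A caller would pick the method from configuration and surface any nonzero
return value as an extension failure, which is the case the XFS reports
above run into with the fallocate path.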

> > Why just 17? (wasn't fallocate() introduced in 16? 4d330a61bb19 and
> > 31966b151e6ab are from Apr 2023, while 16 was released in Sep 2023)
>
> Right, fixed.

Cool, thanks.

> Yeah. Let's go with PGC_SIGHUP. Let's worry about multiple
> filesystems when we've figured out how to do per-tablespace settings.

Cool, thanks.

> This is vapourware for later, but I've been wondering if we could
> invent a sysctl-style hierarchy as a scoping mechanism, something
> like:
>
> tablespace.foo.random_page_cost=1
> tablespace.foo.file_extend_method=ftruncate
> tablespace.foo.io_combine_limit=1MB

This looks more like sysfs than sysctl (as foo is the tablespace name?)
:^). Anyway, I think that 0001 should go in, and then a new thread could
be started for this if you want (as it would conflict a little with what
we already have, e.g. alter tablespace pg_default set
(maintenance_io_concurrency=XXX), although it is highly unlikely that
anybody uses '\db+' in psql to see those options).

-J.
