Re: [PING] fallocate() causes btrfs to never compress postgresql files

From: Dimitrios Apostolou <jimis(at)gmx(dot)net>
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc: Tomas Vondra <tomas(at)vondra(dot)me>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Andres Freund <andres(at)anarazel(dot)de>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>
Subject: Re: [PING] fallocate() causes btrfs to never compress postgresql files
Date: 2025-06-02 10:14:01
Message-ID: aeba99d6-24a1-92af-380d-926d41b1acc0@gmx.net
Lists: pgsql-hackers

On Sun, 1 Jun 2025, Thomas Munro wrote:

> Or for a completely different approach: I wonder if ftruncate() would
> be more efficient on COW systems anyway. The minimum thing we need is
> for the file system to remember the new size, 'cause, erm, we don't.
> All the rest is probably a waste of cycles, since they reserve real
> space (or fail to) later in the checkpointer or whatever process
> eventually writes the data out.

FWIW I asked the btrfs devs. From
https://github.com/kdave/btrfs-progs/pull/976
I quote Qu Wenruo:

> Only for falloc(), not ftruncate().
>
> The PREALLOC inode flag is added for any preallocated file extent,
> meanwhile truncate only creates holes.
>
> truncate is fast but it's really different from fallocate by there is
> nothing really allocated.
>
> This means the later writes will need to allocate their own data
> extents. This is fine and even preferred for btrfs, but may lead to
> performance drop for more traditional fses.
>
> We're in an era that fs features are not longer that generic, fallocate
> is just one example, in fact fallocate will cause more problems more
> than no compression.
>
> It's really a deep rabbit hole, and is not something simple true or
> false questions.

In other words, btrfs does not allocate anything on ftruncate(); it just
records the new space as a "hole". As a result the inode does not get the
PREALLOC flag, which is what disables compression. Of course there is then
no guarantee that later writes will succeed, and as quoted above, other
(non-COW) filesystems might be slower when writing into the
ftruncate()-extended space.
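
To make the two options concrete, here is a minimal sketch of extending a
file either way. The helper names (extend_with_hole, extend_with_prealloc,
file_usage, open_scratch_file) are made up for illustration and are not
PostgreSQL code; only ftruncate(), posix_fallocate(), fstat() and mkstemp()
are real calls. How many blocks each variant ends up occupying depends on
the filesystem, so the sketch only reports st_blocks and asserts nothing
about it.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif

#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: grow the file to new_size without reserving space.
 * The new range reads back as zeros; on btrfs it is recorded as a hole.
 * Returns 0 on success, -1 with errno set on failure.
 */
static int
extend_with_hole(int fd, off_t new_size)
{
    return ftruncate(fd, new_size);
}

/*
 * Hypothetical helper: grow the file from old_size to new_size and ask the
 * filesystem to reserve the space. Note that posix_fallocate() returns an
 * error number directly (0 on success) and does not set errno.
 */
static int
extend_with_prealloc(int fd, off_t old_size, off_t new_size)
{
    assert(new_size > old_size);
    return posix_fallocate(fd, old_size, new_size - old_size);
}

/*
 * Hypothetical helper: report the apparent size and the number of 512-byte
 * blocks the filesystem says are allocated. Returns 0 on success.
 */
static int
file_usage(int fd, off_t *size, long long *blocks)
{
    struct stat st;

    if (fstat(fd, &st) != 0)
        return -1;
    *size = st.st_size;
    *blocks = (long long) st.st_blocks;
    return 0;
}

/*
 * Hypothetical helper: create an unlinked scratch file for experiments.
 * The /tmp location is an assumption; to see the btrfs behaviour, point
 * the template at a btrfs mount instead.
 */
static int
open_scratch_file(void)
{
    char path[] = "/tmp/extend_demo_XXXXXX";
    int  fd = mkstemp(path);

    if (fd >= 0)
        unlink(path);
    return fd;
}
```

Calling extend_with_hole() and then file_usage() should show the larger
st_size while st_blocks typically stays put; after extend_with_prealloc()
st_blocks usually grows, provided the filesystem supports preallocation.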

Regards,
Dimitris
