Re: [Patch] Windows relation extension failure at 2GB and 4GB

From: Bryan Green <dbryan(dot)green(at)gmail(dot)com>
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: [Patch] Windows relation extension failure at 2GB and 4GB
Date: 2025-11-06 14:56:14
Message-ID: 31aeb34d-66c7-456f-b59a-9e2b03940e4a@gmail.com
Lists: pgsql-hackers

On 11/6/2025 3:20 AM, Thomas Munro wrote:
> On Wed, Oct 29, 2025 at 3:42 AM Bryan Green <dbryan(dot)green(at)gmail(dot)com> wrote:
>> That said, I'm finding off_t used in many other places throughout the
>> codebase - buffile.c, various other file utilities such as backup and
>> archive, probably more. This is likely causing latent bugs elsewhere on
>> Windows, though most are masked by the 1GB default segment size. I'm
>> investigating the full scope, but I think this needs to be broken up
>> into multiple patches. The core file I/O layer (fd.c, md.c,
>> pg_pwrite/pg_pread) should probably go first since that's what's
>> actively breaking file extension.
>
> The way I understand this situation, there are two kinds of file I/O,
> with respect to large files:
>
> 1. Some places *have* to deal with large files (eg navigating in a
> potentially large tar file), and there we should already be using
> pgoff_t and the relevant system call wrappers should be using the
> int64_t stuff Windows provides. These are primarily frontend code.
> 2. Some places use segmentation *specifically because* there are
> systems with 32 bit off_t. These are mostly backend code dealing with
> relation data files. The only system left with narrow off_t is
> Windows.
>
> In reality the stuff in category 1 has been developed through a
> process of bug reports and patches (970b97e and 970b97e^ spring to
> mind as the most recent case I had something to do with, but see also
> stat()-related stuff, and see aa5518304 where we addressed the one
> spot in buffile.c that had to consider multiple segments). But the
> fact that Windows can't use segments > 2GB because the fd.c and
> smgr.c/md.c layers work with off_t is certainly a well known
> limitation, ie specifically that relation and temporary/buf files are
> special in this way. I'm mostly baffled by the fact that --relsegsize
> actually *lets* you set it higher than 2 on that platform. Perhaps we
> should at least backpatch a configure check or static assertion to
> block that? It's not good if it compiles but doesn't actually work.
>

I agree that the backpatch should just block setting --relsegsize > 2GB
on Windows.
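
As a rough sketch of what such a compile-time check could look like (the
EXAMPLE_* names are made up for illustration and are not the real build
symbols, and this version takes the conservative view that the whole
segment length in bytes must be representable in off_t, which is slightly
stricter than "> 2GB"):

```c
#include <stdint.h>
#include <sys/types.h>

/* Hypothetical stand-ins for the block size and blocks-per-segment settings. */
#define EXAMPLE_BLCKSZ       8192
#define EXAMPLE_RELSEG_SIZE  131072     /* 131072 * 8192 bytes = 1GB */

/* Largest value a signed off_t of this platform's width can hold. */
#define EXAMPLE_OFF_T_MAX \
    ((int64_t) ((UINT64_C(1) << (sizeof(off_t) * 8 - 1)) - 1))

/* Refuse to compile if one segment's byte length cannot be held in off_t. */
_Static_assert((int64_t) EXAMPLE_BLCKSZ * EXAMPLE_RELSEG_SIZE <= EXAMPLE_OFF_T_MAX,
               "segment size in bytes does not fit in off_t");

/* Same test as a plain function, so other widths can be tried at run time. */
static int
example_segment_fits(int64_t blcksz, int64_t blocks_per_segment, int64_t off_max)
{
    return blcksz * blocks_per_segment <= off_max;
}
```

With a 32-bit off_t (off_max = INT32_MAX) a 1GB segment passes and a 4GB
segment does not; with a 64-bit off_t both pass.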

> For master I think it makes sense to clean this up, as you say,
> because the fuzzy boundary between the two categories of file I/O is
> bound to cause more problems; it's just unfinished business that has
> been tackled piecemeal as required by bug reports... In fact, on a
> thread[1] where I explored making the segment size a runtime option
> specified at initdb time, I even posted patches much like yours in the
> first version, spreading pgoff_t into more places, and then in a later
> version it was suggested that it might be better to just block
> settings that are too big for your off_t, so I did that. I probably
> thought that we already did that somewhere for the current
> compile-time constant...
>

For master, I'd like to proceed with the cleanup approach - spreading
pgoff_t into the core I/O layer (fd.c, md.c, pg_pread/pg_pwrite
wrappers, etc). That would let us eliminate the artificial 2GB ceiling
on Windows and clean up the file I/O category boundary.
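
To illustrate the kind of change involved, here is a minimal sketch, not
the actual patch. It assumes a 64-bit offset type (example_off_t, a
hypothetical stand-in for pgoff_t) and shows two things: the range test
that a 32-bit off_t implicitly imposes, and the split of a 64-bit offset
into the two 32-bit halves that the Windows OVERLAPPED structure carries
as Offset and OffsetHigh. The struct and function names are invented for
the example.

```c
#include <stdint.h>

/* Hypothetical 64-bit file offset type, standing in for pgoff_t. */
typedef int64_t example_off_t;

/* Mirrors the two 32-bit offset fields of the Windows OVERLAPPED struct. */
typedef struct ExampleOverlappedOffset
{
    uint32_t    low;    /* would be assigned to OVERLAPPED.Offset */
    uint32_t    high;   /* would be assigned to OVERLAPPED.OffsetHigh */
} ExampleOverlappedOffset;

/* True if the offset would survive being passed through a 32-bit off_t. */
static int
example_fits_in_32bit_off_t(example_off_t offset)
{
    return offset >= INT32_MIN && offset <= INT32_MAX;
}

/* Split a non-negative 64-bit offset into low and high 32-bit halves. */
static ExampleOverlappedOffset
example_split_offset(example_off_t offset)
{
    ExampleOverlappedOffset result;

    result.low = (uint32_t) ((uint64_t) offset & UINT64_C(0xFFFFFFFF));
    result.high = (uint32_t) ((uint64_t) offset >> 32);
    return result;
}
```

An offset of 4GB + 8192 splits into high = 1 and low = 8192; if only the
low half is carried through, the write lands at offset 8192 instead.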

>> Not urgent since few people hit this in practice, but it's clearly wrong
>> code.
>
> Yeah. In my experience dealing with bug reports, the Windows user
> community skews very heavily towards just consuming EDB's ready-built
> installer. We rarely hear about configuration-level problems, so I
> suppose it's not surprising that no one has ever complained that it
> lets you configure it in a way that we hackers all know is certainly
> going to break.
>
> [1] https://www.postgresql.org/message-id/flat/CA%2BhUKG%2BBGXwMbrvzXAjL8VMGf25y_ga_XnO741g10y0%3Dm6dDiA%40mail.gmail.com

Thanks for the feedback.

--
Bryan Green
EDB: https://www.enterprisedb.com
