Re: [PING] [PATCH v2] parallel pg_restore: avoid disk seeks when jumping short distance forward

From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: Nathan Bossart <nathandbossart(at)gmail(dot)com>
Cc: Dimitrios Apostolou <jimis(at)gmx(dot)net>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: [PING] [PATCH v2] parallel pg_restore: avoid disk seeks when jumping short distance forward
Date: 2025-06-13 23:15:25
Message-ID: CA+hUKGKAmXYxcrZK9prrez_-LS06Z3FA+_SwTch3cuhxn1cjXw@mail.gmail.com
Lists: pgsql-hackers

On Wed, Jun 11, 2025 at 9:48 AM Nathan Bossart <nathandbossart(at)gmail(dot)com> wrote:
> So, fseeko() starts winning around 4096 bytes. On macOS, the differences
> aren't quite as dramatic, but 4096 bytes is the break-even point there,
> too. I imagine there's a buffer around that size somewhere...

BTW you can call setvbuf(f, my_buffer, _IOFBF, my_buffer_size) to
control FILE buffering. I suspect that glibc ignores the size if you
pass NULL for my_buffer, so you'd need to allocate it yourself and it
should probably be aligned on PG_IO_ALIGN_SIZE for best results
(minimising the number of VM pages that must be held/pinned). Then
you might be able to get better and less OS-dependent results. I
haven't studied this seek business, so I have no opinion on it or on
what a good size would be, but interesting sizes might be multiples of
both PG_IO_ALIGN_SIZE and the filesystem block size reported by
fstat(fileno(stream)). IDK, just a thought...
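
A minimal sketch of the setvbuf() idea, not taken from any patch in this
thread. Assumptions: SKETCH_IO_ALIGN_SIZE is a stand-in for
PG_IO_ALIGN_SIZE (4096 is a guess), set_aligned_buffer() and round_up()
are hypothetical helpers, the block size is read from st_blksize, and
both sizes are powers of two so the rounded result is a multiple of each.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>

/* Assumption: stand-in for PG_IO_ALIGN_SIZE; the real value may differ. */
#define SKETCH_IO_ALIGN_SIZE 4096

/* Round n up to the next multiple of m. */
static size_t
round_up(size_t n, size_t m)
{
    assert(m > 0);
    return ((n + m - 1) / m) * m;
}

/*
 * Hypothetical helper: install a caller-owned, aligned, fully buffered
 * stdio buffer on f.  Call it before any other I/O on the stream.
 * Returns the buffer (free it only after fclose) and stores its size in
 * *size_out, or returns NULL on failure.
 */
static void *
set_aligned_buffer(FILE *f, size_t wanted, size_t *size_out)
{
    struct stat st;
    size_t      size = round_up(wanted, SKETCH_IO_ALIGN_SIZE);
    void       *buf = NULL;

    /* Also round to the filesystem's preferred I/O block size, if known. */
    if (fstat(fileno(f), &st) == 0 && st.st_blksize > 0)
        size = round_up(size, (size_t) st.st_blksize);

    if (posix_memalign(&buf, SKETCH_IO_ALIGN_SIZE, size) != 0)
        return NULL;

    if (setvbuf(f, buf, _IOFBF, size) != 0)
    {
        free(buf);
        return NULL;
    }

    *size_out = size;
    return buf;
}
```

The buffer has to outlive the stream, so the caller would fclose() first
and free() the buffer afterwards. What value to pass as the wanted size is
exactly the open question above; the sketch only shows the mechanics.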
