Re: Direct I/O

From: Andres Freund <andres(at)anarazel(dot)de>
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc: Justin Pryzby <pryzby(at)telsasoft(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Direct I/O
Date: 2022-11-02 00:21:28
Message-ID: 20221102002128.yvq62q7eirwqmks6@awork3.anarazel.de
Lists: pgsql-hackers

Hi,

On 2022-11-01 15:54:02 -0700, Andres Freund wrote:
> On 2022-11-02 09:44:30 +1300, Thomas Munro wrote:
> > Oh, so BufFile is palloc'd and contains one of these. BufFile is not
> > even using direct I/O, but by these rules it would need to be
> > palloc_io_align'd. I will think about what to do about that...
>
> It might be worth having two different versions of the struct, so we don't
> impose unnecessarily high alignment everywhere?

Although it might actually be worth aligning fully everywhere - there's a
noticeable performance difference for buffered read IO.

I benchmarked this on my workstation and laptop.

I mmap'ed a buffer with 2 MiB alignment, MAP_ANONYMOUS | MAP_HUGETLB, and then
measured performance of reading 8192 bytes into the buffer at different
offsets. Each time I copied 16GiB in total. Within a program invocation I
benchmarked each offset 4 times, threw away the worst measurement, and
averaged the rest. I then took the best of three program invocations.
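
A minimal sketch of the methodology described above, not the actual benchmark program. Several details are assumptions: the data is read with pread() from the start of an already-cached file, "worst" is taken to mean the lowest throughput, and the names map_buffer, measure_offset and aggregate_runs are made up for illustration. The fallback path without MAP_HUGETLB is also an addition, so the sketch still runs on machines with no huge pages configured (Linux is assumed throughout).

```c
#define _GNU_SOURCE
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <time.h>
#include <unistd.h>

#define BLOCK_SIZE 8192
#define HUGE_ALIGN ((size_t) 2 << 20)	/* 2 MiB */

/* Map an anonymous buffer aligned to 2 MiB, preferring huge pages. */
static char *
map_buffer(size_t len)
{
	void	   *p = MAP_FAILED;

#ifdef MAP_HUGETLB
	p = mmap(NULL, len, PROT_READ | PROT_WRITE,
			 MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);
#endif
	if (p != MAP_FAILED)
		return p;

	/* Fallback (assumption): over-allocate, then round up to 2 MiB. */
	p = mmap(NULL, len + HUGE_ALIGN, PROT_READ | PROT_WRITE,
			 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (p == MAP_FAILED)
		return NULL;
	return (char *) (((uintptr_t) p + HUGE_ALIGN - 1) &
					 ~((uintptr_t) HUGE_ALIGN - 1));
}

/*
 * Read total_bytes in 8192 byte chunks into base + offset.
 * Returns throughput in GiB/s, or -1 on a failed or short read.
 */
static double
measure_offset(int fd, char *base, size_t offset, uint64_t total_bytes)
{
	struct timespec start, end;
	uint64_t	done = 0;

	clock_gettime(CLOCK_MONOTONIC, &start);
	while (done < total_bytes)
	{
		/* Always re-read the first block, so it stays in the page cache. */
		if (pread(fd, base + offset, BLOCK_SIZE, 0) != BLOCK_SIZE)
			return -1.0;
		done += BLOCK_SIZE;
	}
	clock_gettime(CLOCK_MONOTONIC, &end);

	double		secs = (end.tv_sec - start.tv_sec) +
		(end.tv_nsec - start.tv_nsec) / 1e9;

	return (double) done / secs / (1024.0 * 1024.0 * 1024.0);
}

/* Drop the worst (lowest) of n >= 2 throughput values, average the rest. */
static double
aggregate_runs(const double *gibps, int n)
{
	double		sum = 0.0;
	double		worst = gibps[0];

	for (int i = 0; i < n; i++)
	{
		sum += gibps[i];
		if (gibps[i] < worst)
			worst = gibps[i];
	}
	return (sum - worst) / (n - 1);
}
```

A driver would map a buffer a little larger than 2 MiB + 8192 bytes, call measure_offset() four times per offset with total_bytes set to 16 GiB, feed the four results to aggregate_runs(), and keep the best value across three separate program invocations.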

workstation with dual Xeon Gold 5215:

          turbo on   turbo off
offset    GiB/s      GiB/s
0         18.358     13.528
8         15.361     11.472
9         15.330     11.418
32        17.583     13.097
512       17.707     13.229
513       15.890     11.852
4096      18.176     13.568
8192      18.088     13.566
2MiB      18.658     13.496

laptop with i9-9880H:

          turbo on   turbo off
offset    GiB/s      GiB/s
0         33.589     17.160
8         28.045     14.301
9         27.582     14.318
32        31.797     16.711
512       32.215     16.810
513       28.864     14.932
4096      32.503     17.266
8192      32.871     17.277
2MiB      32.657     17.262

Seems pretty clear that using 4096 byte alignment is worth it.
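
As an illustration of what 4096 byte alignment means for a buffer outside of PostgreSQL's memory contexts, here is a small sketch using POSIX posix_memalign(). It is a stand-in only: it is not palloc_io_align from the patch under discussion, and the names IO_ALIGN, alloc_io_aligned and is_io_aligned are hypothetical.

```c
#include <stdint.h>
#include <stdlib.h>

#define IO_ALIGN 4096			/* assumed I/O alignment in bytes */

/* Allocate size bytes aligned to IO_ALIGN; returns NULL on failure. */
static void *
alloc_io_aligned(size_t size)
{
	void	   *p = NULL;

	if (posix_memalign(&p, IO_ALIGN, size) != 0)
		return NULL;
	return p;
}

/* Report whether a pointer sits on an IO_ALIGN boundary. */
static int
is_io_aligned(const void *p)
{
	return ((uintptr_t) p % IO_ALIGN) == 0;
}
```

Memory obtained this way is released with free(). A pointer 8 bytes past such an allocation corresponds to the "8" row in the tables, which is one of the slower cases measured.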

Greetings,

Andres Freund
