Vectored I/O in bulk_write.c

From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Vectored I/O in bulk_write.c
Date: 2024-03-10 00:20:06
Message-ID: CA+hUKGLx5bLwezZKAYB2O_qHj=ov10RpgRVY7e8TSJVE74oVjg@mail.gmail.com
Lists: pgsql-hackers

Hi,

I was trying to learn enough about the new bulk_write.c to figure out
what might be going wrong over at [1], and ended up doing this
exercise, which is of experimental quality but passes basic tests. It's
a bit like v1-0013 and v1-0014's experimental vectored checkpointing
from [2] (which are not currently proposed either; they too were in
the experiment category), but this usage is a lot simpler and might be
worth considering. Presumably both things would eventually end up
being done by (not yet proposed) streaming write, but could also be
done directly in this simple case.

This way, CREATE INDEX generates 128kB pwritev() calls instead of 8kB
pwrite() calls. (There's a magic 16 in there, i.e. 16 x 8kB blocks per
call; we'd probably need to think harder about that.) It'd be better if
bulk_write.c's memory management were improved: if buffers were mostly
contiguous neighbours instead of being separately palloc'd objects,
you'd probably often get 128kB pwrite() instead of pwritev(), which
might be marginally more efficient.
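
To illustrate the contiguity point above, here is a minimal sketch (not
code from the attached patches) of how a writer could merge page buffers
that happen to be adjacent in memory, so that a fully contiguous run
turns into a single pwrite() and anything else falls back to pwritev().
The names sketch_build_iov(), sketch_write_blocks(), SKETCH_BLCKSZ and
SKETCH_MAX_BLOCKS are hypothetical stand-ins, and an 8kB block size is
assumed.

```c
#define _GNU_SOURCE             /* exposes pwritev() on glibc */
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

#define SKETCH_BLCKSZ 8192      /* assumed block size (stand-in for BLCKSZ) */
#define SKETCH_MAX_BLOCKS 16    /* the "magic 16": 16 x 8kB = 128kB */

/*
 * Fill iov[] from nblocks page pointers, extending the previous entry
 * whenever the next page starts exactly where the previous one ends.
 * Returns the number of iovec entries used.
 */
static int
sketch_build_iov(struct iovec *iov, void *const *pages, int nblocks)
{
    int iovcnt = 0;

    for (int i = 0; i < nblocks; i++)
    {
        char *page = pages[i];

        if (iovcnt > 0 &&
            (char *) iov[iovcnt - 1].iov_base + iov[iovcnt - 1].iov_len == page)
        {
            /* Adjacent in memory: grow the previous entry. */
            iov[iovcnt - 1].iov_len += SKETCH_BLCKSZ;
        }
        else
        {
            iov[iovcnt].iov_base = page;
            iov[iovcnt].iov_len = SKETCH_BLCKSZ;
            iovcnt++;
        }
    }
    return iovcnt;
}

/*
 * Write up to SKETCH_MAX_BLOCKS pages at the given file offset.  The
 * caller is still responsible for handling short writes and errors.
 */
static ssize_t
sketch_write_blocks(int fd, void *const *pages, int nblocks, off_t offset)
{
    struct iovec iov[SKETCH_MAX_BLOCKS];
    int iovcnt;

    assert(nblocks > 0 && nblocks <= SKETCH_MAX_BLOCKS);
    iovcnt = sketch_build_iov(iov, pages, nblocks);

    if (iovcnt == 1)
        return pwrite(fd, iov[0].iov_base, iov[0].iov_len, offset);
    return pwritev(fd, iov, iovcnt, offset);
}
```

With separately allocated buffers the merge rarely fires and every call
is a 16-entry pwritev(); with buffers carved out of one larger
allocation, the same loop would usually collapse to one entry.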

This made me wonder why smgrwritev() and smgrextendv() shouldn't be
backed by the same implementation, since they are essentially the same
operation. The differences are (1) some assertions, which might as well
be moved up to the smgr.c level, as they must surely apply to any
future smgr implementation too (right?), and (2) the segment file
creation policy, which can be controlled with a new argument. I tried
that here. An alternative would be for md.c to have a workhorse
function that both mdextendv() and mdwritev() call, but I'm not sure if
there's much point in that.
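
As a rough model of the "segment file creation policy" idea (this is
not the patch's code, and all names here are hypothetical), the sketch
below splits a multi-block range at segment boundaries and lets a single
boolean decide whether touching a segment that does not exist yet is
allowed. It assumes 131072 blocks per segment, i.e. 1GB segments of 8kB
blocks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed blocks per segment file: 1GB segments of 8kB blocks. */
#define SKETCH_RELSEG_SIZE 131072u

typedef uint32_t SketchBlockNumber; /* hypothetical stand-in for BlockNumber */

/* How many of nblocks (> 0), starting at blocknum, fit in blocknum's segment. */
static int
sketch_blocks_in_segment(SketchBlockNumber blocknum, int nblocks)
{
    uint32_t room = SKETCH_RELSEG_SIZE - blocknum % SKETCH_RELSEG_SIZE;

    return (uint32_t) nblocks < room ? nblocks : (int) room;
}

/*
 * Walk the range segment by segment and return how many segments it
 * touches.  If the range reaches a segment beyond existing_segments and
 * create_segments is false (the "plain write" policy in this model),
 * return -1 instead of creating it.
 */
static int
sketch_segments_touched(SketchBlockNumber blocknum, int nblocks,
                        uint32_t existing_segments, bool create_segments)
{
    int touched = 0;

    while (nblocks > 0)
    {
        uint32_t segno = blocknum / SKETCH_RELSEG_SIZE;
        int chunk = sketch_blocks_in_segment(blocknum, nblocks);

        if (segno >= existing_segments && !create_segments)
            return -1;

        blocknum += chunk;
        nblocks -= chunk;
        touched++;
    }
    return touched;
}
```

In this model the write and extend paths share the whole loop and
differ only in the value passed for create_segments, which is the shape
of the unification described above.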

While thinking about that I realised that an existing write-or-extend
assertion in master is wrong because it doesn't add nblocks.
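
For illustration only: the exact assertion is not quoted in this
message, so the following is an assumed reconstruction of the kind of
bug being described. A check that looks only at the first block accepts
a multi-block write that runs past the end of the relation, while a
check that adds nblocks rejects it. The function and type names are
hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t SketchBlockNumber; /* hypothetical stand-in for BlockNumber */

/*
 * Checks only the first block against the relation size, so a range
 * that starts inside the relation but ends beyond it still passes.
 */
static bool
sketch_first_block_only(SketchBlockNumber blocknum, int nblocks,
                        SketchBlockNumber relsize)
{
    (void) nblocks;             /* deliberately ignored: this is the bug */
    return blocknum < relsize;
}

/*
 * Checks the whole range.  The sum is computed in 64 bits so that
 * blocknum + nblocks cannot wrap around the 32-bit block number space.
 */
static bool
sketch_whole_range(SketchBlockNumber blocknum, int nblocks,
                   SketchBlockNumber relsize)
{
    return nblocks > 0 &&
        (uint64_t) blocknum + (uint64_t) nblocks <= (uint64_t) relsize;
}
```

For a 100-block relation, a 4-block write starting at block 98 passes
the first check but fails the second, while the same write starting at
block 96 passes both.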

Hmm, it's a bit weird that we have nblocks as int or BlockNumber in
various places, which I think should probably be fixed.

[1] https://www.postgresql.org/message-id/flat/CA%2BhUKGK%2B5DOmLaBp3Z7C4S-Yv6yoROvr1UncjH2S1ZbPT8D%2BZg%40mail.gmail.com
[2] https://www.postgresql.org/message-id/CA%2BhUKGJkOiOCa%2Bmag4BF%2BzHo7qo%3Do9CFheB8%3Dg6uT5TUm2gkvA%40mail.gmail.com

Attachment                                         Content-Type  Size
0001-Provide-vectored-variant-of-smgrextend.patch  text/x-patch  10.8 KB
0002-Use-vectored-I-O-for-bulk-writes.patch        text/x-patch  4.2 KB
