Re: Use pg_pwritev_with_retry() instead of write() in dir_open_for_write() to avoid partial writes?

From: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Michael Paquier <michael(at)paquier(dot)xyz>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Nathan Bossart <nathandbossart(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Use pg_pwritev_with_retry() instead of write() in dir_open_for_write() to avoid partial writes?
Date: 2023-02-14 12:30:00
Message-ID: CALj2ACUmvGb78zd49OXcRWvTX1WYHnOcF6oikQiR2i2yH4zvgw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Feb 13, 2023 at 11:09 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
>
> > Later code assigns iov[0].iov_len thus we need to provide a separate
> > iov non-const variable, or can we use pwrite instead there? (I didn't
> > find pg_pwrite_with_retry(), though)
>
> Given that we need to do that, and given that we already need to loop to
> handle writes that are longer than PG_IOV_MAX * BLCKSZ, it's probably not
> worth avoiding iov initialization.
>
> But I think it's worth limiting the initialization to blocks.

We can still optimize away the for loop by using a single iovec for
remaining size, like the attached v2 patch.

> I'd also try to combine the first pg_writev_* with the second one.

Done, PSA v2 patch.

On Tue, Feb 14, 2023 at 6:40 AM Andres Freund <andres(at)anarazel(dot)de> wrote:
>
> Hi,
>
> On 2023-02-12 09:31:36 -0800, Andres Freund wrote:
> > Another thing: I think we should either avoid iterating over all the IOVs if
> > we don't need them, or, even better, initialize the array as a constant, once.
>
> I just tried to use pg_pwrite_zeros - and couldn't because it doesn't have an
> offset parameter. Huh, what lead to the function being so constrained?

Done, PSA v2 patch.

We could do few more things, but honestly I feel they're unnecessary:
1) An assert-only code that checks if the asked file contents are
zeroed at the end of pg_pwrite_zeros (to be more defensive, but
reading 16MB files and checking if it's zero-filled will surely
slowdown the Assert builds).
2) A small test module passing in a file with the size to write isn't
multiple of block size, meaning, the code we have in the function to
write last remaining bytes (less than BLCKSZ) gets covered which isn't
covered right now -
https://coverage.postgresql.org/src/common/file_utils.c.gcov.html.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachment Content-Type Size
v2-0001-Refactor-pg_pwrite_zeros.patch application/x-patch 7.0 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message John Naylor 2023-02-14 12:36:51 Re: [PoC] Improve dead tuple storage for lazy vacuum
Previous Message John Naylor 2023-02-14 11:24:38 Re: [PoC] Improve dead tuple storage for lazy vacuum