Re: Use pg_pwritev_with_retry() instead of write() in dir_open_for_write() to avoid partial writes?

From: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc: Michael Paquier <michael(at)paquier(dot)xyz>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Use pg_pwritev_with_retry() instead of write() in dir_open_for_write() to avoid partial writes?
Date: 2022-08-07 05:11:49
Message-ID: CALj2ACUpU2jHBS62wBN0jvBHjd9TaAMd1aC-m0tx3s4qUZsBKw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sun, Aug 7, 2022 at 7:43 AM Thomas Munro <thomas(dot)munro(at)gmail(dot)com> wrote:
>
> On Sun, Aug 7, 2022 at 1:12 PM Bharath Rupireddy
> <bharath(dot)rupireddyforpostgres(at)gmail(dot)com> wrote:
> > On Sat, Aug 6, 2022 at 12:11 PM Michael Paquier <michael(at)paquier(dot)xyz> wrote:
> > Yeah. pg_pwritev_with_retry can also be part of common/file_utils.c/.h
> > so that everyone can use it.
> >
> > > > Thoughts?
> > >
> > > Makes sense to me for the WAL segment pre-padding initialization, as
> > > we still want to point to the beginning of the segment after we are
> > > done with the pre-padding, and the code has an extra lseek().
> >
> > Thanks. I attached the v1 patch, please review it.
>
> Hi Bharath,
>
> +1 for moving pg_pwritev_with_retry() to file_utils.c, but I think the
> commit message should say that is happening. Maybe the move should
> even be in a separate patch (IMHO it's nice to separate mechanical
> change patches from new logic/work patches).

Agree. I separated out the changes.

> +/*
> + * A convenience wrapper for pg_pwritev_with_retry() that zero-fills the given
> + * file of size total_sz in batches of size block_sz.
> + */
> +ssize_t
> +pg_pwritev_with_retry_and_init(int fd, int total_sz, int block_sz)
>
> Hmm, why not give it a proper name that says it writes zeroes?

Done.

> Size arguments around syscall-like facilities are usually size_t.
>
> + memset(zbuffer.data, 0, block_sz);
>
> I believe the modern fashion as of a couple of weeks ago is to tell
> the compiler to zero-initialise. Since it's a union you'd need
> designated initialiser syntax, something like zbuffer = { .data = {0}
> }?

Hm, but we have many places still using memset(). If we were to change
these syntaxes, IMO, it must be done separately.

> + iov[i].iov_base = zbuffer.data;
> + iov[i].iov_len = block_sz;
>
> How can it be OK to use caller supplied block_sz, when
> sizeof(zbuffer.data) is statically determined? What is the point of
> this argument?

Yes, removed block_sz function parameter.

> - dir_data->lasterrno = errno;
> + /* If errno isn't set, assume problem is no disk space */
> + dir_data->lasterrno = errno ? errno : ENOSPC;
>
> This coding pattern is used in places where we blame short writes on
> lack of disk space without bothering to retry. But the code used in
> this patch already handles short writes: it always retries, until
> eventually, if you really are out of disk space, you should get an
> actual ENOSPC from the operating system. So I don't think the
> guess-it-must-be-ENOSPC technique applies here.

Done.

Thanks for reviewing. PSA v2 patch-set. Also,I added a CF entry
https://commitfest.postgresql.org/39/3803/ to give the patches a
chance to get tested.

--
Bharath Rupireddy
RDS Open Source Databases: https://aws.amazon.com/rds/postgresql/

Attachment Content-Type Size
v2-0001-Move-pg_pwritev_with_retry-to-file_utils.c.patch application/x-patch 5.8 KB
v2-0002-Use-pg_pwritev_with_retry-instead-of-write-in-wal.patch application/x-patch 5.5 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2022-08-07 05:17:22 Re: failing to build preproc.c on solaris with sun studio
Previous Message Andres Freund 2022-08-07 04:17:09 Re: [Proposal] Fully WAL logged CREATE DATABASE - No Checkpoints