Re: PG 13 release notes, first draft

From: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
To: bruce(at)momjian(dot)us
Cc: noah(at)leadboat(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: PG 13 release notes, first draft
Date: 2020-05-13 02:56:33
Message-ID: 20200513.115633.157251733478541458.horikyota.ntt@gmail.com
Lists: pgsql-hackers

At Tue, 12 May 2020 16:38:09 -0400, Bruce Momjian <bruce(at)momjian(dot)us> wrote in
> On Tue, May 12, 2020 at 01:09:08PM +0900, Kyotaro Horiguchi wrote:
> > > > commit c6b9204
> > > > Author: Noah Misch <noah(at)leadboat(dot)com>
> > > > AuthorDate: Sat Apr 4 12:25:34 2020 -0700
> > > > Commit: Noah Misch <noah(at)leadboat(dot)com>
> > > > CommitDate: Sat Apr 4 12:25:34 2020 -0700
> > > >
> > > > Skip WAL for new relfilenodes, under wal_level=minimal.
> > > >
> > > > Until now, only selected bulk operations (e.g. COPY) did this. If a
> > > > given relfilenode received both a WAL-skipping COPY and a WAL-logged
> > > > operation (e.g. INSERT), recovery could lose tuples from the COPY. See
> > > > src/backend/access/transam/README section "Skipping WAL for New
> > > > RelFileNode" for the new coding rules. Maintainers of table access
> > > > methods should examine that section.
> > >
> > > OK, so how do we want to document this? Do I mention in the text below
> > > the WAL skipping item that this fixes a bug where a mix of simultaneous
> > > COPY and INSERT into a table could lose rows during crash recovery, or
> > > create a new item?
> >
> > FWIW, as discussed upthread, I suppose that the API change is not going
> > to be in relnotes.
> >
> > something like this?
> >
> > - Fix bug in WAL-skipping optimization
> >
> > Previously, a transaction doing both a COPY and a WAL-logged operation
> > like INSERT under wal_level=minimal could lose COPY'ed rows during
> > crash recovery. This fix also extends the WAL-skipping optimization
> > to all kinds of bulk insert operations.
>
> Uh, that kind of mixes the bug fix and the feature in a way that it is
> hard to understand. How about this?
>
> Allow skipping of WAL for new tables and indexes if wal_level is
> 'minimal' (Kyotaro Horiguchi)
>
> Relations larger than wal_skip_threshold will have their files
> fsync'ed rather than writing their WAL records. Previously this
> was done only for COPY operations, but the implementation had a
> bug that could cause data loss during crash recovery.

I see. It puts the weight on the improvement. The overall structure of
the description above looks good. However, WAL-skipping is always done
regardless of table size. wal_skip_threshold is an optimization that
chooses, for speed, whether to use fsync or FPI records (that is, not
WAL records in the common sense) at commit.
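
To make the commit-time choice concrete, here is a minimal sketch. It
is not the actual PostgreSQL code: the function name, the return
values, and the byte-based threshold are hypothetical, and 2MB is only
an assumed threshold value. Treating a relation exactly at the
threshold as "fsync" follows from "smaller than" above.

```python
# Assumed value of wal_skip_threshold for this sketch, in bytes (2MB).
WAL_SKIP_THRESHOLD_BYTES = 2 * 1024 * 1024


def commit_action(relation_size_bytes: int,
                  threshold_bytes: int = WAL_SKIP_THRESHOLD_BYTES) -> str:
    """Pick how a WAL-skipped relation is made durable at commit.

    WAL is skipped during the transaction regardless of size; the
    threshold only selects the commit-time method.
    """
    if relation_size_bytes < threshold_bytes:
        # Small relation: emit full-page-image (FPI) WAL records.
        return "fpi"
    # Relation at or above the threshold: fsync its files instead.
    return "fsync"


if __name__ == "__main__":
    for size in (8192, 2 * 1024 * 1024, 64 * 1024 * 1024):
        print(size, commit_action(size))
```

Either way no per-row WAL is written while the transaction runs; only
the method used at commit differs.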

So how about the following?

All kinds of bulk insertion now skip WAL-logging, and the relation
files are fsync'ed at commit. For relations smaller than
wal_skip_threshold, FPI WAL records are written instead of doing the
fsync. Previously this was done only for COPY operations and always
used fsync, but the implementation had a bug that could cause data
loss during crash recovery.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center
