Re: [HACKERS] WAL logging problem in 9.4.3?

From: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
To: noah(at)leadboat(dot)com
Cc: robertmhaas(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org, 9erthalion6(at)gmail(dot)com, andrew(dot)dunstan(at)2ndquadrant(dot)com, hlinnaka(at)iki(dot)fi, michael(at)paquier(dot)xyz
Subject: Re: [HACKERS] WAL logging problem in 9.4.3?
Date: 2019-11-26 12:37:52
Message-ID: 20191126.213752.2132434859202124793.horikyota.ntt@gmail.com
Lists: pgsql-hackers

At Sun, 24 Nov 2019 22:08:39 -0500, Noah Misch <noah(at)leadboat(dot)com> wrote in
> On Mon, Nov 25, 2019 at 11:08:54AM +0900, Kyotaro Horiguchi wrote:
> > At Sat, 23 Nov 2019 16:21:36 -0500, Noah Misch <noah(at)leadboat(dot)com> wrote in
> > > I noticed an additional defect:
> > >
> > > BEGIN;
> > > CREATE TABLE t (c) AS SELECT 1;
> > > CHECKPOINT; -- write and fsync the table's one page
> > > TRUNCATE t; -- no WAL
> > > COMMIT; -- no FPI, just the commit record
> > >
> > > If we crash after the COMMIT and before the next fsync or OS-elected sync of
> > > the table's file, the table will stay on disk with its pre-TRUNCATE content.
> >
> > The TRUNCATE replaces relfilenode in the catalog
>
> No, it does not. Since the relation is new in the transaction, the TRUNCATE
> uses the heap_truncate_one_rel() strategy.
..
> The zero-pages case is not special. Here's an example of the problem with a
> nonzero size:

I got it. That is, if the file has had blocks beyond its size at
commit, we should sync the file even if it is small enough. It needs to
track the before-truncation size, as this patch used to do.
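
A minimal sketch of that rule, not code from the patch. All names here
are hypothetical, and it assumes that a small file is otherwise handled
at commit by writing its pages to WAL instead of fsyncing it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t BlockNumber;

/* Hypothetical action taken at commit for a file whose WAL was skipped. */
typedef enum
{
    COMMIT_WAL_LOG_PAGES,       /* assumed path for small files */
    COMMIT_FSYNC_FILE
} CommitAction;

/*
 * max_blocks_seen:  largest size the file reached in this transaction
 * blocks_at_commit: size of the file at commit
 * small_file_limit: hypothetical cutoff below which a file counts as small
 */
static CommitAction
choose_commit_action(BlockNumber max_blocks_seen,
                     BlockNumber blocks_at_commit,
                     BlockNumber small_file_limit)
{
    assert(max_blocks_seen >= blocks_at_commit);

    /* The file shrank: logging the remaining pages cannot undo old blocks. */
    if (max_blocks_seen > blocks_at_commit)
        return COMMIT_FSYNC_FILE;

    /* Large files are synced regardless of truncation. */
    if (blocks_at_commit >= small_file_limit)
        return COMMIT_FSYNC_FILE;

    return COMMIT_WAL_LOG_PAGES;
}
```

With Noah's example, the file had one block before TRUNCATE and zero at
commit, so the sketch picks the fsync path even though the file is small.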

pendingSyncHash is resurrected to do truncate-size tracking. That
information cannot be stored in SMgrRelation, which will disappear
on invalidation, or in Relation, which is not available in the storage
layer. smgrDoPendingDeletes needs to be called at abort again to clean
up the useless hash. I'm not sure of the exact cause, but
AssertPendingSyncs_RelationCache() fails at abort (so it is not called
at abort).
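
To illustrate the kind of bookkeeping meant here, a rough sketch
follows. It is not the patch's pendingSyncHash: a small fixed array
with linear search stands in for the hash table, and every name,
field, and the capacity are assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define MAX_PENDING 16          /* illustrative capacity only */

typedef struct
{
    bool     in_use;
    uint32_t relfilenode;       /* key; 0 is treated as invalid */
    uint32_t max_blocks;        /* largest size seen before any truncation */
} PendingSyncEntry;

static PendingSyncEntry pending[MAX_PENDING];

/* Find the entry for a relfilenode, optionally creating it. */
static PendingSyncEntry *
pending_find(uint32_t relfilenode, bool create)
{
    PendingSyncEntry *free_slot = NULL;

    assert(relfilenode != 0);
    for (int i = 0; i < MAX_PENDING; i++)
    {
        if (pending[i].in_use && pending[i].relfilenode == relfilenode)
            return &pending[i];
        if (!pending[i].in_use && free_slot == NULL)
            free_slot = &pending[i];
    }
    if (!create || free_slot == NULL)
        return NULL;            /* absent, or table full */
    free_slot->in_use = true;
    free_slot->relfilenode = relfilenode;
    free_slot->max_blocks = 0;
    return free_slot;
}

/* Record the size a file had just before it was truncated. */
static bool
pending_note_truncate(uint32_t relfilenode, uint32_t old_nblocks)
{
    PendingSyncEntry *e = pending_find(relfilenode, true);

    if (e == NULL)
        return false;
    if (old_nblocks > e->max_blocks)
        e->max_blocks = old_nblocks;
    return true;
}

/* True if the file was ever larger than it is at commit. */
static bool
pending_shrank(uint32_t relfilenode, uint32_t nblocks_at_commit)
{
    PendingSyncEntry *e = pending_find(relfilenode, false);

    return e != NULL && e->max_blocks > nblocks_at_commit;
}

/* Discard all entries; needed at abort as well as at commit. */
static void
pending_reset(void)
{
    memset(pending, 0, sizeof(pending));
}
```

The table is keyed by relfilenode and lives outside SMgrRelation and
Relation, so it survives invalidation, and it has to be reset
explicitly at abort.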

smgrDoPendingSyncs and RelFileNodeSkippingWAL() become simpler by
using the hash.

It is not fully checked. I haven't merged it or measured performance
yet, but I am posting the status-quo patch for now.

- v25-0001-version-nm.patch

Noah's v24 patch.

- v25-0002-Revert-FlushRelationBuffersWithoutRelcache.patch

Remove a useless function (added by this patch).

- v25-0003-Improve-the-performance-of-relation-syncs.patch

Make smgrDoPendingSyncs scan shared buffers only once.

- v25-0004-Adjust-gistGetFakeLSN.patch

Amendment for gistGetFakeLSN. This uses GetXLogInsertRecPtr as long as
its value differs from the previous call, and emits a dummy WAL record
if we need a new LSN. Since no record other than switch_wal can be
empty, the dummy WAL record has an integer content for now.

- v25-0005-Sync-files-shrinked-by-truncation.patch

Amendment for the truncation problem.
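
The behavior described for the gistGetFakeLSN amendment (0004) can be
sketched as a small simulation. This is not the patch code: the insert
pointer is a plain variable standing in for GetXLogInsertRecPtr, the
dummy record only advances that variable, and all names and the record
size are assumptions.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t XLogRecPtr;

/* Stand-in for the value GetXLogInsertRecPtr() would return. */
static XLogRecPtr sim_insert_ptr = 1000;

/* Value handed out by the previous call. */
static XLogRecPtr last_fake_lsn = 0;

/* Number of dummy records emitted so far. */
static int dummy_records = 0;

/* Pretend to insert a small non-empty record, advancing the pointer. */
static void
sim_emit_dummy_record(void)
{
    sim_insert_ptr += 32;       /* hypothetical record size */
    dummy_records++;
}

/* Return an LSN that always differs from the previous call's result. */
static XLogRecPtr
sim_get_fake_lsn(void)
{
    /* No WAL was inserted since the last call: force the pointer ahead. */
    if (sim_insert_ptr == last_fake_lsn)
        sim_emit_dummy_record();

    assert(sim_insert_ptr > last_fake_lsn);
    last_fake_lsn = sim_insert_ptr;
    return last_fake_lsn;
}
```

Two back-to-back calls cost one dummy record, while a call made after
other WAL has been inserted costs none.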

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

Attachment                                                 Content-Type  Size
v25-0001-version-nm.patch                                  text/x-patch  70.5 KB
v25-0002-Revert-FlushRelationBuffersWithoutRelcache.patch  text/x-patch  3.3 KB
v25-0003-Improve-the-performance-of-relation-syncs.patch   text/x-patch  8.6 KB
v25-0004-Adjust-gistGetFakeLSN.patch                       text/x-patch  5.1 KB
v25-0005-Sync-files-shrinked-by-truncation.patch           text/x-patch  10.0 KB
