Re: [HACKERS] WAL logging problem in 9.4.3?

From: Noah Misch <noah(at)leadboat(dot)com>
To: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
Cc: pgsql-hackers(at)postgresql(dot)org, 9erthalion6(at)gmail(dot)com, andrew(dot)dunstan(at)2ndquadrant(dot)com, hlinnaka(at)iki(dot)fi, robertmhaas(at)gmail(dot)com, michael(at)paquier(dot)xyz
Subject: Re: [HACKERS] WAL logging problem in 9.4.3?
Date: 2019-05-17 06:50:50
Message-ID: 20190517065050.GA1298884@rfd.leadboat.com
Lists: pgsql-hackers

On Tue, May 14, 2019 at 01:59:10PM +0900, Kyotaro HORIGUCHI wrote:
> At Sun, 12 May 2019 17:37:05 -0700, Noah Misch <noah(at)leadboat(dot)com> wrote in <20190513003705(dot)GA1202614(at)rfd(dot)leadboat(dot)com>
> > On Sun, Mar 31, 2019 at 03:31:58PM -0700, Noah Misch wrote:
> > > On Sun, Mar 10, 2019 at 07:27:08PM -0700, Noah Misch wrote:
> > > > I also liked the design in the https://postgr.es/m/559FA0BA.3080808@iki.fi
> > > > last paragraph, and I suspect it would have been no harder to back-patch. I
> > > > wonder if it would have been simpler and better, but I'm not asking anyone to
> > > > investigate that.
> > >
> > > Now I am asking for that. Would anyone like to try implementing that other
> > > design, to see how much simpler it would be?
>
> Yeah, I think it is a bit too-complex for the value. But I think
> it is the best way as far as we keep reusing a file on
> truncation of the whole file.

The design of v11-0006-Fix-WAL-skipping-feature.patch doesn't, in general,
work for WAL records touching more than one buffer. For heapam, that patch
works around this problem by emitting XLOG_HEAP_INSERT or XLOG_HEAP_DELETE
when we'd normally emit XLOG_HEAP_UPDATE. As a result, post-crash-recovery
heap page bits differ from the bits present when we don't crash. Though I'm
85% confident this does not introduce a bug today, this is fragile. That is
the main complexity I wish to avoid.

I suspect the design in the last paragraph of
https://postgr.es/m/559FA0BA.3080808@iki.fi will be simpler, not more complex.
In the implementation I'm envisioning, smgrDoPendingDeletes() would be renamed,
perhaps to
AtEOXact_Storage(). For every relfilenode it does not delete, it would ensure
durability by syncing (for large nodes) or by WAL-logging each page (for small
nodes). RelationNeedsWAL() would return false whenever the applicable
relfilenode appears in pendingDeletes. Access methods would remove their
smgrimmedsync() calls, but they would otherwise not change. Would anyone like
to try implementing that?
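
To make the shape of that proposal concrete, here is a rough standalone sketch of the commit-time logic. It is a toy model, not PostgreSQL code: PendingRel, EndAction, SYNC_THRESHOLD_PAGES, relation_needs_wal() and at_eoxact_storage_commit() are all names invented for illustration, the 64-page cutoff is an arbitrary assumption, and the real work (fsync, emitting full-page WAL records, unlinking files) is reduced to recording which action would be taken.

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumed cutoff between "small" (WAL-log each page) and "large" (sync). */
#define SYNC_THRESHOLD_PAGES 64

/* What the end-of-transaction pass decided for one relfilenode. */
typedef enum
{
    END_NONE,
    END_DELETED,
    END_WAL_LOGGED,
    END_SYNCED
} EndAction;

/* Toy analogue of a pendingDeletes entry; hypothetical, not the real struct. */
typedef struct PendingRel
{
    unsigned    relfilenode;
    unsigned    npages;
    bool        delete_at_commit;   /* false: would be deleted only at abort */
    EndAction   result;
} PendingRel;

/*
 * Toy analogue of RelationNeedsWAL(): WAL can be skipped for any
 * relfilenode that appears in the pending list.
 */
bool
relation_needs_wal(const PendingRel *pending, size_t n, unsigned relfilenode)
{
    for (size_t i = 0; i < n; i++)
    {
        if (pending[i].relfilenode == relfilenode)
            return false;
    }
    return true;
}

/*
 * Commit-time pass: delete what is scheduled for deletion, and make every
 * surviving relfilenode durable by syncing it (large) or WAL-logging each
 * of its pages (small).  Returns the number of pages that would be
 * WAL-logged.
 */
unsigned
at_eoxact_storage_commit(PendingRel *pending, size_t n)
{
    unsigned    pages_logged = 0;

    for (size_t i = 0; i < n; i++)
    {
        PendingRel *rel = &pending[i];

        if (rel->delete_at_commit)
            rel->result = END_DELETED;
        else if (rel->npages >= SYNC_THRESHOLD_PAGES)
            rel->result = END_SYNCED;
        else
        {
            pages_logged += rel->npages;
            rel->result = END_WAL_LOGGED;
        }
    }
    return pages_logged;
}
```

In this model, a 4-page relfilenode created in the transaction would be WAL-logged page by page at commit, a 1000-page one would be synced instead, and neither would have generated WAL before that point. Where the real threshold should sit, and what the abort path does, are left open here.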
