Re: WAL logging problem in 9.4.3?

From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
To: michael(dot)paquier(at)gmail(dot)com
Cc: tgl(at)sss(dot)pgh(dot)pa(dot)us, alvherre(at)2ndquadrant(dot)com, david(at)pgmasters(dot)net, hlinnaka(at)iki(dot)fi, simon(at)2ndquadrant(dot)com, andres(at)anarazel(dot)de, masao(dot)fujii(at)gmail(dot)com, kleptog(at)svana(dot)org, pgsql-hackers(at)postgresql(dot)org
Subject: Re: WAL logging problem in 9.4.3?
Date: 2017-04-13 09:42:19
Message-ID: 20170413.184219.106482305.horiguchi.kyotaro@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

At Thu, 13 Apr 2017 13:52:40 +0900, Michael Paquier <michael(dot)paquier(at)gmail(dot)com> wrote in <CAB7nPqTRyica1d-zU+YckveFC876=Sc847etmk7TRgAS2pA9CA(at)mail(dot)gmail(dot)com>
> On Tue, Apr 11, 2017 at 5:38 PM, Kyotaro HORIGUCHI
> <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp> wrote:
> > Sorry, what I have just sent was broken.
>
> You can use PROVE_TESTS when running make check to select a subset of
> tests you want to run. I use that all the time when working on patches
> dedicated to certain code paths.

Thank you for the information. Removing unwanted test scripts
from t/ directories was annoyance. This makes me happy.

> >> - Relation has new members no_pending_sync and pending_sync that
> >> works as instant cache of an entry in pendingSync hash.
> >> - Commit-time synchronizing is restored as Michael's patch.
> >> - If relfilenode is replaced, pending_sync for the old node is
> >> removed. Anyway this is ignored on abort and meaningless on
> >> commit.
> >> - TAP test is renamed to 012 since some new files have been added.
> >>
> >> Accessing pending sync hash occurred on every calling of
> >> HeapNeedsWAL() (per insertion/update/freeze of a tuple) if any of
> >> accessing relations has pending sync. Almost of them are
> >> eliminated as the result.
>
> Did you actually test this patch? One of the logs added makes the
> tests a long time to run:

Maybe this patch requires make clean since it extends the
structure RelationData. (Perhaps I saw the same trouble.)

> 2017-04-13 12:11:27.065 JST [85441] t/102_vacuumdb_stages.pl
> STATEMENT: ANALYZE;
> 2017-04-13 12:12:25.766 JST [85492] LOG: BufferNeedsWAL: pendingSyncs
> = 0x0, no_pending_sync = 0
>
> - lsn = XLogInsert(RM_SMGR_ID,
> - XLOG_SMGR_TRUNCATE | XLR_SPECIAL_REL_UPDATE);
> + rel->no_pending_sync= false;
> + rel->pending_sync = pending;
> + }
>
> It seems to me that those flags and the pending_sync data should be
> kept in the context of backend process and not be part of the Relation
> data...

I understand that the context of "backend process" means
storage.c local. I don't mind the context on which the data is,
but I found only there that can get rid of frequent hash
searching. For pending deletions, just appending to a list is
enough and costs almost nothing, on the other hand pendig syncs
are required to be referenced, sometimes very frequently.

> +void
> +RecordPendingSync(Relation rel)
> I don't think that I agree that this should be part of relcache.c. The
> syncs are tracked should be tracked out of the relation context.

Yeah.. It's in storage.c in the latest patch. (Sorry for the
duplicate name). I think it is a kind of bond between smgr and
relation.

> Seeing how invasive this change is, I would also advocate for this
> patch as only being a HEAD-only change, not many people are
> complaining about this optimization of TRUNCATE missing when wal_level
> = minimal, and this needs a very careful review.

Agreed.

> Should I code something? Or Horiguchi-san, would you take care of it?
> The previous crash I saw has been taken care of, but it's been really
> some time since I looked at this patch...

My point is hash-search on every tuple insertion should be evaded
even if it happens rearely. Once it was a bit apart from your
original patch, but in the latest patch the significant part
(pending-sync hash) is revived from the original one.

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Magnus Hagander 2017-04-13 09:44:13 Re: Function to control physical replication slot
Previous Message Amit Langote 2017-04-13 09:36:27 Re: Minor improvement for inval.c header comment