Re: WAL logging problem in 9.4.3?

From: Daniel Gustafsson <daniel(at)yesql(dot)se>
To: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
Cc: michael(dot)paquier(at)gmail(dot)com, tgl(at)sss(dot)pgh(dot)pa(dot)us, alvherre(at)2ndquadrant(dot)com, david(at)pgmasters(dot)net, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, simon(at)2ndquadrant(dot)com, andres(at)anarazel(dot)de, masao(dot)fujii(at)gmail(dot)com, kleptog(at)svana(dot)org, pgsql-hackers(at)postgresql(dot)org
Subject: Re: WAL logging problem in 9.4.3?
Date: 2017-09-05 10:05:01
Message-ID: B3EC34FC-A48E-41AA-8598-BFC5D87CB383@yesql.se
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> On 13 Apr 2017, at 11:42, Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp> wrote:
>
> At Thu, 13 Apr 2017 13:52:40 +0900, Michael Paquier <michael(dot)paquier(at)gmail(dot)com> wrote in <CAB7nPqTRyica1d-zU+YckveFC876=Sc847etmk7TRgAS2pA9CA(at)mail(dot)gmail(dot)com>
>> On Tue, Apr 11, 2017 at 5:38 PM, Kyotaro HORIGUCHI
>> <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp> wrote:
>>> Sorry, what I have just sent was broken.
>>
>> You can use PROVE_TESTS when running make check to select a subset of
>> tests you want to run. I use that all the time when working on patches
>> dedicated to certain code paths.
>
> Thank you for the information. Removing unwanted test scripts
> from t/ directories was annoyance. This makes me happy.
>
>>>> - Relation has new members no_pending_sync and pending_sync that
>>>> works as instant cache of an entry in pendingSync hash.
>>>> - Commit-time synchronizing is restored as Michael's patch.
>>>> - If relfilenode is replaced, pending_sync for the old node is
>>>> removed. Anyway this is ignored on abort and meaningless on
>>>> commit.
>>>> - TAP test is renamed to 012 since some new files have been added.
>>>>
>>>> Accessing pending sync hash occurred on every calling of
>>>> HeapNeedsWAL() (per insertion/update/freeze of a tuple) if any of
>>>> accessing relations has pending sync. Almost of them are
>>>> eliminated as the result.
>>
>> Did you actually test this patch? One of the logs added makes the
>> tests a long time to run:
>
> Maybe this patch requires make clean since it extends the
> structure RelationData. (Perhaps I saw the same trouble.)
>
>> 2017-04-13 12:11:27.065 JST [85441] t/102_vacuumdb_stages.pl
>> STATEMENT: ANALYZE;
>> 2017-04-13 12:12:25.766 JST [85492] LOG: BufferNeedsWAL: pendingSyncs
>> = 0x0, no_pending_sync = 0
>>
>> - lsn = XLogInsert(RM_SMGR_ID,
>> - XLOG_SMGR_TRUNCATE | XLR_SPECIAL_REL_UPDATE);
>> + rel->no_pending_sync= false;
>> + rel->pending_sync = pending;
>> + }
>>
>> It seems to me that those flags and the pending_sync data should be
>> kept in the context of backend process and not be part of the Relation
>> data...
>
> I understand that the context of "backend process" means
> storage.c local. I don't mind the context on which the data is,
> but I found only there that can get rid of frequent hash
> searching. For pending deletions, just appending to a list is
> enough and costs almost nothing, on the other hand pendig syncs
> are required to be referenced, sometimes very frequently.
>
>> +void
>> +RecordPendingSync(Relation rel)
>> I don't think that I agree that this should be part of relcache.c. The
>> syncs are tracked should be tracked out of the relation context.
>
> Yeah.. It's in storage.c in the latest patch. (Sorry for the
> duplicate name). I think it is a kind of bond between smgr and
> relation.
>
>> Seeing how invasive this change is, I would also advocate for this
>> patch as only being a HEAD-only change, not many people are
>> complaining about this optimization of TRUNCATE missing when wal_level
>> = minimal, and this needs a very careful review.
>
> Agreed.
>
>> Should I code something? Or Horiguchi-san, would you take care of it?
>> The previous crash I saw has been taken care of, but it's been really
>> some time since I looked at this patch...
>
> My point is hash-search on every tuple insertion should be evaded
> even if it happens rearely. Once it was a bit apart from your
> original patch, but in the latest patch the significant part
> (pending-sync hash) is revived from the original one.

This patch has followed along since CF 2016-03, do we think we can reach a
conclusion in this CF? It was marked as "Waiting on Author”, based on
developments since in this thread, I’ve changed it back to “Needs Review”
again.

cheers ./daniel

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Daniel Gustafsson 2017-09-05 10:15:52 Re: [PROPOSAL] Use SnapshotAny in get_actual_variable_range
Previous Message Nikolay Shaplov 2017-09-05 09:55:40 Re: [PATCH] Move all am-related reloption code into src/backend/access/[am-name] and get rid of relopt_kind for custom AM