Re: WAL logging problem in 9.4.3?

From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andres Freund <andres(at)anarazel(dot)de>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Martijn van Oosterhout <kleptog(at)svana(dot)org>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: WAL logging problem in 9.4.3?
Date: 2016-02-19 07:33:21
Message-ID: CAB7nPqTAP6_8c0nf8m=OTw16TsB7waE3Zz2HsxPuoercnC2qxA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Feb 18, 2016 at 4:27 PM, Michael Paquier
<michael(dot)paquier(at)gmail(dot)com> wrote:
> On Thu, Feb 4, 2016 at 3:24 PM, Heikki Linnakangas <hlinnaka(at)iki(dot)fi> wrote:
>> I dropped the ball on this one back in July, so here's an attempt to revive
>> this thread.
>>
>> I spent some time fixing the remaining issues with the prototype patch I
>> posted earlier, and rebased that on top of current git master. See attached.
>>
>> Some review of that would be nice. If there are no major issues with it, I'm
>> going to create backpatchable versions of this for 9.4 and below.
>
> I am going to look into that very soon. For now and to not forget
> about this bug, I have added an entry in the CF app:
> https://commitfest.postgresql.org/9/528/

Worth noting that this patch does not address the problem with index
relations when a TRUNCATE is used in the same transaction as its
CREATE TABLE, take that for example when wal_level = minimal:
1) Run transaction
=# begin;
BEGIN
=# create table ab (a int primary key);
CREATE TABLE
=# truncate ab;
TRUNCATE TABLE
=# commit;
COMMIT
2) Restart server with immediate mode.
3) Failure:
=# table ab;
ERROR: XX001: could not read block 0 in file "base/16384/16388": read
only 0 of 8192 bytes
LOCATION: mdread, md.c:728

The case where a COPY is issued after TRUNCATE is fixed though, so
that's still an improvement.

Here are other comments.

+ /* Flush updates to relations that we didn't WAL-logged */
+ smgrDoPendingSyncs(true);
"Flush updates to relations there were not WAL-logged"?

+void
+FlushRelationBuffersWithoutRelCache(RelFileNode rnode, bool islocal)
+{
+ FlushRelationBuffers_common(smgropen(rnode, InvalidBackendId), islocal);
+}
islocal is always set as false, I'd rather remove it this argument
from FlushRelationBuffersWithoutRelCache.

for (i = 0; i < nrels; i++)
+ {
smgrclose(srels[i]);
+ }
Looks like noise.

+ if (!found)
+ {
+ pending->truncated_to = InvalidBlockNumber;
+ pending->sync_above = nblocks;
+
+ elog(DEBUG2, "registering new pending sync for rel %u/%u/%u at
block %u",
+ rnode.spcNode, rnode.dbNode, rnode.relNode, nblocks);
+
+ }
+ else if (pending->sync_above == InvalidBlockNumber)
+ {
+ elog(DEBUG2, "registering pending sync for rel %u/%u/%u at block %u",
+ rnode.spcNode, rnode.dbNode, rnode.relNode, nblocks);
+ pending->sync_above = nblocks;
+ }
+ else
Here couldn't it be possible that when (sync_above !=
InvalidBlockNumber), nblocks can be higher than sync_above? In which
case we had better increase sync_above to nblocks, no?

+ if (!pendingSyncs)
+ createPendingSyncsHash();
+ pending = (PendingRelSync *) hash_search(pendingSyncs,
+ (void *) &rel->rd_node,
+ HASH_ENTER, &found);
This is lacking comments.

- if (XLogHintBitIsNeeded() && (bufHdr->flags & BM_PERMANENT))
+ BufferGetTag(buffer, &rnode, &forknum, &blknum);
+ if (XLogHintBitIsNeeded() && (bufHdr->flags & BM_PERMANENT) &&
+ !smgrIsSyncPending(rnode, blknum))
Here as well explaining in more details why the buffer does not need
to go through XLogSaveBufferForHint would be nice.
--
Michael

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Craig Ringer 2016-02-19 07:57:01 Re: MinGW / Windows / printf format specifiers
Previous Message Amit Kapila 2016-02-19 05:20:25 Re: Typo in bufmgr.c that result in waste of memory