Re: [BUGS] BUG #6425: Bus error in slot_deform_tuple

From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: [BUGS] BUG #6425: Bus error in slot_deform_tuple
Date: 2012-02-04 18:37:40
Message-ID: CA+U5nMJ8SjFkkwZ5d3dRORSrgykL3Fwd-1DKcZu25RRi4GXkQg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs pgsql-hackers

On Fri, Feb 3, 2012 at 6:45 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> The reason it is relevant
> to our current problem is that even though RestoreBkpBlocks faithfully
> takes exclusive lock on the buffer, *that is not enough to guarantee
> that no one else is touching that buffer*.  Another backend that has
> already located a visible tuple on a page is entitled to keep accessing
> that tuple with only a buffer pin.  So the existing code transiently
> wipes the data from underneath the other backend's pin.

While deciding whether to apply the patch, I'm thinking about whether
we should be doing this at all. We already agreed that backup blocks
were removable from the WAL stream.

The cause here is data changing underneath the user. Your patch solves
the most obvious error, but it still allows other problems if applying
the backup block changes data. If the backup block doesn't do anything
at all then we don't need to apply it either.

So ISTM that we should just skip applying backup blocks over the top
of existing buffers once we have reached consistency.

Patch to do that attached, but the basic part of it is just this...

@@ -3700,8 +3701,21 @@ RestoreBkpBlocks(XLogRecPtr lsn, XLogRecord
*record, bool cleanup)
memcpy(&bkpb, blk, sizeof(BkpBlock));
blk += sizeof(BkpBlock);

+ hit = false;
buffer = XLogReadBufferExtended(bkpb.node, bkpb.fork,
bkpb.block,
-
RBM_ZERO);
+
RBM_ZERO, &hit);
+
+ /*
+ * If we found the block in shared buffers and we are already
+ * consistent then skip applying the backup block. The block
+ * was already removable anyway, so we can skip
without problems.
+ * This avoids us needing to take a cleanup lock in
all cases when
+ * we apply backup blocks because of potential effects
on user queries,
+ * which expect data on blocks to remain constant
while being read.
+ */
+ if (reachedConsistency && hit)
+ continue;
+
Assert(BufferIsValid(buffer));
if (cleanup)
LockBufferForCleanup(buffer);

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Simon Riggs 2012-02-04 18:39:19 Re: [BUGS] BUG #6425: Bus error in slot_deform_tuple
Previous Message Simon Riggs 2012-02-04 16:11:43 Re: BUG #6425: Bus error in slot_deform_tuple

Browse pgsql-hackers by date

  From Date Subject
Next Message Simon Riggs 2012-02-04 18:39:19 Re: [BUGS] BUG #6425: Bus error in slot_deform_tuple
Previous Message Jeff Janes 2012-02-04 18:12:16 Re: Memory usage during sorting