Re: Enabling Checksums

From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Enabling Checksums
Date: 2013-01-10 06:06:40
Message-ID: 1357798000.5162.38.camel@jdavis-laptop
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, 2012-12-04 at 01:03 -0800, Jeff Davis wrote:
> For now, I rebased the patches against master, and did some very minor
> cleanup.

I think there is a problem here when setting PD_ALL_VISIBLE. I thought I
had analyzed that before, but upon review, it doesn't look right.
Setting PD_ALL_VISIBLE needs to be associated with a WAL action somehow,
and a bumping of the LSN, otherwise there is a torn page hazard.

The solution doesn't seem particularly difficult, but there are a few
strange aspects and I'm not sure exactly which path I should take.

First of all, the relationship between MarkBufferDirty and
SetBufferCommitInfoNeedsSave is a little confusing. The comment over
MarkBufferDirty is confusing because it says that the caller must have
an exclusive lock, or else bad data could be written. But that doesn't
have to do with marking the buffer dirty, that has to do with the data
page change you make while you are marking it dirty -- if it's a single
bit change, then there is no risk that I can see.

In the current code, the only real code difference between the two is
that SetBufferCommitInfoNeedsSave might fail to mark the buffer dirty if
there is a race. So, in the current code, we could actually combine the
two by passing a "force" flag (if true, behaves like MarkBufferDirty, if
false, behaves like SetBufferCommitInfoNeedsSave).

The checksums patch also introduces another behavior into
SetBufferCommitInfoNeedsSave, which is to write an XLOG_HINT WAL record
if checksums are enabled (to avoid torn page hazards). That's only
necessary for changes where the caller does not write WAL itself and
doesn't bump the LSN of the data page. (There's a reason the caller
can't easily write the XLOG_HINT WAL itself.) So, we could introduce
another flag "needsWAL" that would control whether we write the
XLOG_HINT WAL or not (only applies with checksums on, of course).

The reason for all of this is because the setting of PD_ALL_VISIBLE does
not fit MarkBufferDirty, because MarkBufferDirty does not write the
XLOG_HINT WAL and neither does the caller. But it also doesn't fit
SetBufferCommitInfoNeedsSave, because that is subject to a race. If
MarkBufferDirty had the signature:

MarkBufferDirty(Buffer buffer, bool force, bool needsWAL)

then "normal" page changes would look like:
MarkBufferDirty(buffer, true, false)
setting PD_ALL_VISIBLE would look like:
MarkBufferDirty(buffer, true, true)
and setting a hint would look like:
MarkBufferDirty(buffer, false, true)

Another approach would be for the caller who sets PD_ALL_VISIBLE to
write WAL. But that requires inventing a new WAL record or chaining the
heap block onto the wal entry when doing visibilitymap_set (only
necessary when checksums are on). That seems somewhat of a hack, but
perhaps it's not too bad.

Also, I have another patch posted that is removing PD_ALL_VISIBLE
entirely, which is dampening my enthusiasm to do too much work that
might be thrown away. So right now, I'm leaning toward just adding the
heap buffer to the WAL chain during visibilitymap_set.

Thoughts?

Regards,
Jeff Davis

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2013-01-10 06:15:47 Re: 9.2.1 & index-only scans : abnormal heap fetches after VACUUM FULL
Previous Message Tom Lane 2013-01-10 05:14:46 Re: Lock levels for ALTER TABLE