Re: Reviewing freeze map code

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Reviewing freeze map code
Date: 2016-06-01 07:50:09
Message-ID: CAD21AoBgUSo4KBzb-zotnWPJXPyaqQE4v8jfY4cwKkcaw9h52w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sat, May 7, 2016 at 5:34 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Mon, May 2, 2016 at 8:25 PM, Andres Freund <andres(at)anarazel(dot)de> wrote:
>> + * heap_tuple_needs_eventual_freeze
>> + *
>> + * Check to see whether any of the XID fields of a tuple (xmin, xmax, xvac)
>> + * will eventually require freezing. Similar to heap_tuple_needs_freeze,
>> + * but there's no cutoff, since we're trying to figure out whether freezing
>> + * will ever be needed, not whether it's needed now.
>> + */
>> +bool
>> +heap_tuple_needs_eventual_freeze(HeapTupleHeader tuple)
>>
>> Wouldn't redefining this to heap_tuple_is_frozen() and then inverting the
>> checks be easier to understand?
>
> I thought it much safer to keep this as close to a copy of
> heap_tuple_needs_freeze() as possible. Copying a function and
> inverting all of the return values is much more likely to introduce
> bugs, IME.

I agree.

>> + /*
>> + * If xmax is a valid xact or multixact, this tuple is also not frozen.
>> + */
>> + if (tuple->t_infomask & HEAP_XMAX_IS_MULTI)
>> + {
>> + MultiXactId multi;
>> +
>> + multi = HeapTupleHeaderGetRawXmax(tuple);
>> + if (MultiXactIdIsValid(multi))
>> + return true;
>> + }
>>
>> Hm. What's the test inside the if() for? There shouldn't be any case
>> where xmax is invalid if HEAP_XMAX_IS_MULTI is set. Now there's a
>> check like that outside of this commit, but it seems strange to me
>> (Alvaro, perhaps you could comment on this?).
>
> Here again I was copying existing code, with appropriate simplifications.
>
>> + *
>> + * Clearing both visibility map bits is not separately WAL-logged. The callers
>> * must make sure that whenever a bit is cleared, the bit is cleared on WAL
>> * replay of the updating operation as well.
>>
>> I think including "both" here makes things less clear, because it
>> differentiates clearing one bit from clearing both. There's no practical
>> differentce atm, but still.
>
> I agree.

Fixed.

>> *
>> * VACUUM will normally skip pages for which the visibility map bit is set;
>> * such pages can't contain any dead tuples and therefore don't need vacuuming.
>> - * The visibility map is not used for anti-wraparound vacuums, because
>> - * an anti-wraparound vacuum needs to freeze tuples and observe the latest xid
>> - * present in the table, even on pages that don't have any dead tuples.
>> *
>>
>> I think the remaining sentence isn't entirely accurate, there's now more
>> than one bit, and they're different with regard to scan_all/!scan_all
>> vacuums (or will be - maybe this updated further in a later commit? But
>> if so, that sentence shouldn't yet be removed...).
>
> We can adjust the language, but I don't really see a big problem here.

This comment is not incorporate this patch so far.

>> -/* Number of heap blocks we can represent in one byte. */
>> -#define HEAPBLOCKS_PER_BYTE 8
>> -
>> Hm, why was this moved to the header? Sounds like something the outside
>> shouldn't care about.
>
> Oh... yeah. Let's undo that.

Fixed.

>> #define HEAPBLK_TO_MAPBIT(x) (((x) % HEAPBLOCKS_PER_BYTE) * BITS_PER_HEAPBLOCK)
>>
>> Hm. This isn't really a mapping to an individual bit anymore - but I
>> don't really have a better name in mind. Maybe TO_OFFSET?
>
> Well, it sorta is... but we could change it, I suppose.
>
>> +static const uint8 number_of_ones_for_visible[256] = {
>> ...
>> +};
>> +static const uint8 number_of_ones_for_frozen[256] = {
>> ...
>> };
>>
>> Did somebody verify the new contents are correct?
>
> I admit that I didn't. It seemed like an unlikely place for a goof,
> but I guess we should verify.
>> /*
>> - * visibilitymap_clear - clear a bit in visibility map
>> + * visibilitymap_clear - clear all bits in visibility map
>> *
>>
>> This seems rather easy to misunderstand, as this really only clears all
>> the bits for one page, not actually all the bits.
>
> We could change "in" to "for one page in the".

Fixed.

>> * the bit for heapBlk, or InvalidBuffer. The caller is responsible for
>> - * releasing *buf after it's done testing and setting bits.
>> + * releasing *buf after it's done testing and setting bits, and must pass flags
>> + * for which it needs to check the value in visibility map.
>> *
>> * NOTE: This function is typically called without a lock on the heap page,
>> * so somebody else could change the bit just after we look at it. In fact,
>> @@ -327,17 +351,16 @@ visibilitymap_set(Relation rel, BlockNumber heapBlk, Buffer heapBuf,
>>
>> I'm not seing what flags the above comment change is referring to?
>
> Ugh. I think that's leftover cruft from an earlier patch version that
> should have been excised from what got committed.

Fixed.

>> /*
>> - * A single-bit read is atomic. There could be memory-ordering effects
>> + * A single byte read is atomic. There could be memory-ordering effects
>> * here, but for performance reasons we make it the caller's job to worry
>> * about that.
>> */
>> - result = (map[mapByte] & (1 << mapBit)) ? true : false;
>> -
>> - return result;
>> + return ((map[mapByte] >> mapBit) & VISIBILITYMAP_VALID_BITS);
>> }
>>
>> Not a new issue, and *very* likely to be irrelevant in practice (given
>> the value is only referenced once): But there's really no guarantee
>> map[mapByte] is only read once here.
>
> Meh. But we can fix if you want to.

Fixed.

>> -BlockNumber
>> -visibilitymap_count(Relation rel)
>> +void
>> +visibilitymap_count(Relation rel, BlockNumber *all_visible, BlockNumber *all_frozen)
>>
>> Not really a new issue again: The parameter types (previously return
>> type) to this function seem wrong to me.
>
> Not this patch's job to tinker.

This comment is not incorporate this patch yet.

>> @@ -1934,5 +1992,14 @@ heap_page_is_all_visible(Relation rel, Buffer buf, TransactionId *visibility_cut
>> }
>> + /*
>> + * We don't bother clearing *all_frozen when the page is discovered not
>> + * to be all-visible, so do that now if necessary. The page might fail
>> + * to be all-frozen for other reasons anyway, but if it's not all-visible,
>> + * then it definitely isn't all-frozen.
>> + */
>> + if (!all_visible)
>> + *all_frozen = false;
>> +
>>
>> Why don't we just set *all_frozen to false when appropriate? It'd be
>> just as many lines and probably easier to understand?
>
> I thought that looked really easy to mess up, either now or down the
> road. This way seemed more solid to me. That's a judgement call, of
> course.

To be understanding easier, I changed it so.

>> + /*
>> + * If the page is marked as all-visible but not all-frozen, we should
>> + * so mark it. Note that all_frozen is only valid if all_visible is
>> + * true, so we must check both.
>> + */
>>
>> This kinda seems to imply that all-visible implies all_frozen. Also, why
>> has that block been added to the end of the if/else if chain? Seems like
>> it belongs below the (all_visible && !all_visible_according_to_vm) block.
>
> We can adjust the comment a bit to make it more clear, if you like,
> but I doubt it's going to cause serious misunderstanding. As for the
> placement, the reason I put it at the end is because I figured that we
> did not want to mark it all-frozen if any of the "oh crap, emit a
> warning" cases applied.
>

Fixed comment.
I think that we should care about all-visible problem first, and then
care all-frozen problem.
So this patch doesn't change the placement.

Attached patch fixes only above comments, other are being addressed now.

--
Regards,

--
Masahiko Sawada

Attachment Content-Type Size
fix_freeze_map_a892234_v1.patch text/x-diff 8.1 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2016-06-01 08:26:08 Re: BUG #14155: bloom index error with unlogged table
Previous Message Kyotaro HORIGUCHI 2016-06-01 06:09:43 Re: Parallel pg_dump's error reporting doesn't work worth squat