| From: | Andres Freund <andres(at)anarazel(dot)de> | 
|---|---|
| To: | Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> | 
| Cc: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org> | 
| Subject: | Re: Reviewing freeze map code | 
| Date: | 2016-07-14 06:06:07 | 
| Message-ID: | 20160714060607.klwgq2qr7egt3zrr@alap3.anarazel.de | 
| Lists: | pgsql-hackers | 
Hi,
So I'm generally happy with 0001, barring some relatively minor
adjustments. I am however wondering about one thing:
On 2016-07-11 23:51:05 +0900, Masahiko Sawada wrote:
> diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
> index 57da57a..e7cb8ca 100644
> --- a/src/backend/access/heap/heapam.c
> +++ b/src/backend/access/heap/heapam.c
> @@ -3923,6 +3923,16 @@ l2:
>
>  	if (need_toast || newtupsize > pagefree)
>  	{
> +		/*
> +		 * For crash safety, we need to emit that xmax of old tuple is set
> +		 * and clear only the all-frozen bit on visibility map if needed
> +		 * before releasing the buffer. We can reuse xl_heap_lock for this
> +		 * purpose. It should be fine even if we crash midway from this
> +		 * section and the actual updating one later, since the xmax will
> +		 * appear to come from an aborted xid.
> +		 */
> +		START_CRIT_SECTION();
> +
>  		/* Clear obsolete visibility flags ... */
>  		oldtup.t_data->t_infomask &= ~(HEAP_XMAX_BITS | HEAP_MOVED);
>  		oldtup.t_data->t_infomask2 &= ~HEAP_KEYS_UPDATED;
> @@ -3936,6 +3946,28 @@ l2:
>  		/* temporarily make it look not-updated */
>  		oldtup.t_data->t_ctid = oldtup.t_self;
>  		already_marked = true;
> +
> +		MarkBufferDirty(buffer);
> +
> +		if (RelationNeedsWAL(relation))
> +		{
> +			xl_heap_lock xlrec;
> +			XLogRecPtr recptr;
> +
> +			XLogBeginInsert();
> +			XLogRegisterBuffer(0, buffer, REGBUF_STANDARD);
> +
> +			xlrec.offnum = ItemPointerGetOffsetNumber(&oldtup.t_self);
> +			xlrec.locking_xid = xmax_old_tuple;
> +			xlrec.infobits_set = compute_infobits(oldtup.t_data->t_infomask,
> +												  oldtup.t_data->t_infomask2);
> +			XLogRegisterData((char *) &xlrec, SizeOfHeapLock);
> +			recptr = XLogInsert(RM_HEAP_ID, XLOG_HEAP_LOCK);
> +			PageSetLSN(page, recptr);
> +		}
Master does
		/* temporarily make it look not-updated */
		oldtup.t_data->t_ctid = oldtup.t_self;
here, and as is, the WAL record won't reflect that, because:
static void
heap_xlog_lock(XLogReaderState *record)
{
...
		/*
		 * Clear relevant update flags, but only if the modified infomask says
		 * there's no update.
		 */
		if (HEAP_XMAX_IS_LOCKED_ONLY(htup->t_infomask))
		{
			HeapTupleHeaderClearHotUpdated(htup);
			/* Make sure there is no forward chain link in t_ctid */
			ItemPointerSet(&htup->t_ctid,
						   BufferGetBlockNumber(buffer),
						   offnum);
		}
won't enter the branch, because HEAP_XMAX_LOCK_ONLY won't be set. That
will leave t_ctid and HEAP_HOT_UPDATED set differently on the master and
on the standby / after crash recovery. I'm failing to see any harmful
consequences right now, but differences between master and standby are a
bad thing. Pre-9.3 that's not a problem, since we reset ctid and
HOT_UPDATED unconditionally there. I think I'm more comfortable with
setting HEAP_XMAX_LOCK_ONLY until the tuple is finally updated - that also
coincides more closely with the actual meaning.
Any arguments against?
>
> +		/* Clear only the all-frozen bit on visibility map if needed */
> +		if (PageIsAllVisible(BufferGetPage(buffer)) &&
> +			VM_ALL_FROZEN(relation, block, &vmbuffer))
> +		{
> +			visibilitymap_clear_extended(relation, block, vmbuffer,
> +										 VISIBILITYMAP_ALL_FROZEN);
> +		}
> +
FWIW, I don't think it's worth introducing visibilitymap_clear_extended.
As this is a 9.6-only patch, I think it's better to change
visibilitymap_clear's API.
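
As a rough illustration of the single-function shape suggested above, the following toy model clears a caller-chosen subset of per-page bits in a byte array. It is only a sketch under stated assumptions: toy_vm_clear, toy_vm_get, the TOY_* constants, and the two-bits-per-heap-block layout are hypothetical and say nothing about the signature the real function ends up with.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy constants; assumed layout of two status bits per heap block. */
#define TOY_VM_ALL_VISIBLE		0x01
#define TOY_VM_ALL_FROZEN		0x02
#define TOY_VM_VALID_BITS		0x03
#define TOY_BITS_PER_HEAPBLOCK	2
#define TOY_HEAPBLOCKS_PER_BYTE	4

/*
 * Clear the requested bits for one heap block.  Returns true if any of
 * them was set beforehand, so the caller can tell whether the map changed.
 */
static bool
toy_vm_clear(uint8_t *map, size_t map_len, uint32_t heap_blk, uint8_t flags)
{
	size_t		byte = heap_blk / TOY_HEAPBLOCKS_PER_BYTE;
	unsigned	shift = (heap_blk % TOY_HEAPBLOCKS_PER_BYTE) *
		TOY_BITS_PER_HEAPBLOCK;
	uint8_t		mask = (uint8_t) (flags << shift);

	assert(map != NULL && byte < map_len);
	assert(flags != 0 && (flags & ~TOY_VM_VALID_BITS) == 0);

	if ((map[byte] & mask) == 0)
		return false;			/* nothing to clear */

	map[byte] &= (uint8_t) ~mask;
	return true;
}

/* Read back the two status bits for one heap block. */
static uint8_t
toy_vm_get(const uint8_t *map, uint32_t heap_blk)
{
	size_t		byte = heap_blk / TOY_HEAPBLOCKS_PER_BYTE;
	unsigned	shift = (heap_blk % TOY_HEAPBLOCKS_PER_BYTE) *
		TOY_BITS_PER_HEAPBLOCK;

	return (uint8_t) ((map[byte] >> shift) & TOY_VM_VALID_BITS);
}
```

With a flags argument, a caller that only wants to drop the all-frozen bit passes TOY_VM_ALL_FROZEN, while existing callers that clear everything pass both bits, so no second entry point is needed.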
Unless somebody protests I'm planning to commit with those adjustments
tomorrow.
Greetings,
Andres Freund