Hot Standby b-tree delete records review

From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Hot Standby b-tree delete records review
Date: 2010-04-22 07:24:26
Message-ID: 4BCFF9AA.6000703@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

btree_redo:
> case XLOG_BTREE_DELETE:
>
> /*
> * Btree delete records can conflict with standby queries. You
> * might think that vacuum records would conflict as well, but
> * we've handled that already. XLOG_HEAP2_CLEANUP_INFO records
> * provide the highest xid cleaned by the vacuum of the heap
> * and so we can resolve any conflicts just once when that
> * arrives. After that any we know that no conflicts exist
> * from individual btree vacuum records on that index.
> */
> {
> TransactionId latestRemovedXid = btree_xlog_delete_get_latestRemovedXid(record);
> xl_btree_delete *xlrec = (xl_btree_delete *) XLogRecGetData(record);
>
> /*
> * XXX Currently we put everybody on death row, because
> * currently _bt_delitems() supplies InvalidTransactionId.
> * This can be fairly painful, so providing a better value
> * here is worth some thought and possibly some effort to
> * improve.
> */
> ResolveRecoveryConflictWithSnapshot(latestRemovedXid, xlrec->node);
> }
> break;

The XXX comment is out-of-date, the latestRemovedXid value is calculated
by btree_xlog_delete_get_latestRemovedXid() nowadays.

If we're re-replaying the WAL record, for example after restarting the
standby server, btree_xlog_delete_get_latestRemovedXid() won't find the
deleted records and will return InvalidTransactionId. That's OK, because
until we reach again the point in WAL where we were before the restart,
we don't accept read-only connections so there's no-one to kill anyway,
but you do get a useless "Invalid latestRemovedXid reported, using
latestCompletedXid instead" message in the log (that shouldn't be
capitalized, BTW).

It would be nice to check if there's any potentially conflicting
read-only queries before calling
btree_xlog_delete_get_latestRemovedXid(), which can be quite expensive.

If the "Invalid latestRemovedXid reported, using latestCompletedXid
instead" message is going to happen commonly, I think it should be
downgraded to DEBUG1. If it's an unexpected scenario, it should be
upgraded to WARNING.

In btree_xlog_delete_get_latestRemovedXid:
> Assert(num_unused == 0);

Can't that happen as well in a re-replay scenario, if a heap item was
vacuumed away later on?

> /*
> * Note that if all heap tuples were LP_DEAD then we will be
> * returning InvalidTransactionId here. This seems very unlikely
> * in practice.
> */

If none of the removed heap tuples were present anymore, we currently
return InvalidTransactionId, which kills/waits out all read-only
queries. But if none of the tuples were present anymore, the read-only
queries wouldn't have seen them anyway, so ISTM that we should treat
InvalidTransactionId return value as "we don't need to kill anyone".

Why does btree_xlog_delete_get_latestRemovedXid() keep the
num_unused/num_dead/num_redirect counts, it doesn't actually do anything
with them.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Heikki Linnakangas 2010-04-22 07:53:50 Re: Assertion failure twophase.c (3) (testing HS/SR)
Previous Message Simon Riggs 2010-04-22 06:56:39 Re: testing HS/SR - 1 vs 2 performance