Re: GiST VACUUM

From: Andrey Borodin <x4mmm(at)yandex-team(dot)ru>
To: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Cc: Michael Paquier <michael(at)paquier(dot)xyz>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Костя Кузнецов <chapaev28(at)ya(dot)ru>
Subject: Re: GiST VACUUM
Date: 2019-04-05 05:39:19
Message-ID: 4E1402A8-C05E-4087-B419-03FF5DDA4FB2@yandex-team.ru
Views: Raw Message | Whole Thread | Download mbox
Thread:
Lists: pgsql-hackers

Hi!

> 4 апр. 2019 г., в 20:15, Heikki Linnakangas <hlinnaka(at)iki(dot)fi> написал(а):
>
> On 25/03/2019 15:20, Heikki Linnakangas wrote:
>> On 24/03/2019 18:50, Andrey Borodin wrote:
>>> I was working on new version of gist check in amcheck and understand one more thing:
>>>
>>> /* Can this page be recycled yet? */
>>> bool
>>> gistPageRecyclable(Page page)
>>> {
>>> return PageIsNew(page) ||
>>> (GistPageIsDeleted(page) &&
>>> TransactionIdPrecedes(GistPageGetDeleteXid(page), RecentGlobalXmin));
>>> }
>>>
>>> Here RecentGlobalXmin can wraparound and page will become unrecyclable for half of xid cycle. Can we prevent it by resetting PageDeleteXid to InvalidTransactionId before doing RecordFreeIndexPage()?
>>> (Seems like same applies to GIN...)
>> True, and B-tree has the same issue. I thought I saw a comment somewhere
>> in the B-tree code about that earlier, but now I can't find it. I
>> must've imagined it.
>> We could reset it, but that would require dirtying the page. That would
>> be just extra I/O overhead, if the page gets reused before XID
>> wraparound. We could avoid that if we stored the full XID+epoch, not
>> just XID. I think we should do that in GiST, at least, where this is
>> new. In the B-tree, it would require some extra code to deal with
>> backwards-compatibility, but maybe it would be worth it even there.
>
> I suggest that we do the attached. It fixes this for GiST. The patch changes expands the "deletion XID" to 64-bits, and changes where it's stored. Instead of storing it pd_prune_xid, it's stored in the page contents. Luckily, a deleted page has no real content.

So, we store full xid right after page header?
+static inline void
+GistPageSetDeleteXid(Page page, FullTransactionId deletexid)
+{
+ Assert(PageIsEmpty(page));
+ ((PageHeader) page)->pd_lower = MAXALIGN(SizeOfPageHeaderData) + sizeof(FullTransactionId);
+
+ *((FullTransactionId *) PageGetContents(page)) = deletexid;
+}

Usually we leave one ItemId (located at invalid offset number) untouched. I do not know is it done for a reason or not....

Also, I did not understand this optimization:
+ /*
+ * We can skip this if the page was deleted so long ago, that no scan can possibly
+ * still see it, even in a standby. One measure might be anything older than the
+ * table's frozen-xid, but we don't have that at hand here. But anything older than
+ * 2 billion, from the next XID, is surely old enough, because you would hit XID
+ * wraparound at that point.
+ */
+ nextxid = ReadNextFullTransactionId();
+ diff = U64FromFullTransactionId(nextxid) - U64FromFullTransactionId(latestRemovedXid);
+ if (diff < 0x7fffffff)
+ return;

Standby can be lagging months from primary, and, theoretically, close the gap in one sudden WAL leap... Also, I think, that comparison sign should be >, not <.

Best regards, Andrey Borodin.

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2019-04-05 05:42:41 Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits
Previous Message Darafei Komяpa Praliaskouski 2019-04-05 05:38:34 Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits