I completely agree with all of the above. Therefore, the proposed mechanism may require larger changes (and not only to VACUUM).
I can offer the following solution.
For VACUUM, create a hash table.
While scanning the table, when VACUUM sees that a version (tuple1) has t_ctid set and pointing to the address of tuple2, it builds a structure recording the address of tuple1, tuple1.xid, the length of tuple1 (and whatever other information is needed), and puts that structure into the hash table keyed by the address of tuple2.
When VACUUM reaches tuple2, it looks up the address of tuple2 in the hash table; if an entry is found, it evaluates the link between the two versions and decides whether cleanup is possible.
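The two-phase lookup described above could be sketched roughly as follows. This is only an illustration of the idea, not PostgreSQL code: the struct names, the fixed-size open-addressing table, and the helper functions are all invented here (real backend code would use dynahash, ItemPointerData, etc.):

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

/* A tuple address: block number + line pointer offset (hypothetical,
 * standing in for PostgreSQL's ItemPointerData). */
typedef struct TupleAddr {
    uint32_t block;
    uint16_t offset;
} TupleAddr;

/* What we remember about the older version (tuple1) while waiting
 * for the scan to reach its successor (tuple2). */
typedef struct ChainEntry {
    int       used;        /* slot occupied?                        */
    TupleAddr key;         /* address of tuple2 (the successor)     */
    TupleAddr older;       /* address of tuple1 (the older version) */
    uint32_t  older_xid;   /* tuple1's xid                          */
    uint32_t  older_len;   /* tuple1's length                       */
} ChainEntry;

#define TABLE_SIZE 1024    /* power of two; toy capacity, no resizing */
static ChainEntry table_[TABLE_SIZE];

static uint32_t addr_hash(TupleAddr a)
{
    return (a.block * 2654435761u) ^ (a.offset * 40503u);
}

/* Phase 1: tuple1's t_ctid points at 'next'; remember tuple1 keyed
 * by the successor's address. */
static void chain_remember(TupleAddr next, TupleAddr older,
                           uint32_t xid, uint32_t len)
{
    uint32_t i = addr_hash(next) & (TABLE_SIZE - 1);
    while (table_[i].used)              /* linear probing */
        i = (i + 1) & (TABLE_SIZE - 1);
    table_[i].used = 1;
    table_[i].key = next;
    table_[i].older = older;
    table_[i].older_xid = xid;
    table_[i].older_len = len;
}

/* Phase 2: VACUUM reaches a tuple at 'addr'; check whether some
 * older version pointed here.  Returns the entry or NULL. */
static ChainEntry *chain_lookup(TupleAddr addr)
{
    uint32_t i = addr_hash(addr) & (TABLE_SIZE - 1);
    while (table_[i].used) {
        if (table_[i].key.block == addr.block &&
            table_[i].key.offset == addr.offset)
            return &table_[i];
        i = (i + 1) & (TABLE_SIZE - 1);
    }
    return NULL;
}
```

On finding an entry in phase 2, VACUUM would have both sides of the chain link available and could decide whether the older version is dead.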
 
 
regards


03.11.2019, 02:20, "Tomas Vondra" <tomas.vondra@2ndquadrant.com>:

On Sat, Nov 02, 2019 at 11:35:09PM +0300, Павел Ерёмин wrote:

   The proposed option is not much different from what it is now.
   We are not trying to save some space - we will reuse the existing space. We
   just work with 64-bit transaction counters. Correct me if I'm wrong - the
   address of the next version of the row is stored in the 6-byte field
   t_cid in the tuple header - which is not attached to the current page in
   any way - and can be stored anywhere in the table. Nothing changes.


I think you mean t_ctid, not t_cid (which is a 4-byte CommandId, not any
sort of item pointer).

I think this comment from htup_details.h explains the issue:

 * ... Beware however that VACUUM might
 * erase the pointed-to (newer) tuple before erasing the pointing (older)
 * tuple. Hence, when following a t_ctid link, it is necessary to check
 * to see if the referenced slot is empty or contains an unrelated tuple.
 * Check that the referenced tuple has XMIN equal to the referencing tuple's
 * XMAX to verify that it is actually the descendant version and not an
 * unrelated tuple stored into a slot recently freed by VACUUM. If either
 * check fails, one may assume that there is no live descendant version.
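The check the comment describes can be sketched as follows. This is a deliberately simplified illustration with invented types, not the actual heap-access code (which works with HeapTupleHeader and ItemPointer):

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

typedef uint32_t Xid;

/* Toy tuple header holding only the two fields the check needs. */
typedef struct MiniTuple {
    Xid xmin;   /* inserting transaction                     */
    Xid xmax;   /* deleting/updating transaction, 0 if none  */
} MiniTuple;

/* Follow a t_ctid link from 'cur' to the tuple found in the
 * referenced slot.  Returns the descendant version, or NULL if the
 * slot is empty or was reused for an unrelated tuple. */
static MiniTuple *follow_ctid(MiniTuple *cur, MiniTuple *pointed_to)
{
    if (pointed_to == NULL)             /* slot already emptied by VACUUM  */
        return NULL;
    if (pointed_to->xmin != cur->xmax)  /* slot reused by unrelated tuple  */
        return NULL;
    return pointed_to;                  /* genuine descendant version      */
}
```

The XMIN/XMAX comparison is what makes the link trustworthy: only the true descendant was inserted by the same transaction that updated (set the XMAX of) the older version.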

Now, imagine you have a tuple that gets updated repeatedly (say, 3x) and
each version gets to a different page. Say, pages #1, #2, #3. And then
VACUUM happens on one of the "middle" pages (this may happen when trying
to fit a new row onto a page to allow HOT, but it might happen even during
regular VACUUM).

So we started with 3 tuples on pages #1, #2, #3, but now we have this

  #1 - tuple exists, points to tuple on page #2
  #2 - tuple no longer exists, cleaned up by vacuum
  #3 - tuple exists

The scheme you proposed requires existence of all the tuples in the
chain to determine visibility. When tuple #2 no longer exists, it's
impossible to decide whether tuple on page #1 is visible or not.

This also significantly increases the amount of random I/O, pretty much
by a factor of 2, because whenever you look at a row, you also have to
look at the "next version", which may be on another page. That's pretty
bad, both for I/O and cache hit ratio. I don't think that's a reasonable
trade-off (at least compared to simply making the XIDs 64bit).


regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services