Re: Rename dead_tuples to dead_items in vacuumlazy.c

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Rename dead_tuples to dead_items in vacuumlazy.c
Date: 2021-11-24 17:06:58
Message-ID: CAH2-WzkjZOfAa3d0fjN7EhtL1WShd9naX9Eb4_XGF-p_OsgUsQ@mail.gmail.com
Lists: pgsql-hackers

On Wed, Nov 24, 2021 at 7:16 AM Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> wrote:
> Sorry to reply to myself, but I realized that I forgot to return to the
> main point of this thread. If we agree that "an LP_DEAD item pointer
> does not point to any item" (an assertion that gives a precise meaning
> to both those terms), then a patch that renames "tuples" to "items" is
> not doing anything useful IMO, because those two terms are synonyms.

TIDs (ItemPointerData structs) are of course not the same thing as
line pointers (ItemIdData structs). There is a tendency to refer to
the latter as "item pointers" all the same, which was confusing. I
personally corrected/normalized this in commit ae7291ac in 2019. I
think that it's worth being careful about precisely because they're
closely related (but distinct) concepts. And so FWIW "LP_DEAD item
pointer" is not a thing. I agree that an LP_DEAD line pointer has no
tuple storage, and so you could say that it points to nothing (though
only in heapam). I probably would just say that it has no tuple
storage, though.
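
To make the TID versus line pointer distinction concrete, here is a minimal sketch in C. It assumes simplified stand-ins for the real structs: SketchItemId is modeled on ItemIdData (the lp_off, lp_flags and lp_len bitfields), and SketchTid is modeled on ItemPointerData but stores the block number as a single integer rather than the real split representation. All Sketch/sketch_ names are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Line pointer states, using the same names and values as the LP_* flags. */
#define LP_UNUSED   0   /* free slot, available for reuse */
#define LP_NORMAL   1   /* in use, has tuple storage */
#define LP_REDIRECT 2   /* HOT redirect, no tuple storage */
#define LP_DEAD     3   /* dead stub, no tuple storage in heapam */

/* Simplified line pointer: one slot in a heap page's line pointer array. */
typedef struct SketchItemId
{
    unsigned lp_off:15;   /* byte offset of tuple storage within the page */
    unsigned lp_flags:2;  /* one of the LP_* states above */
    unsigned lp_len:15;   /* byte length of tuple storage, 0 if none */
} SketchItemId;

/* Simplified TID: identifies a line pointer slot from outside the page. */
typedef struct SketchTid
{
    uint32_t block;       /* heap block number */
    uint16_t offset;      /* 1-based slot number in that block */
} SketchTid;

/* A line pointer has tuple storage only when its length is nonzero. */
static inline bool
sketch_has_storage(const SketchItemId *lp)
{
    return lp->lp_len != 0;
}

/* Turn a line pointer into an LP_DEAD stub, discarding its tuple storage. */
static inline void
sketch_set_dead(SketchItemId *lp)
{
    lp->lp_flags = LP_DEAD;
    lp->lp_off = 0;
    lp->lp_len = 0;
    assert(!sketch_has_storage(lp));
}
```

In this sketch a SketchTid is what an index would store, while a SketchItemId lives inside the heap page. Calling sketch_set_dead() on a slot leaves the slot in place, so TIDs that name it remain valid, but nothing is left for it to point to.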

> Now maybe Peter doesn't agree with the definitions I suggest, in which
> case I would like to know what his definitions are.

I agree with others that the term "item" is vague, but I don't think
that that's necessarily a bad thing here -- I deliberately changed the
comments to say either "TIDs" or "LP_DEAD items", emphasizing whatever
the important aspect seemed to be in each context (they're LP_DEAD
items to the heap structure, TIDs to index structures).

I'm not attached to the term "item". To me the truly important point
is what these items are *not*: they're not tuples. The renaming is
intended to enforce the concepts that I went into at the end of the
commit message for commit 8523492d. Now the pruning steps in
lazy_scan_prune always avoid keeping around a DEAD tuple with tuple
storage on return to lazy_scan_heap (only LP_DEAD items can remain),
since (as of that commit) lazy_scan_prune alone is responsible for
things involving the "logical database".

This means that index vacuuming and heap vacuuming can now be thought
of as removing garbage items from physical data structures (they're
purely "physical database" concepts), and nothing else. They don't
need recovery conflicts. How could they? Where are you supposed to get
the XIDs for that from, when you've only got LP_DEAD items?
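
The "physical only" view above can be illustrated with a rough sketch, assuming a sorted array of dead TIDs is the sole input to both index vacuuming and heap vacuuming. The types and the sketch_ functions are hypothetical simplifications, not the actual vacuumlazy.c code. Note that no XID appears anywhere in these structures.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define LP_UNUSED 0
#define LP_NORMAL 1
#define LP_DEAD   3

/* Simplified TID (block number plus 1-based slot number). */
typedef struct SketchTid
{
    uint32_t block;
    uint16_t offset;
} SketchTid;

/* Simplified line pointer, modeled on the ItemIdData bitfields. */
typedef struct SketchItemId
{
    unsigned lp_off:15;
    unsigned lp_flags:2;
    unsigned lp_len:15;
} SketchItemId;

/* Order TIDs by block, then by offset, so the array can be searched. */
static int
sketch_tid_cmp(const void *a, const void *b)
{
    const SketchTid *l = a;
    const SketchTid *r = b;

    if (l->block != r->block)
        return (l->block < r->block) ? -1 : 1;
    if (l->offset != r->offset)
        return (l->offset < r->offset) ? -1 : 1;
    return 0;
}

/* Index side: does this index tuple's heap TID appear in dead_items? */
static inline bool
sketch_tid_is_dead(const SketchTid *tid,
                   const SketchTid *dead_items, size_t ndead)
{
    return bsearch(tid, dead_items, ndead,
                   sizeof(SketchTid), sketch_tid_cmp) != NULL;
}

/* Heap side: mark the LP_DEAD slots of one block as LP_UNUSED. */
static inline size_t
sketch_reclaim_block(SketchItemId *lps, size_t nlps, uint32_t block,
                     const SketchTid *dead_items, size_t ndead)
{
    size_t reclaimed = 0;

    for (size_t i = 0; i < ndead; i++)
    {
        size_t idx;

        if (dead_items[i].block != block)
            continue;
        idx = (size_t) dead_items[i].offset - 1;   /* offsets are 1-based */
        if (idx < nlps && lps[idx].lp_flags == LP_DEAD)
        {
            lps[idx].lp_flags = LP_UNUSED;
            lps[idx].lp_off = 0;
            lps[idx].lp_len = 0;
            reclaimed++;
        }
    }
    return reclaimed;
}
```

Both functions consume nothing but TIDs and line pointer flags. Under these assumptions there is no tuple header left to read an XID from, which is the sense in which these steps have nothing to base a recovery conflict on.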

This is also related to the idea that pruning by VACUUM isn't
necessarily all that special compared to earlier pruning or concurrent
opportunistic pruning. As I go into on the other recent thread on
removing special cases in vacuumlazy.c, ISTM that we ought to do
everything except pruning itself (and freezing tuples, which
effectively depends on pruning) without even acquiring a cleanup lock.
Which is actually quite a lot of things.

--
Peter Geoghegan
