Re: Why mark empty pages all visible?

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: pgsql-hackers(at)postgresql(dot)org, Robert Haas <robertmhaas(at)gmail(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Subject: Re: Why mark empty pages all visible?
Date: 2023-03-28 03:12:11
Message-ID: CAH2-WznMppTCOQNcTv-j9B=1s9Lwzd7Zod76=fsSrQS+tDvNQw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Mar 27, 2023 at 6:48 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
> It seems odd that we enter the page into the VM at this point. That means that
> use of that page will now require a bit more work (including
> RelationGetBufferForTuple() pinning it).

I think that it's fairly obvious that it's *not* odd at all. If it
didn't do this then the pages would have to be scanned by VACUUM.

You haven't said anything about the leading cause of marking empty
pages all-frozen, by far: lazy_vacuum_heap_page(). Should it also stop
marking empty pages all-frozen?

Actually it isn't technically an empty page according to
PageIsEmpty(), since I wrote PageTruncateLinePointerArray() in a way
that made it leave a heap page with at least a single LP_UNUSED item.
But it'll essentially leave behind an empty page in many cases. The
regression tests mark pages all-frozen in this path quite a bit more
often than any other path according to gcov.

> This got a bit stranger with 44fa84881fff, because now we add the page into
> the VM even if it currently is pinned:

> It seems quite odd to set a page to all visible that we could not currently
> get a cleanup lock on - obviously evidence of another backend trying to to do
> something with the page.

You can say the same thing about lazy_vacuum_heap_page(), too.
Including the part about cleanup locking.

> It seems pretty clear that we shouldn't enter a currently-in-use page into the
> VM or freespacemap. All that's going to do is to "disturb" the backend trying
> to use that page (by directing other backends to it) and to make its job more
> expensive.

I don't think that it's clear. What about the case where there is only
one tuple, on a page that we cannot cleanup lock? Where do you draw
the line?

--
Peter Geoghegan

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Langote 2023-03-28 03:29:23 Re: SQL/JSON revisited
Previous Message Hayato Kuroda (Fujitsu) 2023-03-28 03:04:47 RE: PGdoc: add missing ID attribute to create_subscription.sgml