Re: Why mark empty pages all visible?

From: Andres Freund <andres(at)anarazel(dot)de>
To: Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: pgsql-hackers(at)postgresql(dot)org, Robert Haas <robertmhaas(at)gmail(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Subject: Re: Why mark empty pages all visible?
Date: 2023-03-28 19:01:38
Message-ID: 20230328190138.rlb4xwi5qfvcvf66@awork3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2023-03-27 21:51:09 -0700, Peter Geoghegan wrote:
> On Mon, Mar 27, 2023 at 9:32 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
> > > You haven't said anything about the leading cause of marking empty
> > > pages all-frozen, by far: lazy_vacuum_heap_page(). Should it also stop
> > > marking empty pages all-frozen?
> >
> > It's not obvious that it should - but it's not as clear a case as the
> > ConditionalLockBufferForCleanup() -> lazy_scan_new_or_empty() one. In the
> > latter, we know
> > a) that we don't have to do any work to be able to advance the horizon
> > b) we know that somebody else has the page pinned
> >
> > What's the point in marking it all-visible at that point? In quite likely be
> > from RelationGetBufferForTuple() having extended the relation and then briefly
> > needed to release the lock (to acquire the lock on otherBuffer or in
> > GetVisibilityMapPins()).
>
> I think that there is significant value in avoiding special cases, on
> general principle. If we stopped doing this in
> lazy_scan_new_or_empty() we'd be inventing a new thing that cleanup
> locks are supposed to protect against. Maybe something like that would
> make sense, but if so then make that argument, and make it explicitly
> represented in the code.

I will probably make that argument - so far I was just trying to understand
the intent of the current code. There aren't really comments explaining why we
want to mark currently-pinned empty/new pages all-visible and enter them into
the FSM.

Historically we did *not* enter currently pinned empty/new pages into the FSM
/ VM. Afaics that's new as of 44fa84881fff.

The reason I'm looking at this is that there's a lot of complexity at the
bottom of RelationGetBufferForTuple(), related to needing to release the lock
on the newly extended page and then needing to recheck whether there still is
free space on the page. And that it's not complicated enough
(c.f. INSERT_FROZEN doing visibilitymap_pin() with the page locked).

As far as I can tell, if we went back to not entering new/empty pages into
either VM or FSM, we could rely on the page not getting filled while just
pinning, not locking it.

> > I don't follow. In the ConditionalLockBufferForCleanup() ->
> > lazy_scan_new_or_empty() case we are dealing with an new or empty
> > page. Whereas lazy_vacuum_heap_page() deals with a page that definitely
> > has dead tuples on it. How are those two cases comparable?
>
> It doesn't have dead tuples anymore, though.
>
> ISTM that there is an issue here with the definition of an empty page.
> You're concerned about PageIsEmpty() pages.

And not just any PageIsEmpty() page, ones that are currently pinned.

I also do wonder whether the different behaviour of PageIsEmpty() and
PageIsNew() pages makes sense. The justification for not marking PageIsNew()
pages as all-visible holds just as true for empty pages. There's just as much
free space there.

Greetings,

Andres Freund

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Pavel Stehule 2023-03-28 19:03:47 Re: Schema variables - new implementation for Postgres 15
Previous Message Gregory Stark (as CFM) 2023-03-28 18:58:25 Re: [PATCH]Feature improvement for MERGE tab completion