Re: Combine Prune and Freeze records emitted by vacuum

From: Melanie Plageman <melanieplageman(at)gmail(dot)com>
To: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>, Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Geoghegan <pg(at)bowt(dot)ie>
Subject: Re: Combine Prune and Freeze records emitted by vacuum
Date: 2024-03-24 16:32:46
Message-ID: CAAKRu_ZCjjUJCJ35pRBMUQ_NOfvqweWY21R=FRxRsx6PCW46bQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Mar 21, 2024 at 9:28 AM Heikki Linnakangas <hlinnaka(at)iki(dot)fi> wrote:
>
> In heap_page_prune_and_freeze(), we now do some extra work on each live
> tuple, to set the all_visible_except_removable correctly. And also to
> update live_tuples, recently_dead_tuples and hastup. When we're not
> freezing, that's a waste of cycles, the caller doesn't care. I hope it's
> enough that it doesn't matter, but is it?

Last year on an early version of the patch set I did some pgbench
tpcb-like benchmarks -- since there is a lot of on-access pruning in
that workload -- and I don't remember it being a showstopper. The code
has changed a fair bit since then. However, I think it might be safer
to pass a flag "by_vacuum" to heap_page_prune_and_freeze() and skip
the rest of the loop after heap_prune_satisifies_vacuum() when
on-access pruning invokes it. I had avoided that because it felt ugly
and error-prone, however it addresses a few other of your points as
well.

For example, I think we should set a bit in the prune/freeze WAL
record's flags to indicate if pruning was done by vacuum or on access
(mentioned in another of your recent emails).

> The first commit (after the WAL format changes) changes the all-visible
> check to use GlobalVisTestIsRemovableXid. The commit message says that
> it's because we don't have 'cutoffs' available, but we only care about
> that when we're freezing, and when we're freezing, we actually do have
> 'cutoffs' in HeapPageFreeze. Using GlobalVisTestIsRemovableXid seems
> sensible anyway, because that's what we use in
> heap_prune_satisfies_vacuum() too, but just wanted to point that out.

Yes, this is true. If we skip this part of the loop when on-access
pruning invokes it, we can go back to using the OldestXmin. I have
done that as well as some other changes in the attached patch,
conflict_horizon_updates.diff. Note that this patch may not apply on
your latest patch as I wrote it on top of an older version. Switching
back to using OldestXmin for page visibility determination makes this
patch set more similar to master as well. We could keep the
alternative check (with GlobalVisState) to maintain the illusion that
callers passing by_vacuum as True can pass NULL for pagefrz, but I was
worried about the extra overhead.

It would be nice to pick a single reasonable visibility horizon (the
oldest running xid we compare things to) at the beginning of
heap_page_prune_and_freeze() and use it for determining if we can
remove tuples, if we can freeze tuples, and if the page is all
visible. It makes it confusing that we use OldestXmin for freezing and
setting the page visibility in the VM and GlobalVisState for
determining prunability. I think using GlobalVisState for freezing has
been discussed before and ruled out for a variety of reasons, and I
definitely don't suggest making that change in this patch set.

> The 'frz_conflict_horizon' stuff is still fuzzy to me. (Not necessarily
> these patches's fault). This at least is wrong, because Max(a, b)
> doesn't handle XID wraparound correctly:
>
> > if (do_freeze)
> > conflict_xid = Max(prstate.snapshotConflictHorizon,
> > presult->frz_conflict_horizon);
> > else
> > conflict_xid = prstate.snapshotConflictHorizon;
>
> Then there's this in lazy_scan_prune():
>
> > /* Using same cutoff when setting VM is now unnecessary */
> > if (presult.all_frozen)
> > presult.frz_conflict_horizon = InvalidTransactionId;
> This does the right thing in the end, but if all the tuples are frozen
> shouldn't frz_conflict_horizon already be InvalidTransactionId? The
> comment says it's "newest xmin on the page", and if everything was
> frozen, all xmins are FrozenTransactionId. In other words, that should
> be moved to heap_page_prune_and_freeze() so that it doesn't lie to its
> caller. Also, frz_conflict_horizon is only set correctly if
> 'all_frozen==true', would be good to mention that in the comments too.

Yes, this is a good point. I've spent some time swapping all of this
back into my head. I think we should change the names of all these
conflict horizon variables and introduce some local variables again.
In the attached patch, I've updated the name of the variable in
PruneFreezeResult to vm_conflict_horizon, as it is only used for
emitting a VM update record. Now, I don't set it until the end of
heap_page_prune_and_freeze(). It is only updated from
InvalidTransactionId if the page is not all frozen. As you say, if the
page is all frozen, there can be no conflict.

I've also changed PruneState->snapshotConflictHorizon to
PruneState->latest_xid_removed.

And I introduced the local variables visibility_cutoff_xid and
frz_conflict_horizon. I think it is important we distinguish between
the latest xid pruned, the latest xmin of tuples frozen, and the
latest xid of all live tuples on the page.

Though we end up using visibility_cutoff_xid as the freeze conflict
horizon if the page is all frozen, I think it is more clear to have
the three variables and name them what they are. Then, we calculate
the correct one for the combined WAL record before emitting it. I've
done that in the attached. I've tried to reduce the scope of the
variables as much as possible to keep it as clear as I could.

I think I've also fixed the issue with using Max() to compare
TransactionIds by using TransactionIdFollows() instead.

Note that I still don't think we have a resolution on what to
correctly update new_relfrozenxid and new_relminmxid to at the end
when presult->nfrozen == 0 and presult->all_frozen is true.

if (presult->nfrozen > 0)
{
presult->new_relfrozenxid = pagefrz->FreezePageRelfrozenXid;
presult->new_relminmxid = pagefrz->FreezePageRelminMxid;
}
else
{
presult->new_relfrozenxid = pagefrz->NoFreezePageRelfrozenXid;
presult->new_relminmxid = pagefrz->NoFreezePageRelminMxid;
}

- Melanie

Attachment Content-Type Size
conflict_horizon_updates.diff text/x-patch 12.7 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2024-03-24 16:53:36 Re: [PoC] Improve dead tuple storage for lazy vacuum
Previous Message Tristan Partin 2024-03-24 15:42:53 Re: make dist using git archive