Re: Combine Prune and Freeze records emitted by vacuum

From: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To: Melanie Plageman <melanieplageman(at)gmail(dot)com>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>, Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Geoghegan <pg(at)bowt(dot)ie>
Subject: Re: Combine Prune and Freeze records emitted by vacuum
Date: 2024-03-22 23:09:30
Message-ID: 85b6968b-49f2-4243-86a6-9b4116e6439b@iki.fi
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 20/03/2024 21:17, Melanie Plageman wrote:
> Attached patch changes-for-0001.patch has a bunch of updated comments --
> especially for heapam_xlog.h (including my promised diagram). And a few
> suggestions (mostly changes that I should have made before).

New version of these WAL format changes attached. Squashed to one patch now.

> + // TODO: should we avoid this if we only froze? heap_xlog_freeze() doesn't
> + // do it

Ah yes, that makes sense, did that.

> In the final commit message, I think it is worth calling out that all of
> those freeze logging functions heap_log_freeze_eq/cmp/etc are lifted and
> shifted from one file to another. When I am reviewing a big diff, it is
> always helpful to know where I need to review line-by-line.

Done.

>> From 5d6fc2ffbdd839e0b69242af16446a46cf6a2dc7 Mon Sep 17 00:00:00 2001
>> From: Heikki Linnakangas <heikki(dot)linnakangas(at)iki(dot)fi>
>> Date: Wed, 20 Mar 2024 13:49:59 +0200
>> Subject: [PATCH v5 04/26] 'nplans' is a pointer
>>
>> I'm surprised the compiler didn't warn about this
>
> oops. while looking at this, I noticed that the asserts I added that
> nredirected > 0, ndead > 0, and nunused > 0 have the same problem.

Good catch, fixed.

>> - remove xlhp_conflict_horizon and store a TransactionId directly. A
>> separate struct would make sense if we needed to store anything else
>> there, but for now it just seems like more code.
>
> makes sense. I just want to make sure heapam_xlog.h makes it extra clear
> that this is happening. I see your comment addition. If you combine it
> with my comment additions in the attached patch for 0001, hopefully that
> makes it clear enough.

Thanks. I spent more time on the comments throughout the patch. And one
notable code change: I replaced the XLHP_LP_TRUNCATE_ONLY flag with
XLHP_CLEANUP_LOCK. XLHP_CLEANUP_LOCK directly indicates if you need a
cleanup lock to replay the record. It must always be set when
XLHP_HAS_REDIRECTIONS or XLHP_HAS_DEAD_ITEMS is set, because replaying
those always needs a cleanup lock. That felt easier to document and
understand than XLHP_LP_TRUNCATE_ONLY.

>> From 8af186ee9dd8c7dc20f37a69b34cab7b95faa43b Mon Sep 17 00:00:00 2001
>> From: Heikki Linnakangas <heikki(dot)linnakangas(at)iki(dot)fi>
>> Date: Wed, 20 Mar 2024 14:03:06 +0200
>> Subject: [PATCH v5 07/26] Add comment to log_heap_prune_and_freeze().
>>
>> XXX: This should be rewritten, but I tried to at least list some
>> important points.
>
> Are you thinking that it needs to mention more things or that the things
> it mentions need more detail?

I previously just quickly jotted down things that seemed worth
mentioning in the comment. It was not so bad actually, but I reworded it
a little.

>> From b26e36ba8614d907a6e15810ed4f684f8f628dd2 Mon Sep 17 00:00:00 2001
>> From: Heikki Linnakangas <heikki(dot)linnakangas(at)iki(dot)fi>
>> Date: Wed, 20 Mar 2024 14:53:31 +0200
>> Subject: [PATCH v5 08/26] minor refactoring in log_heap_prune_and_freeze()
>>
>> Mostly to make local variables more tightly-scoped.
>
> So, I don't think you can move those sub-records into the tighter scope.
> If you run tests with this applied, you'll see it crashes and a lot of
> the data in the record is wrong. If you move the sub-record declarations
> out to a wider scope, the tests pass.
>
> The WAL record data isn't actually copied into the buffer until
>
> recptr = XLogInsert(RM_HEAP2_ID, XLOG_HEAP2_PRUNE_FREEZE);
>
> after registering everything.
> So all of those sub-records you made are out of scope the time it tries
> to copy from them.
>
> I spent the better part of a day last week trying to figure out what was
> happening after I did the exact same thing. I must say that I found the
> xloginsert API incredibly unhelpful on this point.

Oops. I had that in mind and that was actually why I moved the
XLogRegisterData() call to the end of the function, because I found it
confusing to register the struct before filling it in completely, even
though it works perfectly fine. But then I missed it anyway when I moved
the local variables. I added a brief comment on that.

> I would like to review the rest of the suggested changes in this patch
> after we fix the issue I just mentioned.

Thanks, review is appreciated. I feel this is ready now, so barring any
big new issues, I plan to commit this early next week.

There is another patch in the commitfest that touches this area:
"Recording whether Heap2/PRUNE records are from VACUUM or from
opportunistic pruning" [1]. That actually goes in the opposite direction
than this patch. That patch wants to add more information, to show
whether a record was emitted by VACUUM or on-access pruning, while this
patch makes the freezing and VACUUM's 2nd phase records also look the
same. We could easily add more flags to xl_heap_prune to distinguish
them. Or assign different xl_info code for them, like that other patch
proposed. But I don't think that needs to block this patch, that can be
added as a separate patch.

[1]
https://www.postgresql.org/message-id/CAH2-Wzmsevhox+HJpFmQgCxWWDgNzP0R9F+VBnpOK6TgxMPrRw@mail.gmail.com

--
Heikki Linnakangas
Neon (https://neon.tech)

Attachment Content-Type Size
v6-0001-Merge-prune-freeze-and-vacuum-WAL-record-formats.patch text/x-patch 55.6 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2024-03-22 23:34:15 Re: session username in default psql prompt?
Previous Message Andrew Dunstan 2024-03-22 21:34:18 Re: session username in default psql prompt?