Re: [PATCH] Add memory usage reporting to VACUUM VERBOSE

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: 河田達也 <kawatatatsuya0913(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: [PATCH] Add memory usage reporting to VACUUM VERBOSE
Date: 2025-11-20 00:19:44
Message-ID: CAD21AoD4VYa60i1jF2ndBMc5g+NjNnkQXS4zRDf23hzCkT4fZw@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Nov 19, 2025 at 9:31 AM 河田達也 <kawatatatsuya0913(at)gmail(dot)com> wrote:
>
> Hi Sawada-san,
>
> Thank you very much for your detailed review and suggestions for the v2 patch.
> Also, thank you for pointing out the top-posting issue. I'll follow the mailing list style from now on.

+1

>
> ---
> >+ /* Report memory usage for dead_items tracking */
> >+ vac_work_mem = AmAutoVacuumWorkerProcess() &&
> >+ autovacuum_work_mem != -1 ?
> >+ autovacuum_work_mem : maintenance_work_mem;
> >
> >We can use vacrel->dead_items_info->max_bytes instead of calculating it again.
>
> I changed it to retrieve the value from max_bytes.
> However, I encountered a segmentation fault during parallel VACUUM, so I modified the code to store the value before resetting it.
> I would appreciate any suggestions if there is a better approach.

Right. But the dead_items_max_bytes is used in the same function
heap_vacuum_rel(), so we don't need to have it in LVRelStats.

> ---
>
> >How about the following message style?
> >
> >memory usage: total 0.69 MB used across 1 index scans (max 64.00 MB at once)
>
> I would like to confirm one point:
> Does “1 index scans” refer to (1) the number of reset cycles during the VACUUM process, or (2) the number of indexes attached to the target relation?
> In my understanding, the latter seems more useful to users, so in the v3 patch I implemented it as the number of vacuumed indexes.
> If this interpretation is not correct, I will adjust it accordingly.

I meant (1) actually. I don't think the number of indexes attached to
the target relation is not relevant with the memory usage.

We could add a new counter to track how many times we had to reset the
dead items storage because it was full. The existing num_index_scans
counter only shows how many times we performed index vacuuming. This
means that when index vacuuming is disabled, we still collect dead
items but num_index_scans shows 0. Adding this new counter would help
users understand how often the dead items storage reaches capacity,
even when index vacuuming is turned off.

>
> ---
> >+/*
> >+ * Add current memory usage to the running total before resetting.
> >+ * This tracks cumulative memory across all index vacuum cycles.
> >+ */
> >+ if (vacrel->dead_items != NULL)
> >+ {
> >+ Size final_bytes = TidStoreMemoryUsage(vacrel->dead_items);
> >+
> >+ vacrel->total_dead_items_bytes += final_bytes;
> >+ }
> >
> >The comments need to be indented. I'd recommend running
> >src/tools/pgindent/pgindent to format the codes before creating the
> >patch.
> >
> >The above change doesn't cover some cases where index vacuuming is
> >disabled (see the first if statement in the same function). It happens
> >for example when the user specified INDEX_CLEANUP=off, we used the
> >bypassing index vacuuming optimization, or failsafe was dynamically
> >triggered. The proposed change covers the bypassing optimization but
> >doesn't for the other two cases, which seems inconsistent and needs to
> >be avoided. Another case we need to consider is where the table
> >doesn't have an index, where we don't collect dead items to the
> >TidStore. In this case, we always report that 0 bytes are used. Do we
> >want to report the memory usage also in this case?
>
> As for tracking memory usage, I separated the internal accounting from what is ultimately reported to the user,
> and therefore the implementation is slightly redundant, recording the memory usage at several decision points.

How about updating total bytes in dead_items_reset()?

>
> For the display behavior:
> Even with INDEX_CLEANUP = off, or when do_index_vacuuming = false due to failsafe, memory usage may not be strictly zero.
> Therefore, I believe it is still appropriate to report the memory usage in these cases.
>
> On the other hand, when the relation has no indexes at all, no memory is allocated for dead_items, so printing a memory usage line seems unnecessary.
> In the v3 patch, I adjusted the code so that no message is emitted in this specific case.

I guess that it doesn't necessary omit such information even though we
always show 0 bytes used.Explicitly showing 0 bytes used could help
users to understand the vacuum behavior.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Fujii Masao 2025-11-20 00:22:39 Re: regarding statistics retaining with 18 Upgrade
Previous Message Fujii Masao 2025-11-20 00:15:59 Re: [PATCH] Add hints for invalid binary encoding names in encode/decode functions