Re: Vacuum statistics

From: Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>
To: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Jim Nasby <jnasby(at)upgrade(dot)com>, Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>, Ilia Evdokimov <ilya(dot)evdokimov(at)tantorlabs(dot)com>, Kirill Reshke <reshkekirill(at)gmail(dot)com>, Andrei Zubkov <zubkov(at)moonset(dot)ru>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, jian he <jian(dot)universality(at)gmail(dot)com>, a(dot)lepikhov(at)postgrespro(dot)ru, Sami Imseih <samimseih(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>
Subject: Re: Vacuum statistics
Date: 2025-06-02 19:56:41
Message-ID: 18169b68-5b10-40fd-9657-be04f2bd0161@postgrespro.ru
Lists: pgsql-hackers

On 02.06.2025 19:25, Alexander Korotkov wrote:
> On Tue, May 13, 2025 at 12:49 PM Alena Rybakina
> <a(dot)rybakina(at)postgrespro(dot)ru> wrote:
>> On 12.05.2025 08:30, Amit Kapila wrote:
>>> On Fri, May 9, 2025 at 5:34 PM Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru> wrote:
>>>> I did a rebase and finished the part with storing statistics separately from the relation statistics - now it is possible to disable the collection of statistics for relations using GUCs, and
>>>> this allows us to solve the problem with the memory consumed.
>>>>
>>> I think this patch is trying to collect data similar to what we do for
>>> pg_stat_statements for SQL statements. So, can't we follow a similar
>>> idea such that these additional statistics will be collected once some
>>> external module like pg_stat_statements is enabled? That module should
>>> be responsible for accumulating and resetting the data, so we won't
>>> have this memory consumption issue.
>> The idea is good. It will require one hook for the pgstat_report_vacuum
>> function; the extvac_stats_start and extvac_stats_end functions can
>> simply run when the extension is loaded, so no further hooks are needed.
> +1
> Nice idea of a hook. Given the volume of the patch, it might be a
> good idea to keep this as an extension.
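
For readers following the thread, the hook idea quoted above could be
sketched roughly as below. This is only an illustration of the usual
"NULL function pointer unless an extension installs one" pattern. Every
name in it (DemoVacStats, demo_vacuum_report_hook, demo_report_vacuum,
demo_ext_init, and so on) is hypothetical and is not taken from the
patches or from PostgreSQL itself.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the per-vacuum statistics. */
typedef struct DemoVacStats
{
    long pages_removed;
    long tuples_deleted;
} DemoVacStats;

/* Hypothetical hook type: called once per reported vacuum. */
typedef void (*demo_vacuum_report_hook_type) (unsigned int relid,
                                              const DemoVacStats *stats);

/* Core side: the hook stays NULL unless an extension installs one. */
static demo_vacuum_report_hook_type demo_vacuum_report_hook = NULL;

/* Core side: stand-in for the reporting function that owns the hook. */
static void
demo_report_vacuum(unsigned int relid, const DemoVacStats *stats)
{
    assert(stats != NULL);

    /* ... regular (cheap) relation statistics would be updated here ... */

    /* Extended statistics are handed over only if someone is listening. */
    if (demo_vacuum_report_hook != NULL)
        demo_vacuum_report_hook(relid, stats);
}

/* Extension side: the accumulated data lives in the extension. */
static long demo_ext_total_tuples_deleted = 0;
static demo_vacuum_report_hook_type demo_prev_hook = NULL;

static void
demo_ext_collect(unsigned int relid, const DemoVacStats *stats)
{
    /* Chain to a previously installed hook, if any. */
    if (demo_prev_hook != NULL)
        demo_prev_hook(relid, stats);

    demo_ext_total_tuples_deleted += stats->tuples_deleted;
}

/* Extension side: what a load-time init function might do. */
static void
demo_ext_init(void)
{
    demo_prev_hook = demo_vacuum_report_hook;
    demo_vacuum_report_hook = demo_ext_collect;
}
```

With this shape, calling demo_report_vacuum() before demo_ext_init()
accumulates nothing, so the memory for extended statistics is only paid
for when the extension is loaded.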

Today, I finalized the vacuum statistics separation approach and
refactored the vacuum statistics structures (patch 4).

I also reworked the table statistics to avoid mixing index statistics in
parallel vacuum mode (patch 2).

The new approach excludes index buffer usage and WAL statistics from the
table’s statistics.
For timing, the time spent vacuuming indexes is subtracted from the
table’s total vacuum time. If vacuuming is sequential, that amount is the
sum of the individual index vacuum times. If vacuuming is parallel, the
index vacuum phase is measured as a whole and subtracted as one interval.

static void
accumulate_idxs_vacuum_statistics(LVRelState *vacrel, ExtVacReport *extVacIdxStats)
{
    if (!pgstat_track_vacuum_statistics)
        return;

    /* Accumulate index extended stats so they can be excluded from the heap's */
    vacrel->extVacReportIdx.blk_read_time += extVacIdxStats->blk_read_time;
    vacrel->extVacReportIdx.blk_write_time += extVacIdxStats->blk_write_time;
    vacrel->extVacReportIdx.total_blks_dirtied += extVacIdxStats->total_blks_dirtied;
    vacrel->extVacReportIdx.total_blks_hit += extVacIdxStats->total_blks_hit;
    vacrel->extVacReportIdx.total_blks_read += extVacIdxStats->total_blks_read;
    vacrel->extVacReportIdx.total_blks_written += extVacIdxStats->total_blks_written;
    vacrel->extVacReportIdx.wal_bytes += extVacIdxStats->wal_bytes;
    vacrel->extVacReportIdx.wal_fpi += extVacIdxStats->wal_fpi;
    vacrel->extVacReportIdx.wal_records += extVacIdxStats->wal_records;
    vacrel->extVacReportIdx.delay_time += extVacIdxStats->delay_time;

    vacrel->extVacReportIdx.total_time += extVacIdxStats->total_time;
}

if (ParallelVacuumIsActive(vacrel))
{
    LVExtStatCounters counters;
    ExtVacReport extVacReport;

    memset(&extVacReport, 0, sizeof(ExtVacReport));

    extvac_stats_start(vacrel->rel, &counters);

    /* Outsource everything to parallel variant */
    parallel_vacuum_bulkdel_all_indexes(vacrel->pvs, old_live_tuples,
                                        vacrel->num_index_scans);

    extvac_stats_end(vacrel->rel, &counters, &extVacReport);
    accumulate_idxs_vacuum_statistics(vacrel, &extVacReport);
}

The database-level statistics are currently reported incorrectly; I'm
investigating the issue.

In parallel, I'm starting work on the extension.

--
Regards,
Alena Rybakina
Postgres Professional

Attachment Content-Type Size
0001-Machinery-for-grabbing-an-extended-vacuum-statistics.patch text/x-patch 71.3 KB
0002-Machinery-for-grabbing-an-extended-vacuum-statistics.patch text/x-patch 55.2 KB
0003-Machinery-for-grabbing-an-extended-vacuum-statistics.patch text/x-patch 31.2 KB
0004-Vacuum-statistics-have-been-separated-from-regular-r.patch text/x-patch 63.6 KB
0005-Add-documentation-about-the-system-views-that-are-us.patch text/x-patch 24.5 KB
