Re: Rename dead_tuples to dead_items in vacuumlazy.c

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Rename dead_tuples to dead_items in vacuumlazy.c
Date: 2021-12-01 05:34:39
Message-ID: CAD21AoCYdn9n+-ZBD_WyJq-4Ws=E6rErcNDsB_1DTfDJ-DzwDw@mail.gmail.com
Lists: pgsql-hackers

On Wed, Dec 1, 2021 at 4:42 AM Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
>
> On Mon, Nov 29, 2021 at 7:00 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> > Thanks! I'll change my parallel vacuum refactoring patch accordingly.
>
> Thanks again for working on that.
>
> > Regarding the commit, I think that there is still one place in
> > vacuumlazy.c where we can change "dead tuples" to "dead items":
> >
> > /*
> > * Allocate the space for dead tuples. Note that this handles parallel
> > * VACUUM initialization as part of allocating shared memory space used
> > * for dead_items.
> > */
> > dead_items_alloc(vacrel, params->nworkers);
> > dead_items = vacrel->dead_items;
>
> Oops. Pushed a fixup for that just now.

Thanks!

>
> > Also, the commit changes neither PROGRESS_VACUUM_MAX_DEAD_TUPLES
> > nor PROGRESS_VACUUM_NUM_DEAD_TUPLES. Did you leave them on purpose?
>
> That was deliberate.
>
> It would be a bit strange to alter these constants without also
> updating the corresponding column names for the
> pg_stat_progress_vacuum system view. But if I kept the definition from
> system_views.sql in sync, then I would break user scripts -- for
> reasons that users don't care about. That didn't seem like the right
> approach.

Agreed.

>
> Also, the system as a whole still assumes "DEAD tuples and LP_DEAD
> items are the same, and are just as much of a problem in the table as
> they are in each index". As you know, this is not really true, which
> is an important problem for us. Fixing it (perhaps as part of adding
> something like Robert's conveyor belt design) will likely require
> revising this model quite fundamentally (e.g., the vacthresh
> calculation in autovacuum.c:relation_needs_vacanalyze() would be
> replaced). When this happens, we'll probably need to update system
> views that have columns with names like "dead_tuples" -- because maybe
> we no longer specifically count dead items/tuples at all. I strongly
> suspect that the approach to statistics that we take for pg_statistic
> optimizer stats just doesn't work for dead items/tuples -- statistical
> sampling only produces useful statistics for the optimizer because
> certain delicate assumptions are met (even these assumptions only
> really work with a properly normalized database schema).
>
> Maybe revising the model used for autovacuum scheduling wouldn't
> include changing pg_stat_progress_vacuum, since that isn't technically
> "part of the model" --- I'm not sure. But it's not something that I am
> in a hurry to fix.

Understood.
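
For anyone following the thread, the vacthresh calculation mentioned
above can be sketched roughly as below. This is a simplified,
standalone illustration rather than the actual code in autovacuum.c:
the struct and function names are hypothetical, and the
insert-threshold and wraparound (forced vacuum) checks are left out.
It only shows the "base threshold plus scale factor times reltuples"
model that treats the dead tuple count as the trigger.

```c
#include <stdbool.h>

/*
 * Hypothetical parameter struct for illustration only. The fields
 * correspond to the autovacuum_vacuum_threshold and
 * autovacuum_vacuum_scale_factor settings.
 */
typedef struct VacThresholdParams
{
    int     vac_base_thresh;    /* minimum number of dead tuples */
    double  vac_scale_factor;   /* fraction of the table's tuples */
} VacThresholdParams;

/* Threshold grows linearly with the estimated table size (reltuples). */
static double
compute_vacthresh(const VacThresholdParams *p, double reltuples)
{
    return (double) p->vac_base_thresh + p->vac_scale_factor * reltuples;
}

/*
 * Simplified decision: vacuum when the counted dead tuples exceed the
 * threshold. The real function also considers inserted tuples and
 * transaction ID wraparound, which are omitted here.
 */
static bool
needs_vacuum(const VacThresholdParams *p, double reltuples,
             double dead_tuples)
{
    return dead_tuples > compute_vacthresh(p, reltuples);
}
```

With a base threshold of 50 and a scale factor of 0.2, a table with
1000 tuples would cross the threshold once more than 250 dead tuples
have been counted.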

Regards,

--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
