Re: 64-bit XIDs in deleted nbtree pages

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Subject: Re: 64-bit XIDs in deleted nbtree pages
Date: 2021-03-02 04:06:02
Message-ID: CAD21AoCL9DZkUEOt4a98+QPG5GV=Kbqm-nu2D3yoOw5H=pe4AQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Mar 2, 2021 at 6:40 AM Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
>
> On Sun, Feb 28, 2021 at 8:08 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> > > Even though "the letter of the law" favors removing the
> > > vacuum_cleanup_index_scale_factor GUC + param in the way I have
> > > outlined, that is not the only thing that matters -- we must also
> > > consider "the spirit of the law".
>
> > > I suppose I could ask Tom what he thinks?
> >
> > +1
>
> Are you going to start a new thread, or should I?

Ok, I'll start a new thread soon.

>
> > Since it seems not a bug I personally think we don't need to do
> > anything for back branches. But if we want not to trigger an index
> > scan by vacuum_cleanup_index_scale_factor, we could change the default
> > value to a high value (say, to 10000) so that it can skip an index
> > scan in most cases.
>
> One reason to remove vacuum_cleanup_index_scale_factor in the back
> branches is that it removes any need to fix the
> "IndexVacuumInfo.num_heap_tuples is inaccurate outside of
> btvacuumcleanup-only VACUUMs" bug -- it just won't matter if
> btm_last_cleanup_num_heap_tuples is inaccurate anymore. (I am still
> not sure about backpatch being a good idea, though.)

I think that removing vacuum_cleanup_index_scale_factor in the back
branches would affect the existing installation much. It would be
better to have btree indexes not use this parameter while not changing
the contents of meta page. That is, just remove the check related to
vacuum_cleanup_index_scale_factor from _bt_vacuum_needs_cleanup(). And
I personally prefer to fix the "IndexVacuumInfo.num_heap_tuples is
inaccurate outside of btvacuumcleanup-only VACUUMs" bug separately.

>
> > > Another new concern for me (another concern unique to Postgres 13) is
> > > autovacuum_vacuum_insert_scale_factor-driven autovacuums.
> >
> > IIUC the purpose of autovacuum_vacuum_insert_scale_factor is
> > visibility map maintenance. And as per this discussion, it seems not
> > necessary to do an index scan in btvacuumcleanup() triggered by
> > autovacuum_vacuum_insert_scale_factor.
>
> Arguably the question of skipping scanning the index should have been
> considered by the autovacuum_vacuum_insert_scale_factor patch when it
> was committed for Postgres 13 -- but it wasn't. There is a regression
> that was tied to autovacuum_vacuum_insert_scale_factor in Postgres 13
> by Mark Callaghan, which I suspect is relevant:
>
> https://smalldatum.blogspot.com/2021/01/insert-benchmark-postgres-is-still.html
>
> The blog post says: "Updates - To understand the small regression
> mentioned above for the l.i1 test (more CPU & write IO) I repeated the
> test with 100M rows using 2 configurations: one disabled index
> deduplication and the other disabled insert-triggered autovacuum.
> Disabling index deduplication had no effect and disabling
> insert-triggered autovacuum resolves the regression."
>
> This is quite specifically with an insert-only workload, with 4
> indexes (that's from memory, but I'm pretty sure it's 4). I think that
> the failure to account for skipping index scans is probably the big
> problem here. Scanning the heap to set VM bits is unlikely to be
> expensive compared to the full index scans. An insert-only workload is
> going to find most of the heap blocks it scans to set VM bits in
> shared_buffers. Not so for the indexes.
>
> So in Postgres 13 we have this autovacuum_vacuum_insert_scale_factor
> issue, in addition to the deduplication + btvacuumcleanup issue we
> talked about (the problems left by my Postgres 13 bug fix commit
> 48e12913). These two issues make removing
> vacuum_cleanup_index_scale_factor tempting, even in the back branches
> -- it might actually be the more conservative approach, at least for
> Postgres 13.

Yeah, this argument makes sense to me. The default values of
autovacuum_vacuum_insert_scale_factor/threshold are 0.2 and 1000
respectively whereas one of vacuum_cleanup_index_scale_factor is 0.1.
It means that in insert-only workload with default settings,
autovacuums triggered by autovacuum_vacuum_insert_scale_factor always
scan the all btree index to update the index statistics. I think most
users would not expect this behavior. As I mentioned above, I think we
can have nbtree not use this parameter or increase the default value
of vacuum_cleanup_index_scale_factor in back branches.

Regards,

--
Masahiko Sawada
EDB: https://www.enterprisedb.com/

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Thomas Munro 2021-03-02 04:28:32 Re: Fix DROP TABLESPACE on Windows with ProcSignalBarrier?
Previous Message Amit Kapila 2021-03-02 04:03:37 Re: repeated decoding of prepared transactions