Re: Autovacuum fails to keep visibility map up-to-date in mostly-insert-only-tables

From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Jim Nasby <Jim(dot)Nasby(at)BlueTreble(dot)com>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Alexey Bashtanov <bashtanov(at)imap(dot)cc>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Autovacuum fails to keep visibility map up-to-date in mostly-insert-only-tables
Date: 2014-10-19 16:41:34
Message-ID: 20141019164134.GK22660@awork2.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2014-10-18 21:36:48 -0500, Jim Nasby wrote:
> On 10/9/14, 4:19 PM, Andres Freund wrote:
> >On 2014-10-09 18:16:46 -0300, Alvaro Herrera wrote:
> >>>Andres Freund wrote:
> >>>> >On 2014-10-09 18:03:00 -0300, Alvaro Herrera wrote:
> >>>>> > >Bruce Momjian wrote:
> >>>>> > >
> >>>>>> > > >I agree this is a serious problem. We have discussed various options,
> >>>>>> > > >but have not decided on anything. The TODO list has:
> >>>>>> > > >
> >>>>>> > > > https://wiki.postgresql.org/wiki/Todo
> >>>>>> > > >
> >>>>>> > > > Improve setting of visibility map bits for read-only and insert-only
> >>>>>> > > > workloads
> >>>>>> > > >
> >>>>>> > > > http://www.postgresql.org/message-id/20130906001437.GA29264@momjian.us
> >>>>> > >
> >>>>> > >I hate to repeat myself, but I think autovacuum could be modified to run
> >>>>> > >actions other than vacuum and analyze. In this specific case we could
> >>>>> > >be running a table scan that checks only pages that don't have the
> >>>>> > >all-visible bit set, and see if it can be set.
> >>>> >
> >>>> >Isn't that*precisely* what a plain vacuum run does?
> >>>
> >>>Well, it also scans for dead tuples, removes them, and needs to go
> >>>through indexes to remove their references.

> >IIRC it doesn't do most of that if that there's no need. And if it's a
> >insert only table without rollbacks. I*do* think there's some
> >optimizations we could make in general.
>
> No, it always attempts dead tuple removal.

I said some steps, not all steps. Check it out:

/* If any tuples need to be deleted, perform final vacuum cycle */
/* XXX put a threshold on min number of tuples here? */
if (vacrelstats->num_dead_tuples > 0)
{
/* Log cleanup info before we touch indexes */
vacuum_log_cleanup_info(onerel, vacrelstats);

/* Remove index entries */
for (i = 0; i < nindexes; i++)
lazy_vacuum_index(Irel[i],
&indstats[i],
vacrelstats);
/* Remove tuples from heap */
lazy_vacuum_heap(onerel, vacrelstats);
vacrelstats->num_index_scans++;
}

There's rub here though. We unconditionally do:
/* Do post-vacuum cleanup and statistics update for each index */
for (i = 0; i < nindexes; i++)
lazy_cleanup_index(Irel[i], indstats[i], vacrelstats);

and that's not particularly cheap. Maybe we should make that conditional
when there's been no lazy_vacuum_index/heap calls at all?

> The "weird" part is that if it's not doing a freeze it will just punt
> on a page if it can't get the cleanup lock.

I don't think that's particularly wierd. Otherwise vacuum can get stuck
behind a single very hot page - leading to much, much more bloat.

> I have to believe that could seriously screw up autovacuum scheduling.

Why?

> Now that we have forks, I'm wondering if it would be best to come up
> with a per-page system that could be used to determine when a table
> needs background work to be done. The visibility map could serve a lot
> of this purpose, but I'm not sure if it would work for getting hint
> bits set in the background.

It would. Per definition, all tuples that are 'all visible' need to be
fully hint bitted.

> I think it would also be a win if we had a way to advance relfrozenxid
> and relminmxid. Perhaps something that simply remembered the last XID
> that touched each page...

Not sure what you're getting at here?

I think the big missing piece lest something like Heikki's xid lsn
ranges thing gets finished is a freeze map.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2014-10-19 16:50:30 Re: Autovacuum fails to keep visibility map up-to-date in mostly-insert-only-tables
Previous Message Tom Lane 2014-10-19 16:38:52 Re: Hide 'Execution time' in EXPLAIN (COSTS OFF)