Re: really lazy vacuums?

From: Cédric Villemain <cedric(dot)villemain(dot)debian(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Jim Nasby <jim(at)nasby(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: really lazy vacuums?
Date: 2011-03-17 13:26:25
Message-ID: AANLkTimjRG53GHs3XV3Ui_Z7EmkMbWzU7JurrLMw3SM8@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

2011/3/17 Robert Haas <robertmhaas(at)gmail(dot)com>:
> On Wed, Mar 16, 2011 at 6:36 PM, Jim Nasby <jim(at)nasby(dot)net> wrote:
>> One way to look at this is that any system will have a limit on how quickly it can vacuum everything. If it's having trouble dedicating enough IO to vacuum, then autovac is going to have a long list of tables that it wants to vacuum. When you're in that situation, you want to get to the next table that needs vacuuming as quickly as possible, so if you've run through the first heap scan and found only a limited number of dead tuples, it doesn't make sense to spend a bunch of time scanning indexes and making a second heap scan (though, IIRC the second scan doesn't hit the entire heap; it only hits the tuples that were remembered as being dead).
>
> I mostly agree with this, but you also can't postpone vacuuming
> indefinitely just because you're too busy; that's going to blow up in
> your face.
>
>> Of course, going along the lines of an autovac-based tuning mechanism, you have to question how a table would show up for autovac if there's not actually a number of dead tuples. One scenario is freezing (though I'm not sure if your super-lazy vacuum could freeze tuples or not). Another is inserts. That might become a big win; you might want to aggressively scan a table that gets data loaded into it in order to set hint/all visible bits.
>
> Right.  Really-lazy vacuum could freeze tuples.  Unlike regular
> vacuum, it can also sensibly be done incrementally.  One thing I was
> thinking about is counting the number of times that we fetched a tuple
> that was older than RecentGlobalXmin and had a committed xmin and an
> invalid xmax, but where the page was not PD_ALL_VISIBLE.  If that's
> happening a lot, it probably means that some vacuuming would speed
> things up, by getting those PD_ALL_VISIBLE bits set.  Perhaps you
> could work out some formula where you do a variable amount of
> super-lazy vacuuming depending on the number of such tuple fetches.
> The trick would be to avoid overdoing it (so that you swamp the I/O
> system) or underdoing it (so that the system never converges).  It
> would be really nice (for this and for other things) if we had some
> way of measuring the I/O saturation of the system, so that we could
> automatically adjust the aggressiveness of background processes
>

Yes. I am thinking of something like that (the IO saturation
measurement) to let the background writer try to work on hint bit when
it does not have so much to do, if IO ressources are ok.

>
> Note also that if and when we get index-only scans, making sure the
> PD_ALL_VISIBLE bits (and thus the visibility map bits) actually get
> set is going to be a lot more important.
>
>> From a manual standpoint, ISTM that super-lazy vac would be extremely useful for dealing with hint bits after a bulk insert to a table that also has some update activity. Using a regular vacuum in that case would result in a lot of extra work to deal with the small number of dead tuples.
>
> I can see that.
>
>> Perhaps it would be useful to write a script that analyzed the output of vacuum verbose looking for tables where a super-lazy vacuum would have made sense (assuming vacuum verbose provides the needed info). If we had such a script we could ask folks to run it and see how much super-lazy vacuuming would help in the real world.
>
> I'm a bit doubtful about this part.
>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>

--
Cédric Villemain               2ndQuadrant
http://2ndQuadrant.fr/     PostgreSQL : Expertise, Formation et Support

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2011-03-17 13:33:52 Re: Sync Rep and shutdown Re: Sync Rep v19
Previous Message Heikki Linnakangas 2011-03-17 12:24:43 Re: Sync Rep and shutdown Re: Sync Rep v19