Re: Single pass vacuum - take 1

From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Single pass vacuum - take 1
Date: 2011-07-14 15:46:55
Message-ID: CA+U5nMJ0tNQETUMNkw5GH0iLNJK13ZnRrzRuspsF5v8GnuPh+A@mail.gmail.com
Lists: pgsql-hackers

On Tue, Jul 12, 2011 at 9:47 PM, Pavan Deolasee
<pavan(dot)deolasee(at)gmail(dot)com> wrote:

> http://archives.postgresql.org/pgsql-hackers/2011-05/msg01119.php
> PFA a patch which implements the idea with some variation.
> At the start of the first pass, we remember the current LSN. Every page that
> needs some work is HOT-pruned so that dead tuples are truncated to dead line
> pointers. We collect those dead line pointers and mark them as
> dead-vacuumed. Since we don't have any LP flag bits available, we instead
> just use the LP_DEAD flag along with offset value 1 to mark the line pointer
> as dead-vacuumed. The page is defragmented and we store the LSN remembered
> at the start of the pass in the page special area as vacuum LSN. We also
> update the free space at that point because we are not going to do a second
> pass on the page anymore.
>
> Once we collect all dead line pointers and mark them as dead-vacuumed, we
> clean-up the indexes and remove all index pointers pointing to those
> dead-vacuumed line pointers. If the index vacuum finishes successfully, we
> store the LSN in the pg_class row of the table (needs catalog changes). At
> that point, we are certain that there are no index pointers pointing to
> dead-vacuumed line pointers and they can be reclaimed at the next
> opportunity.
>
> During normal operations or subsequent vacuum, if the page is chosen for
> HOT-pruning, we check if it has any dead-vacuumed line pointers and if the
> vacuum LSN stored on the page special area is equal to the one stored in the
> pg_class row, and reclaim those dead-vacuumed line pointers (the index
> pointers to these line pointers are already taken care of). If the pg_class
> LSN is not the same, the last vacuum probably did not finish completely and
> we collect the dead-vacuumed line pointers just like other dead line pointers
> and try to clean up the index pointers as usual.
>
> I ran a few pgbench tests with the patch. I don't see much difference in the
> overall tps, but the vacuum time for the accounts table reduces by nearly
> 50%. Nor do I see much difference in the overall bloat, but then pgbench
> uses HOT very nicely and the accounts table got only a couple of vacuum cycles
> in my 7-8 hour run.
>
> There are a couple of things that probably need more attention. I am not sure
> if we need to teach ANALYZE to treat dead line pointers differently. Since
> they take up much less space than a dead tuple, they should definitely have
> a lower weight, but at the same time, we need to take into account the
> number of indexes on the table. The start of first pass LSN that we are
> remembering is in fact the start of the WAL page and I think there could be
> some issues with that, especially for very tiny tables. For example, first
> vacuum may run completely. If another vacuum is started on the same table
> and say it gets the same LSN (because we did not write more than 1 page
> worth of WAL in between) and if the second vacuum aborts after it cleaned up
> a few pages, we might get into some trouble. The likelihood of such things
> happening is very small, but maybe it's worth taking care of. Maybe we
> can get the exact current LSN and not store it in the pg_class if we don't
> do anything during the cycle.
> Comments ?

Hi Pavan,

I'd say that seems way too complex for such a small use case and we've
only just fixed the bugs from 8.4 vacuum map complexity. The code's
looking very robust now and I'm uneasy that such changes are really
worth it.

You're trying to avoid Phase 3, the second pass on the heap. Why not
avoid the write in Phase 1 if it's clear that we'll need to come back
again in Phase 3? So we either do a write in Phase 1 or in Phase 3,
but never both? That minimises the writes, which are what hurt the
most.
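
A minimal sketch of that per-page choice, for illustration only. None of
these names exist in PostgreSQL: PageScanResult, PageWritePlan and
plan_page_write are hypothetical, and the rule used to decide that "we'll
need to come back" (the page holds at least one dead item that still has
index pointers) is an assumption, not something stated in this thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical summary of what the first heap pass found on one page. */
typedef struct
{
    int prunable_tuples;      /* removable by pruning alone, no index entries */
    int indexed_dead_tuples;  /* dead items that indexes may still point to */
} PageScanResult;

/* Hypothetical outcome: in which phase, if any, the page gets dirtied. */
typedef enum
{
    NO_WRITE,
    WRITE_IN_PHASE1,
    WRITE_IN_PHASE3
} PageWritePlan;

/*
 * Dirty each page at most once per vacuum cycle. If the page must be
 * revisited after index cleanup anyway, defer all work on it to Phase 3;
 * otherwise finish it in Phase 1.
 */
static PageWritePlan
plan_page_write(const PageScanResult *r)
{
    assert(r != NULL);

    if (r->indexed_dead_tuples > 0)
        return WRITE_IN_PHASE3;   /* assumed rule: a revisit is certain */
    if (r->prunable_tuples > 0)
        return WRITE_IN_PHASE1;   /* no revisit needed, clean up now */
    return NO_WRITE;              /* nothing to do on this page */
}
```

Under this assumed rule a page carrying both kinds of dead items is written
only in Phase 3, so its pruning is delayed until index cleanup has finished.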

We can reduce the overall cost simply by not doing Phase 2 and Phase 3
if the number of rows to remove is too few, say < 1%.
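
The threshold test could look roughly like the sketch below. The function
name, the constant and the use of row counts as inputs are all hypothetical;
the 1% figure is only the example value given above, and how the counts
would be obtained is left open.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical cutoff: run Phases 2 and 3 only at 1% dead rows or more. */
#define MIN_DEAD_PERCENT 1

/*
 * Return true if index cleanup (Phase 2) and the second heap pass
 * (Phase 3) look worth running, false if too few rows would be removed.
 * Integer arithmetic avoids floating-point rounding at the boundary.
 */
static bool
should_run_phases_2_and_3(uint64_t dead_rows, uint64_t total_rows)
{
    if (dead_rows == 0 || total_rows == 0)
        return false;   /* nothing to remove */

    /* dead_rows / total_rows >= MIN_DEAD_PERCENT / 100 */
    return dead_rows * 100 >= total_rows * MIN_DEAD_PERCENT;
}
```

With these assumptions, 9 dead rows out of 1000 would skip both phases,
while 10 out of 1000 would run them.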

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
