Re: [HACKERS] Resurrecting per-page cleaner for btree

From: Greg Stark <gsstark(at)mit(dot)edu>
To: Jim Nasby <jnasby(at)pervasive(dot)com>
Cc: Hannu Krosing <hannu(at)skype(dot)net>, Martijn van Oosterhout <kleptog(at)svana(dot)org>, Greg Stark <gsstark(at)mit(dot)edu>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Csaba Nagy <nagy(at)ecircle-ag(dot)com>, postgres hackers <pgsql-hackers(at)postgresql(dot)org>, ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>, pgsql-patches(at)postgresql(dot)org, teramoto(dot)junji(at)lab(dot)ntt(dot)co(dot)jp
Subject: Re: [HACKERS] Resurrecting per-page cleaner for btree
Date: 2006-07-27 21:24:35
Message-ID: 87d5bq3fn0.fsf@stark.xeocode.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers pgsql-patches


Jim Nasby <jnasby(at)pervasive(dot)com> writes:

> Even if we stopped right there it would still be a huge win in many (most?)
> cases. How often do the indexes on a table comprise even 50% of the table's
> size?

I would say they're usually roughly comparable actually. It depends on how
wide your table is of course but the wider your table rows the more indexes
you're likely to have on the table too.

> Even in the 50% case, you've gone from 1.5X to .6X

Sure, and a 3x speedup is nothing to sneeze at, that would be a great
improvement to vacuum. But it's still just a linear speedup and doesn't
address the algorithmic problem.

The fundamental problem is we have a process that's O(m) where m is the total
space taken by a table and its indexes. The actual amount of space it has to
reclaim is n. Other than n<m there's basically no relationship between these
figures. As long as that's the case vacuum may as well be O(n^2) or O(n!).

We frequently assume -- and often it's a valid assumption -- that these
figures are roughly proportional. Hence all the talk about databases reaching
a "steady-state" where the amount of dead space is constantly being reclaimed
at more or less the same speed it's being generated. But there are also plenty
of use cases where a complete vacuum pass takes thousands of times longer than
the i/o it took to generate those dead tuples. Currently Postgres just isn't
that great a tool for those use cases.

Unfortunately while I'm convinced of the problem I'm equally unconvinced of
the solution. I tried to solve online index builds using retail index lookups
in a very similar way to what's being discussed here. And ran into the same
problems. I eventually decided that while it could be made to work that way it
would be far too much code, far too unsafe, and far too invasive in the index
access methods to be the right approach.

Our existing method works with minimal help from the index access methods
which allows for an enormous degree of freedom in the index design.. To be
able to support retail vacuum you would have to force index method
implementors to keep information in a way that allowed them to look up a
particular value/tid efficiently which would limit the kinds of indexes you
could support drastically.

--
greg

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2006-07-27 21:47:40 Re: [COMMITTERS] pgsql: another try at keeping AIX/ppc
Previous Message Martijn van Oosterhout 2006-07-27 21:23:09 Re: GUC with units, details

Browse pgsql-patches by date

  From Date Subject
Next Message Bruce Momjian 2006-07-27 22:02:27 Re: [HACKERS] extension for sql update
Previous Message Tom Lane 2006-07-27 20:29:53 Re: New shared memory hooks proposal (was Re: