Re: index bloat

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "David Esposito" <pgsql-general(at)esposito(dot)newnetco(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: index bloat
Date: 2005-07-13 19:44:57
Message-ID: 20976.1121283897@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

"David Esposito" <pgsql-general(at)esposito(dot)newnetco(dot)com> writes:
> Hmm, if I keep running the following query while the test program is going
> (giving it a few iterations to rest between executions), the steady-state
> usage of the indexes seems to go up ... it doesn't happen every time you run
> the query, but if you do it 10 times, it seems to go up at least once every
> few times you run it .. And the usage keeps charging upwards long after the
> UPDATE query finishes (at least 3 or 4 iterations afterwards until it levels
> off again) ... It would seem like the steady-state should be reached after
> the first couple of runs and then never creep up any further because there
> should be enough slack in the index, right?

> UPDATE bigboy SET creation_date = CURRENT_TIMESTAMP
> WHERE creation_date BETWEEN CURRENT_TIMESTAMP - INTERVAL '15 seconds'
> AND CURRENT_TIMESTAMP - INTERVAL '5 seconds';

Hmm, this is preferentially touching stuff near the right end of the
index, ie, it's going to bloat the pages associated with higher keys.
As I understand your usage of these indexes, pages generally only get
reclaimed off the left end (as records get old enough to be moved to
archival storage). So if you were to repeat this test for long enough
for the data to cycle all the way through the table and out again
(50 days in your real-world usage) then the extra space would be
evenly distributed and the usage would reach a steady state.

The testing I've been doing so far involves UPDATEs that touch a
uniformly distributed subset of the table --- maybe that's the aspect
that is failing to match your reality. Do you mostly update
recently-added rows? Can you quantify the effect at all?

> Is there any way to disassemble an index (either through some fancy SQL
> query or by running the actual physical file through a tool) to get an idea
> on where the slack could be accumulating?

No such code exists AFAIK, though the idea is sounding pretty attractive
at the moment ;-). You could get some crude numbers by adding debug
printouts to btbulkdelete() ...

regards, tom lane

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Joshua D. Drake 2005-07-13 19:46:32 Re: Transparent encryption in PostgreSQL?
Previous Message Matt McNeil 2005-07-13 19:21:40 Transparent encryption in PostgreSQL?