Re: index bloat

From: "David Esposito" <pgsql-general(at)esposito(dot)newnetco(dot)com>
To: "'Tom Lane'" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: <pgsql-general(at)postgresql(dot)org>
Subject: Re: index bloat
Date: 2005-07-13 13:50:46
Message-ID: 200507131350.j6DDoloC019269@relay1.nnco.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

First, thank you for spending so much time on this issue

Second, I think I might have found a good lead ... I replicated the test you
described below (minus the updating of 10% of the records) ... I've attached
the PHP script (I'm more proficient at writing PHP than a shell script; you
should be able to run it from the command line ('php -q bloat.test.php') as
long as you've created a db named 'test' first)

You're right that the index behavior is well-behaved with the cycle of
INSERT / DELETE / VACUUM ... But while it was running, I started a second
session to the database after the 60th iteration and did

BEGIN;
SELECT COUNT(*) FROM bigboy;
ROLLBACK;

During my transaction, I saw the relpages charge upwards steadily until I
issued the ROLLBACK .. but even after the ROLLBACK (and even after closing
the second DB connection), the pages weren't reclaimed on the next VACUUM

This sounds exactly like what may be happening to us ... This is a 24x7 app
so during the course of the VACUUM there are probably some open transactions
... Over the past 4 or 5 weeks, that could explain the steady unlimited
growth ...

Another funny thing to note: I was able to cause this same behavior if I did
the following:

psql template1
BEGIN;
SELECT COUNT(*) FROM pg_database;
ROLLBACK;
\q

FYI, I'm using the 8.0.1 RPM build for RHEL3 (2PGDG)

-dave

> -----Original Message-----
> From: Tom Lane [mailto:tgl(at)sss(dot)pgh(dot)pa(dot)us]
> Sent: Tuesday, July 12, 2005 2:57 PM
> To: David Esposito
> Cc: pgsql-general(at)postgresql(dot)org
> Subject: Re: FW: [GENERAL] index bloat
>
> "David Esposito" <pgsql-general(at)esposito(dot)newnetco(dot)com> writes:
> >> BTW, the tail of the VACUUM VERBOSE output ought to have
> >> something about
> >> overall usage of the FSM --- what does that look like?
>
> > INFO: free space map: 528 relations, 172357 pages stored;
> 170096 total
> > pages needed
> > DETAIL: Allocated FSM size: 10000 relations + 1000000
> pages = 6511 kB
> > shared memory.
>
> OK, so you are definitely not running out of FSM slots...
>
> I spent some time this morning trying to reproduce the bloating
> behavior, without any success. I made a table with a plain "serial
> primary key" column, and ran a test program that did
>
> insert 10000 rows
> update about 10% of the rows at random
> if more than 500000 rows, delete the oldest 10000
> vacuum
> repeat
>
> which is intended to emulate your daily cycle with about one-tenth
> as much data (just to keep the runtime reasonable). I didn't see
> any bloat at all: the index growth looked like
>
> INFO: index "t_pkey" now contains 450000 row versions in 1374 pages
> INFO: index "t_pkey" now contains 460000 row versions in 1404 pages
> INFO: index "t_pkey" now contains 470000 row versions in 1435 pages
> INFO: index "t_pkey" now contains 480000 row versions in 1465 pages
> INFO: index "t_pkey" now contains 490000 row versions in 1496 pages
> INFO: index "t_pkey" now contains 500000 row versions in 1527 pages
> INFO: index "t_pkey" now contains 500000 row versions in 1557 pages
> INFO: index "t_pkey" now contains 500000 row versions in 1588 pages
> INFO: index "t_pkey" now contains 500000 row versions in 1588 pages
> INFO: index "t_pkey" now contains 500000 row versions in 1589 pages
> INFO: index "t_pkey" now contains 500000 row versions in 1589 pages
> INFO: index "t_pkey" now contains 500000 row versions in 1589 pages
> INFO: index "t_pkey" now contains 500000 row versions in 1589 pages
> INFO: index "t_pkey" now contains 500000 row versions in 1589 pages
> INFO: index "t_pkey" now contains 500000 row versions in 1589 pages
> INFO: index "t_pkey" now contains 500000 row versions in 1589 pages
> INFO: index "t_pkey" now contains 500000 row versions in 1589 pages
> INFO: index "t_pkey" now contains 500000 row versions in 1590 pages
> INFO: index "t_pkey" now contains 500000 row versions in 1590 pages
> INFO: index "t_pkey" now contains 500000 row versions in 1590 pages
> INFO: index "t_pkey" now contains 500000 row versions in 1590 pages
> INFO: index "t_pkey" now contains 500000 row versions in 1590 pages
> INFO: index "t_pkey" now contains 500000 row versions in 1590 pages
>
> and it never grew any larger than that even in several hundred "days".
>
> This test was against CVS tip, but I'm pretty certain the relevant
> algorithms were all the same in 7.4. So there is some important
> aspect in which this test does not replicate the conditions your
> index is seeing. Can you think of any way that I've missed capturing
> your usage pattern?
>
> regards, tom lane
>

Attachment Content-Type Size
bloat.test.php application/octet-stream 2.4 KB

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Tino Wildenhain 2005-07-13 13:53:57 Re: To Postgres or not
Previous Message Ron Mayer 2005-07-13 13:50:23 Re: Converting MySQL tinyint to PostgreSQL