RE: [GENERAL] Benchmarks

From: "Culberson, Philip" <philip(dot)culberson(at)dat(dot)com>
To: "'Bruce Momjian'" <pgman(at)candle(dot)pha(dot)pa(dot)us>, Dustin Sallings <dustin(at)spy(dot)net>
Cc: The Hermit Hacker <scrappy(at)hub(dot)org>, pgsql-general(at)hub(dot)org
Subject: RE: [GENERAL] Benchmarks
Date: 2000-01-06 20:58:21
Message-ID: A95EFC3B707BD311986C00A0C9E95B6A04B3C4@datmail03.dat.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs pgsql-general

In his very insightful post last week, Mike Mascari pointed out that, on
tables with heavy insert/updates, it was much faster to drop the index,
vacuum analyze, and then rebuild the index. Maybe in vacuum there is a
specific inefficiency in what Mike coined "defragment"ing indexes.

[Snip]

8. Not running VACUUM - PostgreSQL won't use indexes, or won't optimize
correctly unless the record count and dispersion estimates are up-to-date.
People have reported problems with running vacuum while under heavy load. We
haven't seen it, but we run vacuum each night at 4:05 a.m. However, if you
perform a LARGE number of INSERTS/UPDATES, it is better for you to do the
following:

DROP INDEX index_on_heavilty_used_table;
VACUUM ANALYZE;
CREATE INDEX index_on_heavily_used_table;

Because VACUUM will sit there, and, row by row, essentially "defragment"
your indexes, which can take damn near forever for any number of updates or
deletes greater than, say, 30,000 rows.

[Snip]

-----Original Message-----
From: Bruce Momjian [mailto:pgman(at)candle(dot)pha(dot)pa(dot)us]
Sent: Thursday, January 06, 2000 10:14 AM
To: Dustin Sallings
Cc: The Hermit Hacker; pgsql-general(at)hub(dot)org
Subject: Re: [GENERAL] Benchmarks

> Untrue, vacuum is *extremely* important for updating statistics.
> If you have a lot of data in a table, and you have never vacuumed, you
> might as well not have any indices. It'd be nice if you could seperate
> the stat update from the storage reclaim. Actually, it'd be nice if you
> could reuse storage, so that an actual vacuum wouldn't be necessary unless
> you just wanted to free up disk space you might end up using again anyway.
>
> The vacuum also doesn't seem to be very efficient. In one of my
> databases, a vacuum could take in excess of 24 hours, while I've written a
> small SQL script that does a select rename and a insert into select from
> that will do the same job in about ten minutes. This is a database that
> cannot lock for more than a few minutes.

This is serious. Why would an INSERT / RENAME be so much faster. Are
we that bad with VACUUM?

--
Bruce Momjian | http://www.op.net/~candle
maillist(at)candle(dot)pha(dot)pa(dot)us | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026

************

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Alain TESIO 2000-01-07 02:32:42 Re: MySQL / PostgreSQL (was: Postgres object orientation)
Previous Message Dan Linderman 2000-01-06 19:35:48 TIMESTAMP('now') y2k bug

Browse pgsql-general by date

  From Date Subject
Next Message The Hermit Hacker 2000-01-06 21:30:14 Re: [GENERAL] New Search Engine ... UdmSearch
Previous Message Ed Loehr 2000-01-06 20:45:56 Re: [GENERAL] New Search Engine ... UdmSearch