Several recent postings appear to confirm that there is an issue with
the use of VACUUM or ANALYZE on empty tables. Specifically, if you
VACUUM or ANALYZE a table that is empty and then insert a large number
of rows into this table, you will experience very poor performance.
For example, in our testing, we suffered a 15X performance penalty when
inserting 35,000 rows into a table that had been VACUUM'd or
ANALYZE'd when empty. Also, in our testing, it didn't matter whether
you did a plain VACUUM or a VACUUM ANALYZE -- in both cases the
subsequent inserts were slow.
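For reference, the problem scenario can be reproduced with a sequence
like the following (the table name and row count are just for
illustration, and generate_series assumes a reasonably recent server;
any bulk-insert method shows the same effect):

```sql
CREATE TABLE test_table (id integer PRIMARY KEY, val text);

-- Analyze while the table is empty; the planner's statistics
-- now say the table has zero rows.
VACUUM ANALYZE test_table;

-- A subsequent bulk insert of ~35,000 rows runs far slower than
-- the same insert into a table never analyzed while empty.
INSERT INTO test_table
SELECT i, 'row ' || i FROM generate_series(1, 35000) AS i;
```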
In the short run, the work-around appears to be either to avoid using
these commands on empty tables or to keep some "dummy" rows in these
tables that don't get deleted (and use DELETE instead of TRUNCATE).
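The dummy-row workaround might look something like this (a sketch only;
the table name and the sentinel id value are arbitrary examples):

```sql
-- Keep one sentinel row so the statistics never describe a
-- truly empty table.
INSERT INTO my_table VALUES (0, 'sentinel');

-- When clearing the table, use DELETE so the sentinel survives;
-- TRUNCATE would remove it along with everything else.
DELETE FROM my_table WHERE id <> 0;
```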
However, in the long run, it would seem to make sense to address the
issue directly so DBAs and developers don't have to deal with it.
Several possible solutions come to mind, and I'm sure there are others.
1. Provide an option with ANALYZE to force it to work as if a table had
a minimum number of rows (e.g., ANALYZE MINIMUM 1000 would analyze
tables as if they all had at least 1000 rows).
2. Provide an option during table creation to state the minimum number
of rows to use for ANALYZE.
3. Just change ANALYZE to assume that all tables might have a reasonable
number of rows at some point even if they are empty now. (How much
performance is actually gained currently when ANALYZE updates the stats
for an empty table?)
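To make options 1 and 2 concrete, they might look something like the
following (purely hypothetical syntax -- neither form exists in
PostgreSQL today, and the names are only suggestions):

```sql
-- Option 1: a per-command minimum row assumption.
ANALYZE MINIMUM 1000 my_table;

-- Option 2: a per-table minimum declared at creation time.
CREATE TABLE my_table (id integer)
    WITH (analyze_min_rows = 1000);
```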
In any case, it is hard to see how the present behaviour can be seen as
desirable. It obviously causes problems at least for new Postgres
users, and we all hope there will be many more of these folks in the
future. Thanks for considering this.

Mark