I played with this tonight writing a small insert/update routine and
frequent vacuums. Here is what I came up with (PostgreSQL 7.2.1).
Once a table reaches a certain size, it will not auto shrink. This
behavior is common for performance reasons. The way to re-use space is
to run the vacuum which will mark deleted tuples as unused. Once your
table has unused tuples, inserts and updates will go into the unused
tuples areas before additional space is allocated. ( there may be some
additional restrictions on this--like does the new tuple fit and not
cross page boundaries--but I don't know enough to say exactly) .
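To see this in practice, a plain vacuum is enough to mark dead tuples as reusable; the table name below is hypothetical, and on 7.2 the on-disk footprint can be estimated from pg_class.relpages (8 kB pages):

```sql
-- Hypothetical table name: mark dead tuples as reusable.
-- A plain VACUUM does not take the exclusive lock that VACUUM FULL does.
VACUUM VERBOSE my_big_table;

-- Estimate on-disk size; relpages is refreshed by VACUUM/ANALYZE
-- (pages are 8 kB in a default build)
SELECT relname, relpages, relpages * 8192 AS approx_bytes
FROM pg_class
WHERE relname = 'my_big_table';
```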
In addition, max_fsm_pages has an impact on how many pages will be
available to be marked as re-usable. If you have a huge table and
changes are impacting more than the default 10,000 pages this is set to,
you will want to bump this number up. My problem was I saw my UnUsed
tuples always growing and not being re-used until I bumped this value
up. As I watched the vacuum verbose output each run, I noticed more
than 10k pages were in fact changing between vacuums.
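One way to run this check is to total the pages reported by each vacuum verbose run and compare the sum against max_fsm_pages. A rough sketch, assuming the 7.2-era "Removed N tuples in M pages" log format (the sample log lines here are made up, not real output):

```python
import re

def pages_changed(vacuum_log: str) -> int:
    """Sum the page counts from 'Removed N tuples in M pages' lines
    (log format assumed from 7.2-era VACUUM VERBOSE output)."""
    return sum(int(m.group(1))
               for m in re.finditer(r"[Rr]emoved \d+ tuples in (\d+) pages",
                                    vacuum_log))

# Illustrative log excerpt:
log = """--Relation big_table--
Removed 16207 tuples in 1192 pages.
--Relation other_table--
Removed 52000 tuples in 9600 pages.
"""
changed = pages_changed(log)
# The default max_fsm_pages in this era was 10,000; when more pages
# change between vacuums, some freed space is never tracked for re-use.
print(changed, changed > 10000)  # 10792 True
```

If the total regularly exceeds max_fsm_pages, raise the setting in postgresql.conf and restart the postmaster so the free space map can cover all the changed pages.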
Your table will grow to an optimal size until the amount of
inserts, updates, and deletes balances out. Of course, if you insert more
than you delete, your database will always grow. But, if the number of
tuples at some point becomes fixed in the table, and you vacuum enough
in between all the inserts and updates, your table size will stop growing.
If you are vacuuming 4 times a day, yet the number of tuples inserted
and updated exceeds the current UnUsed tuples, your table will continue
to grow. With the addition of the lazy vacuum, there should be no
problem increasing the frequency of your vacuums to accommodate the
excessive changes being done to the table. Get the changed tuples
marked unused as often as you can so as not to create a new tuple with
an update or insert.
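For example, vacuum frequency could be raised with a cron schedule like the following (the database and table names are assumptions; adjust the interval to match your update rate):

```
# crontab entry: plain (lazy) VACUUM every 6 hours on the hot table;
# unlike VACUUM FULL, this does not lock out readers and writers
0 */6 * * *  psql -d mydb -c 'VACUUM ANALYZE my_big_table;'
```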
Just how big are your tables getting? How often are you vacuuming now,
and how many tuples are marked as UnUsed and how many tuples are
changing between vacuums?
I have a large table with 20+ million records which flattened out around
3.5 gigs. It only really gets updated once each day with a vacuum
following. Since I add to the table daily, I expect some growth. I
noticed the problem with this table growing to be 2x that size when the
unused tuples would not go down, but kept growing each day. After
changing max_fsm_pages like I mentioned above, the unused tuples stay
fairly fixed and the database only grows a little each day as expected.
Hope this helps get you out of trouble.
Marcos Garcia wrote:
>Since this conversation had started, the presented solutions for the
>disk space (disk space of tables and indexes) problem were:
> - reindex -> no space recovered
> - frequent vacuums -> some space recovered, meanwhile the database keeps
>growing
> - vacuum full -> some space recovered, meanwhile the database keeps
>growing, quite similar to simple vacuum. we have also to keep in mind,
>that the option "full", makes locks to the tables. Therefore, the
>services around the database locks too, and with the growing of the
>database the time spent for "vacuum full" increases, as well as the
>downtime of the services around the database.
>So, my doubts are:
> - There's nothing we can do to avoid the growth of the database, only
>slow down that growth.
> - We, application developers that use postgresql databases, have to
>live with this problem.
>Sorry if I'm being a little rude, but I'm in real trouble.
>Thanks in advance,
>On Wed, 2002-07-24 at 15:04, Robert Treat wrote:
>>This is before my morning coffee, so someone please correct me if I am
>>wrong, but that should be the amount of space in your table that is
>>available for reuse before new pages need to be added. Whether it is
>>actually used is determined by a combination of factors including the
>>amount of insert/update activity on your system and the size of your
>>free space map. If you want to recover all of that space, you'll need to
>>do a vacuum full. As it looks now you could probably start vacuuming
>>this table more frequently to keep this number from growing so quickly.
>>On Wed, 2002-07-24 at 07:14, Gaetano Mendola wrote:
>>>"Michael G. Martin" <michael(at)vpmonline(dot)com> wrote:
>>>>Check this value in the postgresql.con file:
>>>>max_fsm_pages = 100000
>>>>I had the same problem with the db growing, index no longer being used,
>>>>despite vacuums each night. Somewhere, there is a thread on this.
>>>>Anyway, if you look at the vacuum stats each time you run vacuum, look
>>>>to see how many pages are being updated between vacuums--I looked at the
>>>>removed x tuples in y pages value. Then, set this value to be greater
>>>>than the number of pages changed between vacuums. If more pages are
>>>>being updated between vacuums than what max_fsm_pages allows, the extra
>>>>pages won't be marked to be re-used--from what I understand. This then
>>>>results in the db growing and the optimizer starts to choose full table
>>>>scans since the db spans so many pages on the disk--at least this is
>>>>what happened in my db.
>>>Can you explain to me this line that I obtain in the log
>>>after a vacuum analyze ?
>>> --Relation ua_user_data_exp--
>>>2002-07-21 05:00:02  DEBUG: Pages 1402: Changed 2, reaped 1192,
>>>Empty 0, New 0; Tup 4277: Vac 16207, Keep/VTL 0/0, Crash 0, UnUsed 1, MinLen
>>>393, MaxLen 680; Re-using: Free/Avail. Space 9148944/9141356;
>>>EndEmpty/Avail. Pages 0/1191. CPU 0.00s/0.03u sec.
>>>I'm wondering about "Re-using: Free/Avail. Space 9148944/9141356"