Re: Free-space-map management thoughts

From: Stephen Marshall <smarshall(at)wsi(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: Free-space-map management thoughts
Date: 2003-02-27 22:10:01
Message-ID: 3E5E8CB9.3030608@wsi.com
Lists: pgsql-hackers

Tom Lane wrote:

>Stephen Marshall <smarshall(at)wsi(dot)com> writes:
>
>
>>2. The histogram concept is a neat idea, but I think some reorganization
>>of the page information might make it unnecessary. Currently the FSM
>>pages are sorted by BlockNumber. This was particularly useful for
>>adding information about a single page, but since that interface is no
>>longer to be supported, perhaps the decision to sort by BlockNumber
>>should also be revisited.
>>
>>
>
>I was thinking about that, but we do still need to handle
>RecordAndGetFreeSpace --- in fact that should be the most common
>operation. The histogram approximation seems an okay price to pay for
>not slowing down RecordAndGetFreeSpace. If you wanted to depend on
>the ordering-by-free-space property to any large extent,
>RecordAndGetFreeSpace would actually have to move the old page down in
>the list after adjusting its free space :-(
>
>
I hadn't considered the needs of RecordAndGetFreeSpace. It is called so
much more often than MultiRecordFreeSpace that it makes much more sense to
optimize it, and hence to organize the page information by BlockNumber.
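
To make the trade-off concrete, here is a rough Python sketch (not the
actual FSM code; the function names and list layouts are invented for
illustration). It contrasts updating one page's free space when the
entries are ordered by block number with doing the same when they are
ordered by free space.

```python
import bisect

def update_sorted_by_block(blocks, free, block, new_free):
    """blocks: ascending block numbers; free: parallel list of free bytes.

    Hypothetical layout. The entry is found by binary search and
    updated in place, so nothing in the list has to move.
    """
    i = bisect.bisect_left(blocks, block)
    if i == len(blocks) or blocks[i] != block:
        raise KeyError(block)
    free[i] = new_free

def update_sorted_by_free(entries, block, new_free):
    """entries: list of (free_bytes, block) tuples in ascending order.

    Hypothetical layout. The list is not ordered by block, so the
    entry is found by a linear scan, then removed and reinserted at
    the position matching its new free space.
    """
    for i, (_, b) in enumerate(entries):
        if b == block:
            del entries[i]
            bisect.insort(entries, (new_free, block))
            return
    raise KeyError(block)
```

With the block-ordered layout the update touches one slot; with the
free-space-ordered layout the entry has to be relocated, which is the
extra cost on the hot path that Tom describes.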

I think you just sold me on the histogram idea :) but I still have
some thoughts about its behavior in the oversubscribed state.

If I understand the concept correctly, the histogram will only be
calculated when MultiRecordFreeSpace is called AND the FSM is
oversubscribed. However, when it is called, we will need to calculate a
histogram for, and potentially trim data from, all relations that have
entries in the FSM.
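
As I read the proposal, the oversubscribed case looks roughly like the
Python sketch below. This is only my interpretation: the bucket width,
the data layout, and the name trim_to_budget are assumptions of mine,
not anything from the existing code.

```python
from collections import Counter

BUCKET_SIZE = 256  # assumed histogram granularity, in bytes of free space

def trim_to_budget(fsm, budget):
    """fsm: dict of relation -> list of (block_number, free_bytes),
    each list ordered by block_number. budget: total page slots.

    Returns a dict holding at most `budget` pages in total.
    """
    total = sum(len(pages) for pages in fsm.values())
    if total <= budget:
        return fsm  # not oversubscribed: no histogram needed

    # Pass 1: histogram of free space over every relation in the FSM.
    hist = Counter()
    for pages in fsm.values():
        for _, free in pages:
            hist[free // BUCKET_SIZE] += 1

    # Walk buckets from most free space downward and stop before the
    # bucket that would push us over budget.
    kept = 0
    threshold = max(hist) + 1
    for bucket in sorted(hist, reverse=True):
        if kept + hist[bucket] > budget:
            break
        kept += hist[bucket]
        threshold = bucket

    # Pass 2: drop pages below the threshold, keeping block order.
    return {
        rel: [(b, f) for b, f in pages if f // BUCKET_SIZE >= threshold]
        for rel, pages in fsm.items()
    }
```

Because whole buckets are kept or dropped, the result can hold fewer
pages than the budget allows; that is the approximation being traded
for leaving the lists in block order.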

When vacuuming the entire database, we will end up with an N-squared
loop where we iterate over all the relations in vacuum, and iterate over
them again in each call to MultiRecordFreeSpace that occurs within each
vacuum. If each relation consistently requests the storage of the same
amount of page info during each vacuum, the extra work of this N-squared
loop will probably disappear after the system settles into an
equilibrium, but inconsistent requests could cause more oscillations in
the free space adjustment.
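
The loop structure I have in mind can be counted with a toy model like
this one (Python, purely illustrative; count_relation_visits and the
is_oversubscribed callback are made up for the example):

```python
def count_relation_visits(relations, is_oversubscribed):
    """Count how many relations the trimming scans touch during one
    database-wide vacuum.

    is_oversubscribed(rel) is a hypothetical callback saying whether
    the FSM is oversubscribed when MultiRecordFreeSpace is called for
    `rel`.
    """
    visits = 0
    for rel in relations:              # outer loop: vacuum each relation
        # MultiRecordFreeSpace would be called for `rel` here.
        if is_oversubscribed(rel):
            for _ in relations:        # inner loop: histogram and trim
                visits += 1            # one visit per relation in the FSM
    return visits
```

With N relations, the count is N * N if every call finds the FSM
oversubscribed and zero if none does, which is the difference between
the worst case and the equilibrium case.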

Do I properly understand how this will work, or did I miss something?

In any event, I don't really think this is a problem, just something to
pay attention to. It also highlights the need to make the histogram
calculation and free space adjustment as efficient as possible.

By the way, I think your other suggestions are great (e.g. changes to
the public API, maintaining more internal statistics, reporting more info
in VACUUM VERBOSE, ensuring that a minimum amount of freespace info is
retained for all relations). I think this will be a nice improvement to
how postgres reclaims disk space.
