Re: Dead Space Map for vacuum

From: Russell Smith <mr-russ(at)pws(dot)com(dot)au>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Heikki Linnakangas <heikki(at)enterprisedb(dot)com>, Gavin Sherry <swm(at)linuxworld(dot)com(dot)au>, ITAGAKI Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Dead Space Map for vacuum
Date: 2006-12-29 22:22:37
Message-ID: 4595952D.3040106@pws.com.au
Lists: pgsql-hackers

Simon Riggs wrote:
> On Fri, 2006-12-29 at 10:49 -0500, Tom Lane wrote:
>
>> "Simon Riggs" <simon(at)2ndquadrant(dot)com> writes:
>>
>>> I would suggest that we tracked whether a block has had 0, 1 or 1+
>>> updates/deletes against it. When a block has 1+ it can then be
>>> worthwhile to VACUUM it and to place it onto the FSM. Two dead tuples is
>>> really the minimum space worth reclaiming on any block.
>>>
>> How do you arrive at that conclusion?
>>
>
> FSM code ignores any block with less space than 1 average tuple, which
> is a pretty reasonable rule.
>
FSM serves a different purpose than DSM and therefore has an entirely
different set of rules governing what it should and shouldn't be doing.
This is a reasonable rule for FSM, but not for DSM.
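(For reference, the FSM rule Simon is describing amounts to: keep a running
average of requested tuple sizes and ignore any page whose free space falls
below that average. A hypothetical sketch for illustration only; the names
and layout are mine, not the actual backend code.)

```c
#include <stddef.h>

/* Hypothetical sketch of the FSM filtering rule under discussion:
 * a page is only worth remembering if its free space is at least
 * the running average of requested tuple sizes. Illustrative
 * only -- not the real PostgreSQL FSM implementation. */

typedef struct FsmStats
{
    double avg_request;     /* running average of requested sizes */
    long   nrequests;       /* number of requests seen so far */
} FsmStats;

/* Fold a new space request into the running average. */
static void
fsm_note_request(FsmStats *s, size_t request_size)
{
    s->nrequests++;
    s->avg_request += ((double) request_size - s->avg_request) / s->nrequests;
}

/* A page with less free space than one average tuple is ignored. */
static int
fsm_page_worth_storing(const FsmStats *s, size_t free_bytes)
{
    return (double) free_bytes >= s->avg_request;
}
```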
> If you only track whether a block has been updated, not whether it has
> been updated twice, then you will be VACUUMing lots of blocks that have
> only a 50% chance of being usefully stored by the FSM. As I explained,
> the extra bit per block is easily regained from storing less FSM data.
>
Well, it seems that when implementing the DSM, it'd be a great time to
move the FSM from its current location in shared memory to somewhere else,
possibly the same place as the DSM. A couple of special blocks per file
segment would be a good place. Also, I'm not sure that the point of
VACUUMing is always to be able to immediately reuse the space.
There are cases where large DELETEs are done and you just want to
decrease the index size. In Tom's counterexample of large tuples, you
certainly want to vacuum the index when only a single update/delete occurs.
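(The 0 / 1 / 1+ tracking proposed upthread could be as little as two bits
per heap block, saturating at "1+". A hypothetical sketch under that
assumption; the structure and function names are mine, not anything that
exists in the backend:)

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical sketch of per-block 0 / 1 / 1+ dead-tuple tracking:
 * two bits per heap block, packed four blocks per byte, with the
 * counter saturating at 2 ("1+"). Illustrative only. */

typedef struct DeadSpaceMap
{
    uint8_t *bits;      /* 2 bits per block, 4 blocks per byte */
    size_t   nblocks;
} DeadSpaceMap;

static DeadSpaceMap *
dsm_create(size_t nblocks)
{
    DeadSpaceMap *dsm = malloc(sizeof(DeadSpaceMap));
    dsm->nblocks = nblocks;
    dsm->bits = calloc((nblocks + 3) / 4, 1);
    return dsm;
}

static unsigned
dsm_get(const DeadSpaceMap *dsm, size_t blk)
{
    return (dsm->bits[blk / 4] >> ((blk % 4) * 2)) & 3;
}

/* Record one update/delete against a block; saturates at 2. */
static void
dsm_note_dead(DeadSpaceMap *dsm, size_t blk)
{
    unsigned v = dsm_get(dsm, blk);
    if (v < 2)
    {
        dsm->bits[blk / 4] &= ~(3u << ((blk % 4) * 2));
        dsm->bits[blk / 4] |= (v + 1) << ((blk % 4) * 2);
    }
}

/* Under the proposed rule, only blocks with 1+ (two or more)
 * dead tuples are worth vacuuming and registering in the FSM. */
static int
dsm_worth_vacuuming(const DeadSpaceMap *dsm, size_t blk)
{
    return dsm_get(dsm, blk) == 2;
}
```

Two bits per block keeps such a map small: about 32 kB for a 1 GB segment
of 8 kB blocks.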
> My understanding was that DSM was meant to increase VACUUM efficiency,
> so having a way to focus in on blocks most worth vacuuming makes sense
> using the 80/20 rule.
>
Possibly true. I don't have anything to indicate what usage patterns
produce what requirements in vacuum patterns. If there are significant
numbers of blocks with only one update, is it a loss to actually vacuum
those? I know it could be faster if we didn't, but would it still be
faster than what we do now?
>
>> Counterexample: table in which all tuples exceed half a page.
>>
>
> Current FSM code will ignore those too, if they are less than the
> average size of the tuple so far requested. That's a pretty weird
> counterexample, even if it is a case that needs handling.
>
Again, I'd be careful about assuming that FSM rules carry over to the DSM
when deciding what should be done with a particular block.

Russell Smith.
