Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com> writes:
> Tom Lane wrote:
>> In fsm_rebuild_page, surely we needn't check "if (lchild < NodesPerPage)".
> Yes, we do.
But the loop starting point is such that you must be visiting a parent
with at least one child, no?
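
For illustration only, here is a minimal sketch of the point about the loop start. It is not the fsm_rebuild_page code from the patch: the function name, the array layout (children of node i at 2i+1 and 2i+2), and the choice of starting index are all assumptions. If the loop begins at the last node that has a child, the left-child bounds check can never fail, while the right-child check is still needed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: rebuild the upper nodes of an array-encoded binary
 * max-tree from its leaves.  Returns true if any node value was changed.
 */
static bool
rebuild_max_tree(uint8_t *nodes, size_t n)
{
    bool changed = false;

    if (n < 2)
        return false;           /* a single node has no children */

    /* Start at the last parent, (n - 2) / 2, and walk down to the root. */
    for (size_t i = (n - 2) / 2 + 1; i-- > 0;)
    {
        size_t  lchild = 2 * i + 1;
        size_t  rchild = lchild + 1;
        uint8_t newvalue;

        /* Guaranteed by the starting index, so no runtime test is needed. */
        assert(lchild < n);
        newvalue = nodes[lchild];

        /* The last parent may have only a left child. */
        if (rchild < n && nodes[rchild] > newvalue)
            newvalue = nodes[rchild];

        if (nodes[i] != newvalue)
        {
            nodes[i] = newvalue;
            changed = true;
        }
    }
    return changed;
}
```

If the loop instead started at an index chosen independently of the node count, some of the first nodes visited could have no children at all, and the left-child check would be required.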
>> reveals a rather fundamental problem: it is clearly possible
>> for this test to fail on valid request sizes, because the page
>> header overhead is less than FSM_CAT_STEP (especially if BLCKSZ
>> is more than 8K). I'm not sure about a really clean solution
> Hmph. The other alternative is to use 2 bytes instead of one per page,
> and track the free space exactly. But I'd rather not do that just to
> deal with the very special case of huge requests.
Yeah, I thought about that too. It's got another problem besides the
sheer space cost: it would result in a whole lot more update traffic for
upper levels of the tree. The quantization of possible values in the
current design is good because it avoids updates of parents for
relatively small deltas of free space.
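
A rough sketch of the quantization argument, assuming an 8K block, 256 categories, and plain round-down division; the names and the exact rounding rule are assumptions, not the patch's code. A leaf value changes, and so can force a parent update, only when the free space crosses a category boundary.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed values for the sketch only. */
#define SKETCH_BLCKSZ     8192u
#define SKETCH_CATEGORIES 256u
#define SKETCH_CAT_STEP   (SKETCH_BLCKSZ / SKETCH_CATEGORIES)  /* 32 bytes */

/* Map free bytes on a page to a one-byte category, rounding down. */
static uint8_t
space_to_cat(size_t avail)
{
    size_t cat;

    assert(avail <= SKETCH_BLCKSZ);
    cat = avail / SKETCH_CAT_STEP;
    if (cat > SKETCH_CATEGORIES - 1)
        cat = SKETCH_CATEGORIES - 1;
    return (uint8_t) cat;
}

/* True if the stored leaf value changes, which may propagate upward. */
static bool
leaf_value_changes(size_t old_avail, size_t new_avail)
{
    return space_to_cat(old_avail) != space_to_cat(new_avail);
}
```

Under these assumptions, going from 100 to 120 free bytes stays in category 3 and touches nothing above the leaf, whereas exact two-byte tracking would see a new value on every such change.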
> Or we could just return -1 instead of throwing an error. Requests higher
> than the limit would then always have to extend the heap. That's not
> good, but I think we already have that problem for tuples of exactly
> MaxHeapTupleSize bytes. Since PageGetFreeSpace subtracts the size of a
> new line pointer, only newly extended pages that have never had any
> tuples on them have enough space, as determined by PageGetFreeSpace, to
> fit a tuple of MaxHeapTupleSize bytes.
That seems like something we'll want to fix sometime, rather than
hardwiring into the FSM design.
I suppose an alternative possibility is to set MaxHeapTupleSize at
255/256ths of a block by definition, so that no request will ever exceed
what the FSM stuff can handle. But I'm sure that'd make somebody
unhappy --- somewhere out there is a table with tuples wider than that.
Probably the least bad alternative here is to allow FSM's category
scaling to depend on MaxHeapTupleSize.
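
To make the last alternative concrete, here is a hedged sketch of how the category step could be derived from the largest legal request instead of from the block size. The function names, the round-up rule for requests, and the 32-byte page overhead used in the example are assumptions for illustration, not what the patch does.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_CAT 255u     /* largest value a one-byte category holds */

/* Category needed to satisfy a request: round up so the page really fits it. */
static size_t
request_to_cat(size_t needed, size_t step)
{
    return (needed + step - 1) / step;
}

/* Step tied to the block size (block size / 256 categories). */
static size_t
fixed_step(size_t blcksz)
{
    return blcksz / 256;
}

/* Step scaled so that the largest request maps to at most SKETCH_MAX_CAT. */
static size_t
scaled_step(size_t max_request)
{
    assert(max_request > 0);
    return (max_request + SKETCH_MAX_CAT - 1) / SKETCH_MAX_CAT;
}
```

With a 32K block and an assumed maximum request of 32768 - 32 = 32736 bytes, the fixed step of 128 gives category 256, which does not fit in one byte. The scaled step is 129 and gives category 254. For an 8K block with the same assumed overhead, the scaled step works out to 32 and the maximum request lands exactly on category 255.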
regards, tom lane