a_ogawa <a_ogawa(at)hi-ho(dot)ne(dot)jp> writes:
> I made a patch for "Cache last known per-tuple offsets to speed
> long tuple access" that is in TODO list.
> I referred URL below for implementation.
This wasn't quite what I had in mind. In particular I think it's a
bad idea to put a significant part of the implementation directly into
ExecEvalVar, because we'll have to duplicate that code anywhere else
that proves to be a hot-spot. I think the right way is to make a new
function that is just like heap_getattr except it is given a
TupleTableSlot instead of separate tuple and tupdesc arguments, and
encapsulate the knowledge about optimizing access using previously
determined offsets inside that.
Also, heap_deformtuple is a hot spot in its own right now, so slowing
it down so that it can share code with the TupleTableSlot optimization
doesn't seem to me like a win. There's not that much code in it, and
the data structures involved are very unlikely to change much, so I think
it'd be better to just duplicate the code and then optimize separately.
This would also avoid the confusion you've introduced into
heap_deformtuple about the roles of the various variables (the first
comment is effectively backwards now...)
We did not have heap_deformtuple in its current incarnation at the time
of the discussion you reference, so there may be some additional thought
needed. It'd be nice if heap_deformtuple could make use of
already-computed info, for instance. I'm not sure if that can be done
without fairly wide-ranging API changes (most of its callers don't have
Slots available), so it may be something to leave for a future patch.
Lastly, I don't see any point in introducing DeformTupleState and
deformtuple_cache as abstractions in their own right. I think it'd
be simpler and more readable just to regard all this as fields of a
TupleTableSlot. If there were any other contexts that we might want
to cache partial tuple disassembly info in, then it'd be worth making
a separation, but I don't foresee any such application.
BTW, why is it that your profile shows *more* calls to
heap_deformtuple_incr after the patch than there were nocachegetattr
calls before? Seems like there should be fewer, or at least the same
number. I'm now wondering whether there is any point to the "incremental"
aspect at all, as opposed to just doing a heap_deformtuple call the
first time a column value is demanded and always extracting all the
fields at that time. (Of course, this will always look like a win in
the context of test cases that access most or all of the fields. What
we need to worry about is the penalty incurred in cases that don't.)
regards, tom lane
In response to
pgsql-patches by date
|Next:||From: Tom Lane||Date: 2004-10-30 18:26:19|
|Subject: Re: Win32 open items |
|Previous:||From: Tom Lane||Date: 2004-10-30 16:53:36|
|Subject: Re: [PATCHES] ARC Memory Usage analysis |