Re: Getting the length of varlength data using PG_DETOAST_DATUM_SLICE

From: Mark Dilger <pgsql(at)markdilger(dot)com>
To: Jeremy Drake <pgsql(at)jdrake(dot)com>
Cc: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Getting the length of varlength data using PG_DETOAST_DATUM_SLICE
Date: 2006-02-12 00:13:44
Message-ID: 43EE7DB8.2000304@markdilger.com
Lists: pgsql-hackers

Jeremy Drake wrote:
> It looks like pg_column_size gives you the actual size on disk, i.e. after
> compression.
>
> What looks interesting for you would be byteaoctetlen or the function it
> wraps, toast_raw_datum_size. See src/backend/access/heap/tuptoaster.c.
> pg_column_size calls toast_datum_size, while byteaoctetlen/textoctetlen
> call toast_raw_datum_size.
>
>
>
> On Sat, 11 Feb 2006, Bruce Momjian wrote:
>
>
>>Have you looked at the 8.1.X built-in function pg_column_size()?
>>
>>---------------------------------------------------------------------------
>>
>>Mark Dilger wrote:
>>
>>>Hello, could anyone tell me, for a user contributed variable length data type,
>>>how can you access the length of the data without pulling the entire thing from
>>>disk? Is there a function or macro for this?
>>>
>>>As a first cut, I tried using the PG_DETOAST_DATUM_SLICE macro, but to no avail.
>>>Grepping through the release source for version 8.1.2, I find very little
>>>usage of the PG_GETARG_*_SLICE and PG_DETOAST_DATUM_SLICE macros (and hence
>>>little clue as to how they are intended to be used). The only files where I
>>>find them referenced are:
>>>
>>> doc/src/sgml/xfunc.sgml
>>> src/backend/utils/adt/varlena.c
>>> src/include/fmgr.h
>>>
>>>
>>>I am writing a variable length data type and trying to optimize the disk usage
>>>in certain functions. There are cases where the return value of the function
>>>can be determined from the length of the data and a prefix of the data without
>>>fetching the whole data from disk. (The prefix alone is insufficient -- I need
>>>to also know the length for the optimization to work.)
>>>
>>>The first field of the data type is the length, as follows:
>>>
>>>    typedef struct datatype_foo {
>>>        int32 length;
>>>        char data[];
>>>    } datatype_foo;
>>>
>>>But when I fetch the function arguments using
>>>
>>>    datatype_foo *a = (datatype_foo *)
>>>        PG_DETOAST_DATUM_SLICE(PG_GETARG_DATUM(0), 0, BLCKSZ);
>>>
>>>the length field is set to the length of the fetched slice, not the length of
>>>the data as it exists on disk. Is there some other function that gets the length
>>>without pulling more than the first block?
>>>
>>>Thanks for any insight,
>>>
>>>--Mark
>>>

Ok, for anyone following the thread, this code works for me:

/* Raw (uncompressed) size in bytes; note it includes the varlena header */
Size true_size_arg_zero = toast_raw_datum_size(PG_GETARG_DATUM(0));
Size true_size_arg_one = toast_raw_datum_size(PG_GETARG_DATUM(1));

Be sure to #include "access/tuptoaster.h"
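
For anyone who wants to see why the slice reports its own length rather than the on-disk length, here is a small standalone toy model. It is only a sketch and does not use the PostgreSQL headers: toy_toast_pointer, toy_raw_size and toy_detoast_slice are hypothetical names made up for illustration, standing in loosely for the toast pointer, toast_raw_datum_size and PG_DETOAST_DATUM_SLICE. The real on-disk layout is different.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Same shape as the struct from the original question. */
typedef struct datatype_foo {
    int32_t length;          /* size of THIS in-memory copy, header included */
    char    data[];
} datatype_foo;

#define TOY_HDRSZ ((int32_t) offsetof(datatype_foo, data))

/* Hypothetical stand-in for an out-of-line value (not PostgreSQL's layout). */
typedef struct toy_toast_pointer {
    int32_t     raw_size;    /* full size of the value, header included */
    const char *stored;      /* payload bytes "on disk", header excluded */
} toy_toast_pointer;

/* Analogue of toast_raw_datum_size: reads metadata only, no payload. */
static int32_t
toy_raw_size(const toy_toast_pointer *p)
{
    return p->raw_size;
}

/*
 * Analogue of a slice fetch: copies at most `count` payload bytes starting
 * at `offset`, and stamps the result with the size of the slice itself.
 */
static datatype_foo *
toy_detoast_slice(const toy_toast_pointer *p, int32_t offset, int32_t count)
{
    int32_t       avail = p->raw_size - TOY_HDRSZ;
    datatype_foo *result;

    assert(offset >= 0 && count >= 0);
    if (offset > avail)
        offset = avail;
    if (count > avail - offset)
        count = avail - offset;          /* clamp to what actually exists */

    result = malloc((size_t) TOY_HDRSZ + (size_t) count);
    if (result == NULL)
        return NULL;

    result->length = TOY_HDRSZ + count;  /* slice length, not full length */
    memcpy(result->data, p->stored + offset, (size_t) count);
    return result;
}
```

In this model, a 16-byte slice of a value with a 1000-byte payload comes back with length 20, while toy_raw_size still reports 1004. So the length has to come from the pointer metadata and the prefix from the slice, which is the same split as calling toast_raw_datum_size and PG_DETOAST_DATUM_SLICE separately.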

Thanks Jeremy!
