Re: plperl vs. bytea

From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Tino Wildenhain <tino(at)wildenhain(dot)de>, Martijn van Oosterhout <kleptog(at)svana(dot)org>, Peter Eisentraut <peter_e(at)gmx(dot)net>, pgsql-hackers(at)postgresql(dot)org, Theo Schlossnagle <jesus(at)omniti(dot)com>
Subject: Re: plperl vs. bytea
Date: 2007-05-07 17:57:25
Message-ID: 463F6885.8000809@dunslane.net
Lists: pgsql-hackers

Tom Lane wrote:
> Andrew Dunstan <andrew(at)dunslane(dot)net> writes:
>
>> Tino Wildenhain wrote:
>>
>>> Andrew Dunstan schrieb:
>>>
>>>> This does not need to be over-engineered, IMNSHO.
>>>>
>>> Well could you explain where it would appear over-engineered?
>>>
>
>
>> Anything that imposes extra requirements on type creators seems undesirable.
>>
>
>
>> I'm not sure either that the UUID example is a very good one. This whole
>> problem arose because of performance problems handling large gobs of
>> data, not just anything that happens to be binary.
>>
>
> Well, we realize that bytea has got a performance problem, but are we so
> sure that nothing else does? I don't want to stick in a one-purpose
> wart only to find later that we need a few more warts of the same kind.
>
> An example of something else we ought to be considering is binary
> transmission of float values. The argument in favor of that is not
> so much performance (although text-and-back conversion is hardly cheap)
> as it is that the conversion is potentially lossy, since float8out
> doesn't by default generate enough digits to ensure a unique
> back-conversion.
>
> ISTM there are three reasons for considering non-text-based
> transmission:
>
> 1. Performance, as in the bytea case
> 2. Avoidance of information loss, as for float
> 3. Providing a natural/convenient mapping to the PL's internal data types,
> as we already do --- but incompletely --- for arrays and records
>
> It's clear that the details of #3 have to vary across PLs, but I'd
> like it not to vary capriciously. For instance plperl currently has
> special treatment for returning perl arrays as SQL arrays, but AFAICS
> from the manual not for going in the other direction; plpython and
> pltcl overlook arrays entirely, even though there are natural mappings
> they could and should be using.
>
> I don't know to what extent we should apply point #3 to situations other
> than arrays and records, but now is the time to think about it. An
> example: working with the geometric types in a PL function is probably
> going to be pretty painful for lack of simple access to the constituent
> float values (not to mention the lossiness problem).
>
> We should also be considering some non-core PLs such as PL/Ruby and
> PL/R; they might provide additional examples to influence our thinking.
>

OK, we have a lot of work to do here, then.

I can really only speak with any significant knowledge on the Perl
front. Fundamentally, it has three types of scalars: IV, NV and PV (integer,
float, string). IV can accommodate at least the largest integer or
pointer type on the platform, NV a double, and PV an arbitrary string of
bytes.
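
To make the lossiness point concrete for the NV (double) case, here is a rough sketch of a text round-trip. It is written in Python only for brevity. The choice of 15 significant digits as the "default" text output is an assumption used for illustration, and the helper names to_text and round_trips are hypothetical, not anything in PostgreSQL or plperl.

```python
def to_text(value: float, digits: int) -> str:
    """Format a double as text with a fixed number of significant digits."""
    return f"{value:.{digits}g}"


def round_trips(value: float, digits: int) -> bool:
    """True if parsing the text form gives back exactly the same double."""
    return float(to_text(value, digits)) == value


# A double whose nearest 15-digit decimal is a different double.
x = 0.1 + 0.2

# Assumed default precision (15 digits): the text form collapses to "0.3",
# which parses to a different double than x.
print(to_text(x, 15), round_trips(x, 15))

# 17 significant digits are enough to identify any IEEE 754 double uniquely.
print(to_text(x, 17), round_trips(x, 17))
```

Handing the value over as a native double rather than as text would sidestep this problem entirely, which is the case for a binary path into NV.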

As for structured types, as I noted elsewhere we have some of the work
done for plperl. My suggestion would be to complete it for plperl and
get it fully orthogonal and then retrofit that to plpython/pltcl.

I've actually been worried for some time that the conversion glue was
probably imposing significant penalties on the non-native PLs, so I'm
glad to see this getting some attention.

cheers

andrew
