Re: [PATCH] plpythonu datatype conversion improvements

From: Caleb Welton <cwelton(at)greenplum(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Greg Stark <gsstark(at)mit(dot)edu>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, Peter Eisentraut <peter_e(at)gmx(dot)net>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH] plpythonu datatype conversion improvements
Date: 2009-08-22 16:44:00
Message-ID: C6B56E60.2FA6%cwelton@greenplum.com
Lists: pgsql-hackers

I didn't say that it _only_ affects bytea, I said that was the _primary motivation_ for it.
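
For anyone joining the thread here, the bytea problem quoted below can be illustrated roughly as follows. This is only a sketch in plain Python, not code from the patch: through_cstring(), escape_bytes() and unescape_bytes() are hypothetical names, ctypes stands in for a NUL-terminated C string, and the escaping is a simplified form of the \000-style text representation Greg mentions.

```python
import ctypes


def through_cstring(data: bytes) -> bytes:
    """Simulate passing bytes through a NUL-terminated C string."""
    buf = ctypes.create_string_buffer(data)
    # .value stops at the first NUL byte, as strlen()-based C code would.
    return buf.value


def escape_bytes(data: bytes) -> str:
    """Simplified escape format: backslash doubled, non-printables as \\ooo."""
    out = []
    for b in data:
        if b == 0x5C:
            out.append("\\\\")
        elif 32 <= b <= 126:
            out.append(chr(b))
        else:
            out.append("\\%03o" % b)
    return "".join(out)


def unescape_bytes(text: str) -> bytes:
    """Inverse of escape_bytes()."""
    out = bytearray()
    i = 0
    while i < len(text):
        if text[i] == "\\":
            if text[i + 1] == "\\":
                out.append(0x5C)
                i += 2
            else:
                out.append(int(text[i + 1:i + 4], 8))
                i += 4
        else:
            out.append(ord(text[i]))
            i += 1
    return bytes(out)


payload = b"ab\x00cd"
truncated = through_cstring(payload)             # data after the NUL is lost
round_trip = unescape_bytes(escape_bytes(payload))  # survives, at extra cost
```

The first path silently drops everything after the embedded NUL. The second preserves it, but only by marshalling and unmarshalling the text form on every call.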

Converting from postgres=>python, this change affects boolean, float4, float8, numeric, int16, int32, int64, text, and bytea. The conversion goes through DatumGetXXX for the datatype's native C type, with two exceptions: the varlena types, which are special-cased, and numeric, which calls numeric_float8() to convert the numeric to a native C double. As mentioned in the original post, I do not think this is appropriate for numeric and would prefer a better mapping, but it is a pre-existing issue, not a change in behavior introduced by the patch. Since it is a separate issue, I opted not to change it, to keep the patch concise.
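
To make the numeric concern concrete, here is a rough sketch in plain Python, not code from the patch. numeric_via_float8() is a hypothetical stand-in for routing a numeric through a C double; it relies only on the fact that a Python float is a C double, and the sample value is made up for illustration.

```python
from decimal import Decimal


def numeric_via_float8(text_value: str) -> float:
    """Hypothetical stand-in: convert an exact decimal through a C double."""
    return float(Decimal(text_value))


exact = Decimal("12345678901234567890.123456789")
approx = numeric_via_float8("12345678901234567890.123456789")

# Decimal(float) yields the exact value stored in the double, so any
# difference from the original shows digits lost in the conversion.
precision_lost = Decimal(approx) != exact
```

Values that fit in a double, such as 0.5, come through unchanged. Values with more significant digits than a double can hold do not, which is why a better mapping for numeric would be preferable.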

Converting from python=>postgres, this change affects void, bool, bytea, and text.

The asymmetry has two causes: there is no 1:1 mapping between Postgres datatypes and Python datatypes, and I wanted to keep the patch concise.

All other datatypes (including arrays unfortunately) go through the same text input functions that they did before.

Of the above, the only type whose internal representation we have good reason to expect might change is numeric, and this patch _doesn't_ rely on its internal representation: it calls numeric_float8().

I think it would be good to have mappings for other datatypes, whether or not they depend on the internal representation, but I thought that was beyond the scope of the patch.

Regards,
Caleb

On 8/22/09 7:03 AM, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

Greg Stark <gsstark(at)mit(dot)edu> writes:
> On Sat, Aug 22, 2009 at 11:45 AM, Caleb Welton<cwelton(at)greenplum(dot)com> wrote:
>> As documented in the patch, the primary motivation was support of BYTEA
>> datatype, which when cast through cstring was truncating python strings with
>> embedded nulls,
>> performance was only a secondary consideration.

> The alternative to attaching to the internal representation would be
> to marshal and unmarshal the text representation where nuls are
> escaped as \000.

I don't actually have a problem with depending on the internal
representation of bytea. What I'm unhappy about is that (despite
Caleb's assertions that this is only about bytea) the patch proceeds
to make plpython intimate with the internal representation of a bunch
of *other* datatypes, some of which we have good reason to think may
change in future. If it were only touching bytea I would not have
complained.

regards, tom lane
