Re: [PATCH] Add transforms feature

From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH] Add transforms feature
Date: 2013-11-20 16:51:44
Message-ID: 528CE8A0.3060207@gmx.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 11/15/13, 11:04 AM, Dimitri Fontaine wrote:
> - Documentation style seems to be to be different from the "man page"
> or "reference docs" style that we use elsewhere, and is instead
> deriving the general case from examples. Reads strange.

Which specific section do you have in mind? It's hard to explain this
feature in abstract terms, I think.

> - The internal datatype argument and return type discussion for
> function argument looks misplaced, but I don't have a better
> proposition for that.

OK, maybe I'll put that in parentheses or a separate paragraph.

> - Do we need an ALTER TRANSFORM command?
>
> Usually we have at least an Owner for the new objects and a command
> to change the owner. Then should we be able to change the
> function(s) used in a transform?

We don't have ALTER CAST either, and no one's been too bothered about
that. It's possible, of course.

> - Should transform live in a schema?
>
> At first sight, no reason why, but see next point about a use case
> that we might be able to solve doing that.

Transforms don't have a name, so I don't quite see what you mean here.

> - SQL Standard has something different named the same thing,
> targetting client side types apparently. Is there any reason why we
> would want to stay away from using the same name for something
> really different in PostgreSQL?

Let's review that, as there as been some confusion about that. The SQL
standard syntax is

CREATE TRANSFORM FOR <type> <groupname> (...details...);

and then there is

SET DEFAULT TRANSFORM GROUP <groupname>
SET TRANSFORM GROUP FOR TYPE <type> <groupname>

This is essentially an elaborate way to have custom input/output
formats, like DateStyle or bytea_output.

(You can find examples of this in the IBM DB2 documentation. Some of
their clients apparently set a certain transform group automatically,
allowing you to set per-interface output formats.)

The proposed syntax in the other hand is

CREATE TRANSFORM FOR <type> LANGUAGE <lang> (...details...);

So you could consider LANGUAGE <lang> to be the implicit transform group
of language <lang>, if you like.

Or you could consider that this is a situation like VIEW vs.
MATERERIALIZED VIEW: they sound the same, they are a bit alike, but the
implementation details are different.

All obvious synonyms of "transform" (conversion, translation, etc.) are
already in use.

> On the higher level design, the big question here is about selective
> behavior. As soon as you CREATE TRANSFORM FOR hstore LANGUAGE plperl
> then any plperl function will now receive its hstore arguments as a
> proper perl hash rather than a string.
>
> Any pre-existing plperl function with hstore arguments or return type
> then needs to be upgraded to handle the new types nicely, and some of
> those might not be under the direct control of the DBA running the
> CREATE TRANSFORM command, when using some plperl extensions for example.

I had proposed disallowing installing a transform that would affect
existing functions. That was rejected or deemed unnecessary. You can't
have it both ways. ;-)

> A mechanism allowing for the transform to only be used in some functions
> but not others might be useful. The simplest such mechanism I can think
> of is modeled against the PL/Java classpath facility as specified in the
> SQL standard: you attach a classpath per schema.

Anything that's a problem per-database would also be a problem per schema.

> Should using the schema to that ends be frowned upon, then we need a way
> to register each plperl function against using or not using the
> transform facility, defaulting to not using anything. Maybe something
> like the following:
>
> CREATE FUNCTION foo(hash hstore, x ltree)
> RETURNS hstore
> LANGUAGE plperl
> USING TRANSFORM FOR hstore, ltree
> AS $$ … $$;

This is a transition problem. Nobody is required to install the
transforms into their existing databases. They probably shouldn't.

How many people actually use hstore with PL/Perl or PL/Python now?
Probably not many, because it's weird.

I like to think about how this works for new development: Here is my
extension type, here is how it interfaces with languages. Once you have
established that, you don't want to have to repeat that every time you
write a function. That's error prone and cumbersome. And anything
that's set per schema or higher is a dependency tracking and caching mess.

Also, extension types should work the same as built-in types.
Eventually, I'd like to rip out the hard-coded data type support in
PL/Python and replace it with built-in transforms. Even if we don't
actually do it, conceptually it should be possible. Now if we require
"USING TRANSFORM FOR int, bytea" every time, we'd have taken a big step
back. Effectively, we already have built-in transforms in PL/Python.
We have added a few more over the years. It's been a bit of a pain from
time to time. At least, with this feature we'd be moving this decision
into user space and give people a way to fix things. (Incidentally, if
you add a lot of transforms, you are probably dealing with a strongly
typed language. And a strongly typed language is more likely to cleanly
catch type errors resulting from changes in the transforms.)

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Heikki Linnakangas 2013-11-20 17:02:44 Re: GIN improvements part 1: additional information
Previous Message Andres Freund 2013-11-20 16:46:22 Re: Data corruption issues using streaming replication on 9.0.14/9.2.5/9.3.1