Re: Extensions User Design

From: Dimitri Fontaine <dfontaine(at)hi-media(dot)com>
To: "David E(dot) Wheeler" <david(at)kineticode(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Extensions User Design
Date: 2009-06-23 21:06:49
Message-ID: DC211259-E79B-4516-9D60-8E19383C210D@hi-media.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Le 23 juin 09 à 20:30, David E. Wheeler a écrit :

> On Jun 23, 2009, at 10:44 AM, Dimitri Fontaine wrote:
>> - support for home grown SQL/PLpgSQL only extensions in order to
>> make life
>> easier for in-house PG based development (you don't have to code
>> in C to
>> benefit from extensions)
>
> It'd be nice if it supported other core languages like PL/Perl, but
> it's okay if it doesn't on the first round (I'd likely want to use
> some CPAN modules in a PL/Perl extension, anyway).

At first sight I though you were talking about a non-issue, as I meant
that an extension should not have to be a .so (or dll) + a .sql
exposing it, but any SQL code PostgreSQL is able to understand, plperl
included.

But plpgsql and plperl are not available by default on databases, so
it makes sense to rise the question, and the right answer might be to
expose some (optional?) core components as extensions (version is PG
major version), in order for out-of-core extensions to be able to
depend on them being there.

>> - support for procedural languages (a priori easily covered within
>> basic
>> modules but I'm not sure) like plproxy, pllolcode, pllua,
>> plscheme, plsh
>> et al.
>
> Oh, here it is. So this goes with the point above, and can be
> simplified to "support all procedural languages," yes?

I've been told pljava is complex in that it requires a part of pljave
to be there in order to be installable (like pseudo DDL coded in
pljava and needed in the installation procedure). So I'd prefer not to
go this far, just in case.

>> - supporting more than one version of the same module installed in
>> the same
>> time, possibly (I suppose always but...) in different schemas
>
> Eh. This could be in 2.0 I think.

Yeah, my point exactly.

>> - custom variables?
>
> You mean GUC variables? That'd certainly be useful, but again,
> probably not necessary for 1.0.

In fact supporting custom classes GUCs seems to be part of what Tom
Dunstan did, so it should be ok to plan to have it?
http://archives.postgresql.org/message-id/ca33c0a30804061349s41b4d8fcsa9c579454b27ecd2@mail.gmail.com

> Well, PostGIS is itself an extension, no? What we need, then, is
> dependency tracking.

In fact PostGIS is a complex beast, in that it's registering typmod
like information about user columns into its own private tables (think
extension's catalog). Now that means circular dependancy of some sort
as restoring user data requires to have the PostGIS private tables
filled already, but as they refer user data (not sure if it's
targeting user tables ---DDL--- only), you have to already have
restored them.

Even if it's only targeting schema level stuff, you'd need to restore
the extension's data after the schema but before the data, but the
extension's itself (data types, indexes opclass, etc) BEFORE the data.

I'm not sure you should target to support this level of complexity (it
has to be generic) in the first incantation of it, but if some hacker
tells me it's damn easy to get right with pg_depend, why not?

>> - a core team approved list of extensions (replacing contribs,
>> maybe adding
>> to it), where approved means code has been reviewed and the only
>> reason
>> why it's not in the core itself is that core team feels that it's
>> not
>> part of a RDBMS per-se, or feel like the code should be maintained
>> and
>> released separately until it gets some more field exposure... (think
>> plproxy).
>
> I hate the idea of "approved" extensions, but would love to see a
> kind of "standard library" as a separate distribution that contains
> a bunch of stuff that's commonly used. I'd want to steer clear of
> blessing by the core team other than that, though, because then you
> start to get into politics.

Maybe it's just a (non native) misuse of vocabulary, I see contrib as
the current incarnation of the standard extension library and would
like to see it evolve into a list of reviewed and maintained
extensions, which in a later step you'll be able to remotely fetch and
install easily from source from postgresql.org services, or in binary
from your distribution package.

But I think we'll still need a contrib/ like suite that core hackers
keep an eye on and maintain in minor branches and adapt in major
releases.

Now if we ever get to a point where we can setup an http repository of
easily installable extensions that you can point a built-in core tool
to, that means there will be the standard official one and a myriad of
others (pgfoundry, and self hosting).

>> * later please
> Yes, this would be nice. Also, integrated testing as with CPAN. I
> happen to know of a really nice test framework we could use…

hehe

>> - complex support for ad-hoc bootstrap of uncommon modules such as
>> pljava
>
> Not sure what this means; can you provide more detail?

See above.

>> - dependancy graph solving and automatic installation, with depends,
>> recommends and suggest sections and with rules/setup to choose
>> what to
>> pull in by default...
>
> We'd likely have to store this information in some sort of system
> table, too, yes?

Yes, that'd be part of the extension "meta data".

>> Extensions will need metadata, and after reading several proposals,
>> what I
>> propose here is to have a first explicit step to register the
>> extension name
>> and metadata, then have "basic" tools to play with it.
>
> Register with whom? I have to say that, although there is namespace
> registration for CPAN, it's not required, and this is, in fact, a
> big part of the reason for CPAN's success. There is no approval
> process barrier to entry.

None of this, stay aboard :)
Register within the database where you'll want to install it. The
install step as shown below will then use the meta-data to do the
sanity checking (dependancies) and the installation (what script to
read?).

>> === installing and removing an extension
>>
>> begin;
>> install extension foo with search_path = foo;
>> commit;
>
> It would need something to ensure an appropriate version, too, no?

So it's:
create schema foo;
install extension foo with version = 1.2, search_path = foo;

That's fine by me, but I'm not sure whether first extension's
implementation will support installing several versions of the same
extension in parallel, so I'm unsure what we get here... one more
sanity check? I buy it.

>> Extensions authors are asked not to bother about search_path in
>> their sql
>> scripts so that it's easy for DBAs to decide where to install them.
>> The with
>> strange syntax is there to allow for the "install extension"
>> command to
>> default to, e.g., pg_extension, which won't typically be the first
>> schema in
>> the search_path.
>
> And how will functions that call other functions within an extension
> know that they're calling those functions in the appropriate schema?
> I get this all the time with pgTAP: You can install it in its own
> schema, but you have to include that schema in the search_path in
> order for it to work, as some pgTAP functions call other pgTAP
> functions with no schema-qualification.

I don't think we want to cancel user ability to choose schema where to
install, so an idea could be to ask extensions author to
systematically use pg_extension (or non-qualify), and PostgreSQL could
replace this with the INSTALL EXTENSION command schema.

>> begin;
>> drop extension foo [cascade];
>> commit;
>>
>> The "cascade" option is there to care about reverse depends.
>
> Would it fail if there were dependencies on the module in the
> database, such as functions that use its functions, or tables that
> depend on a custom data type?

Yes, when you don't use the CASCADE keyword.

>> === creating extensions (authoring)
>>
>> The 'foo' extension author is meant to provide a +foo.sql+ file
>> containing
>> this:
>>
>> create extension foo
>> with version 1.0
>> install [script] 'foo.install.sql'
>> uninstall [script] 'foo.uninstall.sql'
>> upgrade function upgrade_foo(old version, new version)
>> [ custom_variable_classes 'a,b'
>> configuration file 'foo.conf' ]
>> depends on bar version 0.3
>> and on baz version >= 1.2;
>>
>> Here we suppose we have also a new datatype "version" to host the
>> versionning information, with the associated operators. See
>> http://packages.debian.org/sid/postgresql-8.3-debversion
>
> I like this. Then the build file contains, essentially, just a SQL
> command. That will make it easy for extension authors. However, they
> might wish to include quite a lot of other metadata for the
> extension, such as URLs for VC and bug tracking.

I guess it'll get easy to add those once we agree on the way to go here.

>> Doing it this way, we skip the need to provide a way of telling "next
>> comands are meant for creating SQL objects which belongs to such
>> extension",
>> at the expense of forcing authors to manage upgrades to add objects.
>
> The install and uninstall script attributes should also allow either
> full paths or, if just a simple file name, paths to the extensions
> installation directory (currently $PGSQL/share/contrib).

Sold, with current privileges and location restrictions about file
system access from within the database... does this boils down to
$PGDATA subdirectory only?

>> The upgrade function is mandatory, and has to return the installed
>> version
>> or null, meaning "please run the install script again, that's how I
>> upgrade". The error management is to be made by means of RAISE
>> EXCEPTION.
>
> I'm not following you here. If I have a bunch of releases with a
> number of changes to them, this function could get quite complex, I
> should think. Also, in what language could it be written?

It'll get as complex as you need it to be, and it's only required that
it's a PostgreSQL function. I guess writing the plphp upgrade function
in plphp would be quite challenging, unless we're able to guarantee
that the newer extension's code won't get loaded before until the
fonction returned (and didn't RAISE EXCEPTION).

Regards,
--
dim

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message David E. Wheeler 2009-06-23 21:41:43 Re: Extensions User Design
Previous Message Devrim GÜNDÜZ 2009-06-23 21:03:07 Re: 8.4RC2 is available