extensions are hitting the ceiling

From: Eric Hanson <eric(at)aquameta(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: extensions are hitting the ceiling
Date: 2019-03-19 02:38:19
Message-ID: CACA6kxjhMM0-Vcb+foF4+n+kHW4kCSzOuK5hSuKqoUzqdfZ9qw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi folks,

After months and years of really trying to make EXTENSIONs meet the
requirements of my machinations, I have come to the conclusion that either
a) I am missing something or b) they are architecturally flawed. Or
possibly both.

Admittedly, I might be trying to push extensions beyond what the great
elephant in the sky ever intended. The general bent here is to try to
achieve a level of modular reusable components similar to those in
"traditional" programming environments like pip, gem, npm, cpan, etc.
Personally, I am trying to migrate as much of my dev stack as possible away
from the filesystem and into the database. Files, especially code,
configuration, templates, permissions, manifests and other development
files, would be much happier in a database where they have constraints and
an information model and can be queried!

Regardless, it would be really great to be able to install an extension,
and have it cascade down to multiple other extensions, which in turn
cascade down to more, and have everything just work. Clearly, this was
considered in the extension architecture, but I'm running into some
problems making it a reality. So here they are.

#1: Dependencies

Let's say we have two extensions, A and B, both of which depend on a third
extension C, let's just say C is hstore. A and B are written by different
developers, and both contain in their .control file the line

requires = 'hstore'

When A is installed, if A creates a schema, it puts hstore in that schema.
If not, hstore is already installed, it uses it in that location. How does
the extension know where to reference hstore?

Then, when B is installed, it checks to see if extension hstore is
installed, sees that it is, and moves on. What if it expects it in a
different place than A does? The hstore extension can only be installed
once, in a single schema, but if multiple extensions depend on it and look
for it in different places, they are incompatible.

I have heard talk of a way to write extensions so that they dynamically
reference the schema of their dependencies, but sure don't know how that
would work if it's possible. The @extschema@ variable references the
*current* extension's schema, but not there is no dynamic variable to
reference the schema of a dependency.

Also it is possible in theory to dynamically set search_path to contain
every schema of every dependency in play and then just not specify a schema
when you use something in a dependency. But this ANDs together all the
scopes of all the dependencies of an extension, introducing potential for
collisions, and is generally kind of clunky.

#2: Data in Extensions

Extensions that are just a collection of functions and types seem to be the
norm. Extensions can contain what the docs call "configuration" data, but
rows are really second class citizens: They aren't tracked with
pg_catalog.pg_depend, they aren't deleted when the extension is dropped,
etc.

Sometimes it would make sense for an extension to contain *only* data, or
insert some rows in a table that the extension doesn't "own", but has as a
dependency. For example, a "webserver" extension might contain a
"resource" table that serves up the content of resources in the table at a
specified path. But then, another extension, say an app, might want to just
list the webserver extension as a dependency, and insert a few resource
rows into it. This is really from what I can tell beyond the scope of what
extensions are capable of.

#3 pg_dump and Extensions

Tables created by extensions are skipped by pg_dump unless they are flagged
at create time with:

pg_catalog.pg_extension_config_dump('my_table', 'where id < 20')

However, there's no way that I can tell to mix and match rows and tables
across multiple extensions, so pg_dump can't keep track of multiple
extensions that contain rows in the same table.

I'd like an extension framework that can contain data as first class
citizens, and can gracefully handle a dependency chain and share
dependencies. I have some ideas for a better approach, but they are pretty
radical. I thought I would send this out and see what folks think.

Thanks,
Eric
--
http://aquameta.org/

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Kyotaro HORIGUCHI 2019-03-19 02:40:41 Re: Problem with default partition pruning
Previous Message Tatsuro Yamada 2019-03-19 02:02:57 Re: [HACKERS] CLUSTER command progress monitor