Re: Review: Dumping an Extension's Script

From: Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr>
To: Ali Dar <ali(dot)munir(dot)dar(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Review: Dumping an Extension's Script
Date: 2012-12-05 15:46:04
Message-ID: m2624gpkn7.fsf@2ndQuadrant.fr
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

Thanks for your review!

Ali Dar <ali(dot)munir(dot)dar(at)gmail(dot)com> writes:
> 1) I don't know if community is really looking for this functionality, I
> followed the thread a bit and appraently there is a disagreement over it. I
> will not comments on that. Let the big guys decide if they really want this
> feature.

Let's open the extension feature to users who are not at ease with
building packages for their distribution of choice or our users who are
not granted root privileges on their database servers.

After all you don't even need to be a superuser to CREATE an EXTENSION.

> 2) I think comments are very limited, I would be really great if you can
> add thorough comments of whatever new that you have added. Couple of
> examples are given below:

I've added some comments in the attached new revision of the patch, and
fixed the errors you've been listing.

> ii) What is the need of "require" option?
> It would be great if you can go through all the comments and try to add the
> missing comments and improve the old ones.

When CREATE EXTENSION is given the extension's script in the SQL command
itself directly, the control files property are not to be found in a
separate file. Most of those properties relate to the packaging and the
reading of the file so we don't need them by construction, but it's not
the case with two of them.

The parameter "require" is used in the create extension code to be able
to set the search_path correctly before running the extension's script,
so that dependencies are found automatically.

The parameter "relocatable" ends up in the pg_extension catalogs for
being able to generate an ERROR if the user attempts to ALTER EXTENSION
… SET SCHEMA, so we need a way to have it from the SQL command now (so
that pg_dump emits a non lossy command).

Here's the comment I've been adding to gram.y to explain that. Note that
I wouldn't expect to find such a comment at this place, but didn't see
another place where to detail it…

/*
* This option is needed only when the script is given
* inline. We then don't have any control file but still
* need to set the search_path according to the dependency
* list when running the extension's script.
*/
if (strcmp($1, "requires") == 0)

> 3) There are spelling mistakes in the existing comments, for example:

Fixed, thanks.

> 4) 'if_no_exits' flag of new grammar rule is set to false, even though the
> rule states that the extension is created as 'IF NOT EXISTS'.

Oh, a copy/paste bug, the worse kind. Fixed, thanks.

> 5) Why 'relocatable' option is added to the new grammar rule of create
> extension? In a regular 'create extension' statement, when 'schema' option
> is specified, it means that the extension is relocatable. So why is option
> needed? If there is a specific reason why 'scheme' option wasn't used, then
> please explain.

When schema option is specificied, it does NOT mean that the extension
is relocatable. The relocatable option is choosen by the extension's
author to cope with some dependencies and other problems, and is
important for reading the script (@extschema@ substitutions). Also it's
then essential to disallow ALTER EXTENSION … SET SCHEMA … even if that
very command could work, so we keep that parameter in the catalogs.

It's one of the only two control parameters that still needs to be
passed on the SQL command now.

> 6) I am unable to figure out why new 'require' option is needed(comments
> would have helped)? Is it because when we will dump the extension with -X
> flag, it will first dump that referenced extension?

That's not how it works, no. We can of course review that, but I don't
currently see any reason to force dumping required extensions scripts.

The 'require' processing didn't change at all in this patch. You are
just allowed to give the list in the SQL command rather than the control
file, that's all.

> I created the following extension(hstore was already created):
> create extension testex with version '1' requires 'hstore' as
> $testex$ create type testtype as(a char) $testex$;
>
> I then dumped the extension using -X option, only 'testex' extension was
> dumped and not 'hstore'.

I suppose that at restore time you will have to have "hstore" on disk
with its .so file, but maybe won't have "testex" readily deployed on the
new server. So my understanding is that what's happening is what needs
to happen. What do you think?

> 7) You have removed an old PG comments in pg_dump.c line:7526 above the
> following check:

Oops. Added back. Thanks.

> 8) In extension.c the following check is added:
> /*
> * If the extension script is not in the command, then the user is not
> * the extension packager and we want to read about relocatable and
> * requires in the control file, not in the SQL command.
> */
> if (d_relocatable || d_requires)
> ereport(ERROR,
> (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
> errmsg("parameter \"%s\" is only expected when using CREATE EXTENSION AS",
> d_relocatable ? "relocatable" : "requires")));
>
> Is the above code really reachable? If the user has not specified a
> script(old create extension syntax is used), then 'd_relocatable' and
> 'd_requires' will never be true, because the old 'create extension' syntax
> doesn't allow it.

The grammar itself is not protecting against sloppy mixing of options,
and I don't think it should in our case. When shipping an extension as
an OS package with a control file, I suppose that the "packager options"
are not meant to be easily overriden by the DBA.

We could open that defensive coding, though, if the need is justified.

> 9) For every extension being dumped, and then for every object inside that
> extension. The code traverses all the fetched items of the database to
> compare if that item belong to that specific extension. Because of that
> 'DumpableObject **dobjs' is traversed fully, every time a single extension
> is dumped. This seems to be big big performance hit for a huge database.
> And considering if many many extensions are dumped in a huge database, dump
> might become very slow. Can you think of any other scheme? Maybe you should
> follow PG's design, and fetch the extension objects in dumpExtension() and
> dump them.

It seems to me that my implementation in the patch follows PG design
here. Please consider that the new traversal only happens once per
extension listed in the --extension-script accumulative option.

In a previous (local) implementation of this patch I kept a list of
objects per extension that was maintained in getExtensionMembership(),
but it stroke me as useless and I finally prefered tracking the
extension's OID in the DumpableObject structure directly. That makes for
much simpler code.

I could rewrite the tracking the other way round if that's deemed
necessary, it's not done in the attached, though.

> 10) pg_dumpall is not handled at all. User can't use -X switch if he wants
> to dump all the databases with extensions.

True, I didn't have a look at pg_dumpall. Extensions are per-database in
our catalogs… do we want to have the --extension-script option available
to pg_dumpall?!

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

Attachment Content-Type Size
dump_extension_script.v1.patch text/x-patch 24.6 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2012-12-05 16:11:23 Re: PITR potentially broken in 9.2
Previous Message Phil Sorber 2012-12-05 15:29:38 Re: [WIP] pg_ping utility