Re: Deparsing DDL command strings

From: Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Deparsing DDL command strings
Date: 2012-10-05 15:54:46
Message-ID: m2k3v4lxrt.fsf@2ndQuadrant.fr
Lists: pgsql-hackers

Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:
> Why don't you just pass the original query string, instead of writing
> a mass of maintenance-requiring new code to reproduce it?

Do we have that original query string in all cases, including
EXECUTE-like SPI calls from any PL? What about commands that internally
build a parsetree to feed ProcessUtility() directly? Do we want to
refactor them all right now as a prerequisite?

Also, we need to normalize that command string. Tools needing to look at
it won't want to depend on arbitrary whitespace and other oddities.
Those tools could also use the Node *parsetree and be written only in C,
but then what about giving them a head start by having a parsetree
walker in our code base?
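
To make the whitespace point concrete, here is a minimal sketch of the purely textual part of the problem. It is not PostgreSQL code: normalize_ws() is a hypothetical helper that collapses whitespace runs outside quoted literals and identifiers. It does not handle comments, dollar quoting or keyword case, which is part of why working from the parsetree is attractive.

```c
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical helper, not part of PostgreSQL: copy src into dst,
 * collapsing each run of whitespace outside quotes into one space and
 * trimming both ends.  dst must have room for strlen(src) + 1 bytes.
 * Returns the number of bytes written, not counting the terminator.
 */
size_t normalize_ws(const char *src, char *dst)
{
    size_t n = 0;
    char   quote = 0;          /* active quote character, 0 if none */
    int    pending_space = 0;  /* whitespace seen since last token? */

    for (; *src; src++) {
        unsigned char c = (unsigned char) *src;

        if (quote) {
            /* Inside '...' or "...": copy verbatim. */
            dst[n++] = (char) c;
            if (c == (unsigned char) quote)
                quote = 0;
            continue;
        }
        if (isspace(c)) {
            /* Remember the gap, but never emit a leading space. */
            pending_space = (n > 0);
            continue;
        }
        if (pending_space) {
            dst[n++] = ' ';
            pending_space = 0;
        }
        if (c == '\'' || c == '"')
            quote = (char) c;
        dst[n++] = (char) c;
    }
    dst[n] = '\0';
    return n;
}
```

A doubled quote such as 'it''s' is handled by accident rather than by design: the literal closes and immediately reopens with no whitespace in between.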

Then we want to qualify object names. The parser apparently already
takes care of some type names here, but not yet of relation names, and
we need to cope with relation names that do not exist.

My freshly grown limited understanding is that we currently only know
how to produce a "cooked" parse tree from the raw one if all referenced
objects do exist in the catalogs, so that we will postpone some
"cooking" (transform*) until the main object in a CREATE command is
defined, right?

Is that something we want to revisit?

Another option would be to capture search_path and the other GUCs that
affect parsing, call that the query environment, and have a way to
serialize it, pass it along, and restore it either on the same host or
on another one (replication is an important use case here).
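
As an illustration of what serializing such an environment could look like, here is a standalone sketch. EnvSetting and serialize_env() are hypothetical names, not existing backend APIs. The only real PostgreSQL piece is the set_config(setting_name, new_value, is_local) function that the output calls. The quoting assumes standard_conforming_strings is on at restore time, and which GUCs belong in the environment is left open.

```c
#include <stddef.h>

/* Hypothetical: one captured setting of the "query environment". */
typedef struct EnvSetting {
    const char *name;
    const char *value;
} EnvSetting;

/*
 * Append s to out at *pos, doubling single quotes when escape is set.
 * Keeps out NUL-terminated.  Returns 0 if the buffer is too small.
 */
static int append(char *out, size_t cap, size_t *pos,
                  const char *s, int escape)
{
    for (; *s; s++) {
        size_t need = (escape && *s == '\'') ? 2 : 1;

        if (*pos + need >= cap)    /* leave one byte for the NUL */
            return 0;
        if (need == 2)
            out[(*pos)++] = '\'';
        out[(*pos)++] = *s;
    }
    out[*pos] = '\0';
    return 1;
}

/*
 * Render the environment as one set_config() call per setting, so it
 * can be replayed before the command itself.  Returns 1 on success,
 * 0 if out is too small.
 */
int serialize_env(const EnvSetting *env, size_t count,
                  char *out, size_t cap)
{
    size_t pos = 0;

    if (cap == 0)
        return 0;
    out[0] = '\0';
    for (size_t i = 0; i < count; i++) {
        if (!append(out, cap, &pos, "SELECT pg_catalog.set_config('", 0) ||
            !append(out, cap, &pos, env[i].name, 1) ||
            !append(out, cap, &pos, "', '", 0) ||
            !append(out, cap, &pos, env[i].value, 1) ||
            !append(out, cap, &pos, "', false);\n", 0))
            return 0;
    }
    return 1;
}
```

The sketch emits set_config() rather than SET because set_config() takes the whole value as one text argument, so a comma-separated search_path needs no further parsing on the sending side.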

Yet another option would be to output both the original query string and
something that's meant for easy machine parsing yet is not the internal
representation of the query, so that we're free to hack the parser at
will in between releases, even minor ones. Building that new
code-friendly document will require about the same amount of code as
spitting out normalized SQL, I believe.

Yet another option would be to go the "sax" way rather than the "dom"
one: instead of spitting out a new command string have the user register
callbacks and only implement walking down the parsetree and calling
those. I'm not sure how much maintenance work we would save here, and
I don't see any other reason to go that way.
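
For readers unfamiliar with the distinction, the callback style could be sketched as follows. This uses a toy tree, not PostgreSQL's Node structures; ToyNode, WalkerCallbacks, walk_tree() and count_columns() are all made up for the illustration. The core side only implements the traversal, and the user supplies the callbacks.

```c
#include <stddef.h>

/* Toy stand-ins for parse nodes; purely illustrative. */
typedef enum ToyTag { TOY_CREATE_TABLE, TOY_COLUMN_DEF } ToyTag;

typedef struct ToyNode {
    ToyTag          tag;
    const char     *name;
    struct ToyNode *first_child;
    struct ToyNode *next_sibling;
} ToyNode;

/* Callbacks registered by the user; either one may be NULL. */
typedef struct WalkerCallbacks {
    void (*enter)(const ToyNode *node, void *ctx);
    void (*leave)(const ToyNode *node, void *ctx);
} WalkerCallbacks;

/*
 * Depth-first walk over node and its following siblings: call enter
 * before a node's children and leave after them.  No output string is
 * built here; what to do with each node is up to the callbacks.
 */
void walk_tree(const ToyNode *node, const WalkerCallbacks *cb, void *ctx)
{
    for (; node != NULL; node = node->next_sibling) {
        if (cb->enter)
            cb->enter(node, ctx);
        walk_tree(node->first_child, cb, ctx);
        if (cb->leave)
            cb->leave(node, ctx);
    }
}

/* Example callback: count column definitions; ctx points to an int. */
void count_columns(const ToyNode *node, void *ctx)
{
    if (node->tag == TOY_COLUMN_DEF)
        ++*(int *) ctx;
}
```

Note that the walker still has to know every node type and its children, which is where most of the maintenance cost of a deparser lies anyway.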

Yet another option would be to only provide for a hook and some space in
the EventTriggerData structure for extensions to register themselves and
provide whatever deparsing they need. But then we need to figure out a
way for the user-defined function to use the resulting opaque data, from
any PL, if only to be able to call some extension's API to process it.
That looks like a very messy way to punt the work outside of core.

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
