Re: [RFC] nodeToString format and exporting the SQL parser

From: "Jehan-Guillaume (ioguix) de Rorthais" <ioguix(at)free(dot)fr>
To: pgsql-hackers(at)postgresql(dot)org
Cc: David Fetter <david(at)fetter(dot)org>, Markus Wanner <markus(at)bluegap(dot)ch>, Michael Tharp <gxti(at)partiallystapled(dot)com>
Subject: Re: [RFC] nodeToString format and exporting the SQL parser
Date: 2010-04-21 17:40:42
Message-ID: 4BCF389A.2090308@free.fr
Lists: pgsql-hackers

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 04/04/2010 18:10, David Fetter wrote:
> On Sat, Apr 03, 2010 at 03:17:30PM +0200, Markus Schiltknecht wrote:
>> Hi,
>>
>> Michael Tharp wrote:
>>> I have been spending a little time making the internal SQL parser
>>> available to clients via a C-language SQL function.
>>
>> This sounds very much like one of the Cluster Features:
>> http://wiki.postgresql.org/wiki/ClusterFeatures#API_into_the_Parser_.2F_Parser_as_an_independent_module
>>
>> Is this what you (or David) have in mind?
>
> I'm not a fan of statement-based replication of any description. The
> use cases I have in mind involve things like known-correct syntax
> highlighting in text editors.

The point here is not to expose the internal data structure, but to
deliver a tokenized version of the given SQL script.

There are actually many different use cases for external projects:
- syntax highlighting
- rewriting a query with proper indentation
- replication
- properly splitting queries from a script
- determining the type of a query (SELECT? UPDATE/DELETE? DDL?)
- checking the validity of a query before sending it
- ...

In addition to PgPool's needs, I can see three or four direct use cases
for pgAdmin and phpPgAdmin.

So it seems to me that having the parser code in a shared library would
be very useful for external C projects, which can link to it. However, it
would be useless for non-C projects, which can't use it directly but are
connected to a PostgreSQL backend anyway (phpPgAdmin, for instance).

What about having a new SQL command like TOKENIZE? It would act somewhat
like EXPLAIN, but return a tokenized version of the given SQL script.
Like EXPLAIN, it could speak XML, YAML, JSON, you name it...

Each token could have:
- a type ('identifier', 'string', 'sql command', 'sql keyword',
'variable'...)
- the start position in the string
- the value
- the line number
- ...

A simple example of a tokenizer is the PHP one:
http://fr.php.net/token_get_all

And here is a basic example, returning pseudo rows:

=> TOKENIZE $script$
SELECT 1;
UPDATE test SET "a"=2;
$script$;

    type     | pos |  value   | line
-------------+-----+----------+------
 SQL_COMMAND |   1 | 'SELECT' |    1
 CONSTANT    |   8 | '1'      |    1
 DELIMITER   |   9 | ';'      |    1
 SQL_COMMAND |  11 | 'UPDATE' |    2
 IDENTIFIER  |  18 | 'test'   |    2
 SQL_KEYWORD |  23 | 'SET'    |    2
 IDENTIFIER  |  27 | '"a"'    |    2
 OPERATOR    |  30 | '='      |    2
 CONSTANT    |  31 | '2'      |    2
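
To make the row format above more concrete, here is a rough client-side
sketch in Python. It is only an illustration of the (type, pos, value,
line) idea, not the proposed backend command and not PostgreSQL's real
lexer: the regular expression, the tiny COMMANDS and KEYWORDS sets and the
tokenize() function are all assumptions made up for this example, and it
ignores comments, dollar quoting, escaped quotes and much more.

```python
import re
from typing import Iterator, NamedTuple


class Token(NamedTuple):
    type: str
    pos: int     # 1-based character offset in the whole script
    value: str   # raw text of the token
    line: int    # 1-based line number


# Hypothetical, deliberately tiny word lists; the real grammar knows far more.
COMMANDS = {"SELECT", "INSERT", "UPDATE", "DELETE"}
KEYWORDS = {"SET", "FROM", "WHERE", "INTO", "VALUES"}

# One named group per token class; alternatives are tried left to right.
TOKEN_RE = re.compile(r"""
    (?P<SPACE>\s+)
  | (?P<CONSTANT>\d+|'[^']*')
  | (?P<IDENTIFIER>"[^"]*"|[A-Za-z_][A-Za-z_0-9]*)
  | (?P<DELIMITER>;)
  | (?P<OPERATOR>[-+*/<>=~!@%^&|]+)
""", re.VERBOSE)


def tokenize(script: str) -> Iterator[Token]:
    """Yield one Token per lexical element, skipping whitespace."""
    pos = 0
    while pos < len(script):
        m = TOKEN_RE.match(script, pos)
        if m is None:
            raise ValueError(
                f"unexpected character {script[pos]!r} at position {pos + 1}")
        kind, text = m.lastgroup, m.group()
        if kind != "SPACE":
            # Unquoted words may really be commands or keywords.
            if kind == "IDENTIFIER" and not text.startswith('"'):
                if text.upper() in COMMANDS:
                    kind = "SQL_COMMAND"
                elif text.upper() in KEYWORDS:
                    kind = "SQL_KEYWORD"
            line = script.count("\n", 0, m.start()) + 1
            yield Token(kind, m.start() + 1, text, line)
        pos = m.end()


if __name__ == "__main__":
    for tok in tokenize('SELECT 1;\nUPDATE test SET "a"=2;'):
        print(tok)
```

A client could then split a script into statements at each DELIMITER
token, or guess the statement type from the first SQL_COMMAND token,
which covers two of the use cases listed earlier. Doing this in the
backend instead would have the advantage of always matching the server's
own grammar.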

>
> Cheers,
> David.

As a phpPgAdmin dev, I have been thinking about this subject for a long
time. I would be interested in trying to write such a patch after
discussing it, if you think it is doable.

- --
JGuillaume (ioguix) de Rorthais
http://www.dalibo.com
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAkvPOJMACgkQxWGfaAgowiLrUACfa7qMVr3oiOVS7JfhTa1S9EqY
pYkAn3Sj6cezC/EdWPu2+kzrgjaDygGE
=oY1c
-----END PGP SIGNATURE-----
