Re: Hook for extensible parsing.

From: Julien Rouhaud <rjuju123(at)gmail(dot)com>
To: Simon Riggs <simon(dot)riggs(at)enterprisedb(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Jim Mlodgenski <jimmy76(at)gmail(dot)com>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Hook for extensible parsing.
Date: 2021-09-23 13:31:58
Message-ID: 20210923133158.trw2aibki5c3mivu@nol
Lists: pgsql-hackers

On Thu, Sep 23, 2021 at 07:37:27AM +0100, Simon Riggs wrote:
> On Thu, 16 Sept 2021 at 05:33, Julien Rouhaud <rjuju123(at)gmail(dot)com> wrote:
>
> > Would any of that be a reasonable approach?
>
> The way I summarize all of the above is that
> 1) nobody is fundamentally opposed to the idea
> 2) we just need to find real-world example(s) and show that any
> associated in-core patch provides all that is needed in a clean way,
> since that point is currently in-doubt by senior committers.
>
> So what is needed is some actual prototypes that explore this. I guess
> that means they have to be open source, but those examples could be
> under a different licence, as long as the in-core patch is clearly a
> project submission to PostgreSQL.
>
> I presume a few real-world examples could be:
> * Grammar extensions to support additional syntax for Greenplum, Citus, XL
> * A grammar that adds commands for an extension, such as pglogical
> (Jim's example)
> * A strict SQL Standard grammar/parser
> * GQL implementation

As I mentioned, there's at least one use case that would work with that
approach that I will be happy to code in hypopg, which is an open source
project. As a quick prototype, here's a basic overview of how I can use this
hook to implement a CREATE HYPOTHETICAL INDEX command:

rjuju=# LOAD 'hypopg';
LOAD
rjuju=# create hypothetical index meh on t1(id);
CREATE INDEX
rjuju=# explain select * from t1 where id = 1;
                                   QUERY PLAN
--------------------------------------------------------------------------------
 Index Scan using "<13543>btree_t1_id" on t1  (cost=0.04..8.05 rows=1 width=13)
   Index Cond: (id = 1)
(2 rows)

rjuju=# \d t1
                 Table "public.t1"
 Column |  Type   | Collation | Nullable | Default
--------+---------+-----------+----------+---------
 id     | integer |           |          |
 val    | text    |           |          |

My POC's grammar is limited to:

CREATE HYPOTHETICAL INDEX opt_index_name ON relation_expr '(' index_params ')'
    {
        IndexStmt *n = makeNode(IndexStmt);
        n->idxname = $4;
        n->relation = $6;
        n->accessMethod = DEFAULT_INDEX_TYPE;
        n->indexParams = $8;
        n->options = list_make1(makeDefElem("hypothetical", NULL, -1));
        $$ = (Node *) n;
    }

as I'm not willing to import the whole CREATE INDEX grammar for a patch that
may be rejected.  I can however publish this POC if that helps.  Note that
once my parser returns this parse tree, all I need to do is intercept any
IndexStmt containing this option in a utility hook (ProcessUtility_hook) and
run my own code rather than a plain DefineIndex(), which works as intended, as
shown above.
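
To illustrate the interception step, here is a minimal, self-contained sketch
of the routing decision only.  It does not use the PostgreSQL headers:
IndexStmtSketch, DefElemSketch, IndexRoute and route_index_stmt() are
hypothetical stand-ins for IndexStmt, DefElem and the check a utility hook
would perform, and the real hook would walk a List of DefElem nodes instead
of a plain array.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for PostgreSQL's DefElem: only the name is modelled. */
typedef struct DefElemSketch
{
    const char *defname;
} DefElemSketch;

/* Hypothetical stand-in for IndexStmt: only the fields this sketch needs. */
typedef struct IndexStmtSketch
{
    const char          *idxname;
    const DefElemSketch *options;   /* plain array instead of a List */
    size_t               noptions;
} IndexStmtSketch;

/* Where the statement should be sent (names are assumptions). */
typedef enum IndexRoute
{
    ROUTE_DEFINE_INDEX,     /* regular path, i.e. a plain DefineIndex() */
    ROUTE_HYPOTHETICAL      /* extension path, no real index is built */
} IndexRoute;

/* Return true if the statement carries the "hypothetical" option. */
static bool
has_hypothetical_option(const IndexStmtSketch *stmt)
{
    for (size_t i = 0; i < stmt->noptions; i++)
    {
        if (strcmp(stmt->options[i].defname, "hypothetical") == 0)
            return true;
    }
    return false;
}

/* Decide which code path handles the statement. */
static IndexRoute
route_index_stmt(const IndexStmtSketch *stmt)
{
    assert(stmt != NULL);
    return has_hypothetical_option(stmt) ? ROUTE_HYPOTHETICAL
                                         : ROUTE_DEFINE_INDEX;
}
```

In an actual extension this check would sit inside the installed utility hook,
which would fall through to the previous hook or the standard utility
processing for every statement that is not tagged this way.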

One could easily imagine similar usage to extend existing commands, like
implementing a new syntax on top of CREATE TABLE to implement an automatic
partition creation grammar (which would return multiple CreateStmt),
or even a partition manager.

I'm not an expert in other RDBMS syntax, but maybe you could use such a
hook to implement SQL Server or MySQL syntax, which at least use different
quoting rules.  Maybe the Amazon people could confirm that, as it looks like
they implemented a SQL Server parser using a similar hook?

So yes, you can't create new commands or implement grammars that require
additional semantic analysis with this hook, but I think that there are still
real use cases that can be implemented using only a different parser.
