Re: Hook for extensible parsing.

From: Julien Rouhaud <rjuju123(at)gmail(dot)com>
To: Jim Mlodgenski <jimmy76(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Hook for extensible parsing.
Date: 2021-07-07 09:25:56
Message-ID: CAOBaU_Y5XoVb+aQG2Ub3y5V4heEkTo6D3e8uMwqFEC-h5aZ+dw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Thanks for the review Jim!

On Wed, Jul 7, 2021 at 3:26 AM Jim Mlodgenski <jimmy76(at)gmail(dot)com> wrote:
>
> On Sat, Jun 12, 2021 at 4:29 AM Julien Rouhaud <rjuju123(at)gmail(dot)com> wrote:
>
> The patches all build properly and pass all regressions tests.

Note that the cfbot reports a compilation error on windows. That's on
the grammar extension part, so I'm really really interested in trying
to fix that for now, as it's mostly a quick POC to demonstrate how one
could implement a different grammar and validate that everything works
as expected.

Also, if this patch is eventually committed and having some code to
experience the hook is wanted it would probably be better to have a
very naive parser (based on a few strcmp() calls or something like
that) to validate the behavior rather than having a real parser.

> > pg_parse_query() will instruct plugins to parse a query at a time. They're
> > free to ignore that mode if they want to implement the 3rd mode. If so, they
> > should either return multiple RawStmt, a single RawStmt with a 0 or
> > strlen(query_string) stmt_len, or error out. Otherwise, they will implement
> > either mode 1 or 2, and they should always return a List containing a single
> > RawStmt with properly set stmt_len, even if the underlying statement is NULL.
> > This is required to properly skip valid strings that don't contain a
> > statements, and pg_parse_query() will skip RawStmt that don't contain an
> > underlying statement.
>
> Wouldn't we want to only loop through the individual statements if parser_hook
> exists? The current patch seems to go through the new code path regardless
> of the hook being grabbed.

I did think about it, but I eventually chose to write it this way.
Having a different code path for the no-hook situation won't make the
with-hook code any easier (it should only remove some check for the
hook in some places that have 2 or 3 other checks already). On the
other hand, having a single code path avoid some (minimal) code
duplication, and also ensure that the main loop is actively tested
even without the hook being set. That's not 100% coverage, but it's
better than nothing. Performance wise, it shouldn't make any
noticeable difference for the no-hook case.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Ronan Dunklau 2021-07-07 09:32:03 Re: [PATCH] Use optimized single-datum tuplesort in ExecSort
Previous Message Julien Rouhaud 2021-07-07 09:09:21 Re: track_planning causing performance regression