Re: General Parser

From: Ulrich Meis <kenobi(at)halifax(dot)rwth-aachen(dot)de>
To: Oliver Jowett <oliver(at)opencloud(dot)com>
Cc: pgsql-jdbc(at)postgresql(dot)org
Subject: Re: General Parser
Date: 2004-11-01 14:39:24
Message-ID: 200411011539.24508.kenobi@halifax.rwth-aachen.de
Lists: pgsql-jdbc

Hi Oliver!

On Monday 01 November 2004 12:12, you wrote:
> That said, this still needs some work as the various different callers
> need different parsing done. For example, we only care about splitting
> queries into multiple statements when using the V3 protocol, we only
> care about ? placeholders when parsing a query supplied via
> prepareStatement, the { call } syntax only makes sense for
> CallableStatements, and there's no need to disassemble the SELECT etc if
> the application never uses an updatable resultset.

I thought about these aspects, and in a first draft I split the
disassembly of select statements off into a separate method. In the end I merged
them again because I thought the overhead would be negligible.
I never considered splitting off the others because they really only involve
one char comparison per word (literals/identifiers being one word) or less.

All the handling you describe above is done in the addFraction method, which
is called roughly once per word. There's not a single loop, so we don't add to
the complexity.
Even if the overhead is not negligible within the scope of the parser itself, I would
think that it is negligible relative to the whole query execution, i.e. parsing,
data conversion, sending the query, the backend handling the query, receiving
the results, data conversion.
If you have an application that fires tons of queries that the backend needs
almost no time to execute, then those queries are likely to be very
short. In that case parsing is quite quick, too.

The cases in detail:

As said, all this happens once per word:

1. Splitting multiple statements.

if (first == SEMICOLON) {....}

That char comparison is all the overhead if it's a single query only.
Maybe it would even be nice to be able to throw an exception if multiple
queries are used in V2?

2. ? placeholders

if (first == QUESTIONMARK) {...}

As above.

3. call syntax

if (first == QUESTIONMARK) {...}

if (fraction.equalsIgnoreCase("call")) {...}

Those two comparisons are only done on the first word in a Java Escape and
otherwise never executed!

4. select statement disassembly

This is the only case I would consider splitting off, although I don't think
it's necessary.
The whole section is skipped completely if the following holds true, i.e. it's
not a select statement and/or we already know it doesn't refer to a single table:

if (querytype == NONSINGLE) break;

Otherwise, we have on average less than one string comparison per word.
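
To illustrate the kind of per-word check described in cases 1 and 2, here is a minimal sketch. It is not the code from the patch under discussion: the class name FractionSketch, the counters, and the body of addFraction are assumptions made up for illustration. Only the idea of comparing the first character of each word against SEMICOLON and QUESTIONMARK comes from the text above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; names and structure are assumptions, not the real parser.
class FractionSketch {
    static final char SEMICOLON = ';';
    static final char QUESTIONMARK = '?';

    int separators = 0;     // statement separators seen so far
    int placeholders = 0;   // ? placeholders seen so far
    final List<String> fractions = new ArrayList<>();

    // Called once per word; a quoted literal or identifier arrives as one word,
    // so a ';' or '?' inside it is never looked at here.
    void addFraction(String fraction) {
        if (fraction.isEmpty()) {
            return;
        }
        char first = fraction.charAt(0);
        if (first == SEMICOLON) {
            separators++;
        } else if (first == QUESTIONMARK) {
            placeholders++;
        }
        fractions.add(fraction);
    }
}
```

For a single statement without placeholders, each call costs at most two char comparisons plus the list append, with no inner loop over the word.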

> I'm not sure how easy it will be to have a single flexible parser and
> still have low overhead for the cases where we don't want to completely
> disassemble the query.

Most of the overhead here comes from determining the tableName for an
updatable ResultSet. I can easily move that handling into a separate method.
In that case, I would make the parser keep the fractions ArrayList, so that
when the table name is actually needed in a ResultSet, the parsing can be
done very quickly. Tell me if you prefer it that way and I'll do it.
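
A minimal sketch of that deferred approach, under stated assumptions: the class LazyTableName and its getTableName method are hypothetical, and the lookup is deliberately simplistic (it takes the word after the first FROM and ignores joins, subqueries, and schema qualification). It only shows how retaining the word list lets the table name be resolved on first use rather than during the initial parse.

```java
import java.util.List;

// Hypothetical sketch; not the real driver code.
class LazyTableName {
    private final List<String> fractions; // words kept from the initial parse
    private String tableName;             // stays null if no table was found
    private boolean resolved;             // true once the scan has run

    LazyTableName(List<String> fractions) {
        this.fractions = fractions;
    }

    // Scans the retained words only on the first call; later calls reuse the result.
    String getTableName() {
        if (!resolved) {
            resolved = true;
            for (int i = 0; i + 1 < fractions.size(); i++) {
                if (fractions.get(i).equalsIgnoreCase("from")) {
                    tableName = fractions.get(i + 1);
                    break;
                }
            }
        }
        return tableName;
    }
}
```

An application that never asks for an updatable ResultSet would never call getTableName, so it would pay only for keeping the list around.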

I would really recommend keeping the other handling where it is; I don't think
it is necessary to complicate things here. But tell me if I'm wrong.

Regards,

Uli
