Re: benchmarking Flex practices

From: John Naylor <john(dot)naylor(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: benchmarking Flex practices
Date: 2019-07-05 10:54:16
Message-ID: CACPNZCt8eyDFM1nst6XPXFGUs5fD+GT-ykmWwVUZh1n1ixu+Aw@mail.gmail.com
Lists: pgsql-hackers

On Wed, Jul 3, 2019 at 5:35 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> As far as I can see, the point of 0002 is to have just one set of
> flex rules for the various variants of quotecontinue processing.
> That sounds OK, though I'm a bit surprised it makes this much difference
> in the table size. I would suggest that "state_before" needs a less
> generic name (maybe "state_before_xqs"?) and more than no comment.
> Possibly more to the point, it's not okay to have static state variables
> in the core scanner, so that variable needs to be kept in yyextra.

v4-0001 is basically the same as v3-0002, with the state variable in
yyextra. Since follow-on patches use it as well, I've named it
state_before_quote_stop. I failed to come up with a nicer short name.
With this applied, the transition table is reduced from 37045 to
30367. Since that's uncomfortably close to the 32k limit for 16-bit
members, I hacked away further at UESCAPE bloat.
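
To illustrate the idea of a single shared quote-stop state, here is a toy
model in Python. It is not Flex and not code from the patch: the state
names, the ScannerExtra class (standing in for yyextra), and the simplified
continuation regex are all assumptions made for the sketch. It only shows
why remembering the previous state lets one set of rules serve every
string kind.

```python
import re
from dataclasses import dataclass, field

# Hypothetical state names, loosely modeled on Flex start conditions.
INITIAL, XQ, XE, XQS = "INITIAL", "xq", "xe", "xqs"

# Simplified quote continuation: optional horizontal space, a newline,
# any further whitespace, then an opening quote. (Assumption: comments
# between the pieces are ignored in this toy.)
QUOTE_CONTINUE = re.compile(r"[ \t\f]*[\r\n]\s*'")


@dataclass
class ScannerExtra:
    """Per-scanner state, standing in for yyextra (no static variables)."""
    state: str = INITIAL
    state_before_quote_stop: str = INITIAL
    literal: list = field(default_factory=list)


def scan(text):
    """Return (string_kind, value) pairs for the string literals in text."""
    extra = ScannerExtra()
    tokens = []

    def finish_literal():
        tokens.append((extra.state_before_quote_stop, "".join(extra.literal)))
        extra.literal.clear()
        extra.state = INITIAL

    i = 0
    while i < len(text):
        if extra.state == INITIAL:
            if text.startswith(("E'", "e'"), i):
                extra.state, i = XE, i + 2
            elif text[i] == "'":
                extra.state, i = XQ, i + 1
            else:
                i += 1  # everything outside strings is skipped
        elif extra.state in (XQ, XE):
            if text.startswith("''", i):
                extra.literal.append("'")  # doubled quote is a literal quote
                i += 2
            elif text[i] == "'":
                # One shared rule for every string kind: remember where we
                # came from, then let the quote-stop state decide.
                extra.state_before_quote_stop = extra.state
                extra.state = XQS
                i += 1
            else:
                extra.literal.append(text[i])
                i += 1
        else:  # XQS: just past a closing quote
            m = QUOTE_CONTINUE.match(text, i)
            if m:
                extra.state = extra.state_before_quote_stop  # resume string
                i = m.end()
            else:
                finish_literal()  # consume nothing, similar to yyless(0)
    if extra.state == XQS:
        finish_literal()  # input ended right after a closing quote
    elif extra.state != INITIAL:
        raise ValueError("unterminated quoted string")
    return tokens
```

In this model, scan("'foo'\n  'bar'") yields the single literal
("xq", "foobar"), while scan("E'a' 'b'") yields two literals because the
whitespace between them contains no newline. Keeping the saved state on
the per-scanner object rather than in a module-level variable mirrors the
requirement that the core scanner hold no static state.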

0002 unifies xusend and xuiend by saving the state of xui as well.
This actually causes a performance regression, but it's more of a
refactoring patch to avoid having to create two additional
start conditions in 0003 (of course it could be done that way if
desired, but the savings won't be as great). In any case, the table is
now down to 26074.

0003 creates a separate start condition so that UESCAPE and the
expected quoted character after it are detected in separate states.
This allows us to use standard whitespace skipping techniques and also
to greatly simplify the uescapefail rule. The final size of the table
is 23696. Removing UESCAPE entirely results in 21860, so this is likely
the most compact size of this feature.
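
The table sizes quoted in this message can be put side by side with a few
lines of Python. This is only arithmetic on the numbers above; the one
assumption is that the "32k limit" means the largest signed 16-bit value,
32767.

```python
# Assumption: the "32k limit for 16-bit members" is taken to be INT16_MAX.
INT16_MAX = 2**15 - 1

# Transition table sizes as reported in the message.
SIZES = {
    "before v4-0001": 37045,
    "v4-0001": 30367,
    "v4-0002": 26074,
    "v4-0003": 23696,
    "no UESCAPE": 21860,
}


def headroom(size, limit=INT16_MAX):
    """Entries left before the table would exceed the assumed limit."""
    return limit - size


def saved(before, after):
    """Entries removed going from one table size to another."""
    return SIZES[before] - SIZES[after]


if __name__ == "__main__":
    for label, size in SIZES.items():
        print(f"{label:>15}: {size:6d}  headroom {headroom(size):6d}")
    print("remaining cost of UESCAPE:", saved("v4-0003", "no UESCAPE"))
```

Under that assumption, v4-0001 alone leaves 2400 entries of headroom, and
the remaining cost of supporting UESCAPE after v4-0003 is 1836 entries.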

Performance is very similar to HEAD. Parsing the information schema
might be a hair faster and pgbench-like queries with simple strings a
hair slower, but the difference seems within the noise of variation.
Parsing strings with UESCAPE likewise seems about the same.

--
John Naylor https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment                                                       Content-Type              Size
v4-0001-Replace-the-Flex-quotestop-rules-with-a-new-exclu.patch  application/octet-stream  6.2 KB
v4-0002-Unify-xuiend-and-xusend-into-a-single-start-condi.patch  application/octet-stream  5.7 KB
v4-0003-Use-separate-start-conditions-for-both-UESCAPE-an.patch  application/octet-stream  3.5 KB
