Re: benchmarking Flex practices

From: John Naylor <john(dot)naylor(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: benchmarking Flex practices
Date: 2020-01-13 10:46:01
Message-ID: CACPNZCvsq4qaXAF3Tc1ZvQrR2zDi4q6gLZJeZ1AyOXSkD6CdSA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Jan 13, 2020 at 7:57 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> Hmm ... after a bit of research I agree that these functions are not
> a portability hazard. They are present at least as far back as flex
> 2.5.33 which is as old as we've got in the buildfarm.
>
> However, I'm less excited about them from a performance standpoint.
> The BEGIN() macro expands to (ordinarily)
>
> yyg->yy_start = integer-constant
>
> which is surely pretty cheap. However, yy_push_state is substantially
> more expensive than that, not least because the first invocation in
> a parse cycle will involve a malloc() or palloc(). Likewise yy_pop_state
> is multiple times more expensive than plain BEGIN().
>
> Now, I agree that this is negligible for ECPG's usage, so if
> pushing/popping state is helpful there, let's go for it. But I am
> not convinced it's negligible for the backend, and I also don't
> see that we actually need to track any nested scanner states there.
> So I'd rather stick to using BEGIN in the backend. Not sure about
> psql.

Okay, removed in v11. The advantage of stack functions in ECPG was to
avoid having the two variables state_before_str_start and
state_before_str_stop. But if we don't use stack functions in the
backend, then consistency wins in my mind. Plus, it was easier for me
to revert the stack functions for all 3 scanners.

> BTW, while looking through the latest patch it struck me that
> "UCONST" is an underspecified and potentially confusing name.
> It doesn't indicate what kind of constant we're talking about,
> for instance a C programmer could be forgiven for thinking
> it means something like "123U". What do you think of "USCONST",
> following UIDENT's lead of prefixing U onto whatever the
> underlying token type is?

Makes perfect sense. Grepping through the source tree, indeed it seems
the replication command scanner is using UCONST for digits.

Some other cosmetic adjustments in ECPG parser.c:
-Previously I had a WIP comment in about 2 functions that are copies
from elsewhere. In v11 I just noted that they are copied.
-I thought it'd be nicer if ECPG spelled UESCAPE in caps when
reconstructing the string.
-Corrected copy-paste-o in comment

Also:
-reverted some spurious whitespace changes
-revised scan.l comment about the performance benefits of no backtracking
-split the ECPG C-comment scanning cleanup into a separate patch, as I
did for v6. I include it here since it's related (merging scanner
states), but not relevant to making the core scanner smaller.
-wrote draft commit messages

--
John Naylor https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment Content-Type Size
v11-0001-Reduce-size-of-backend-scanner-transition-array.patch application/octet-stream 60.5 KB
v11-0002-Merge-ECPG-scanner-states-regarding-C-comments.patch application/octet-stream 3.4 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Eisentraut 2020-01-13 10:57:53 Re: our checks for read-only queries are not great
Previous Message Paul Guo 2020-01-13 10:27:16 Re: standby recovery fails (tablespace related) (tentative patch and discussion)