Re: build remaining Flex files standalone

From: Andres Freund <andres(at)anarazel(dot)de>
To: John Naylor <john(dot)naylor(at)enterprisedb(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: build remaining Flex files standalone
Date: 2022-08-15 18:11:46
Message-ID: 20220815181146.gf3qittblwf3tfcs@awork3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

Thanks for your work on this!

On 2022-08-13 15:39:06 +0700, John Naylor wrote:
> Here are the rest. Most of it was pretty straightforward, with the
> main exception of jsonpath_scan.c, which is not quite finished. That
> one passes tests but still has one compiler warning. I'm unsure how
> much of what is there already is really necessary or was cargo-culted
> from elsewhere without explanation. For starters, I'm not sure why the
> grammar has a forward declaration of "union YYSTYPE". It's noteworthy
> that it used to compile standalone, but with a bit more stuff, and
> that was reverted in 550b9d26f80fa30. I can hack on it some more later
> but I ran out of steam today.

I'm not sure either...

> Other questions thus far:
>
> - "BISONFLAGS += -d" is now in every make file with a .y file -- can
> we just force that everywhere?

Hm. Not sure it's worth it, extensions might use our BISON stuff...

> - Include order seems to matter for the grammar's .h file. I didn't
> test if that was the case every time, and after a few miscompiles just
> always made it the last inclusion, but I'm wondering if we should keep
> those inclusions outside %top{} and put it at the start of the next
> %{} ?

I think we have a few of those dependencies already, see e.g.
/*
* NB: include gram.h only AFTER including scanner.h, because scanner.h
* is what #defines YYLTYPE.
*/

> From d723ba14acf56fd432e9e263db937fcc13fc0355 Mon Sep 17 00:00:00 2001
> From: John Naylor <john(dot)naylor(at)postgresql(dot)org>
> Date: Thu, 11 Aug 2022 19:38:37 +0700
> Subject: [PATCH v201 1/9] Build guc-file.c standalone

Might be worth doing some of the moving around here separately from the
parser/scanner specific bits.

> +/* functions shared between guc.c and guc-file.l */
> +extern int guc_name_compare(const char *namea, const char *nameb);
> +extern ConfigVariable *ProcessConfigFileInternal(GucContext context,
> + bool applySettings, int elevel);
> +extern void record_config_file_error(const char *errmsg,
> + const char *config_file,
> + int lineno,
> + ConfigVariable **head_p,
> + ConfigVariable **tail_p);
>
> /*
> * The following functions are not in guc.c, but are declared here to avoid
> --
> 2.36.1
>

I think I prefer your suggestion of a guc_internal.h upthread.

> From 7d4ecfcb3e91f3b45e94b9e64c7c40f1bbd22aa8 Mon Sep 17 00:00:00 2001
> From: John Naylor <john(dot)naylor(at)postgresql(dot)org>
> Date: Fri, 12 Aug 2022 15:45:24 +0700
> Subject: [PATCH v201 2/9] Build booscanner.c standalone

> -# bootscanner is compiled as part of bootparse
> -bootparse.o: bootscanner.c
> +# See notes in src/backend/parser/Makefile about the following two rules
> +bootparse.h: bootparse.c
> + touch $@
> +
> +bootparse.c: BISONFLAGS += -d
> +
> +# Force these dependencies to be known even without dependency info built:
> +bootparse.o bootscan.o: bootparse.h

Wonder if we could / should wrap this is something common. It's somewhat
annoying to repeat this stuff everywhere.

> diff --git a/src/test/isolation/specscanner.l b/src/test/isolation/specscanner.l
> index aa6e89268e..2dc292c21d 100644
> --- a/src/test/isolation/specscanner.l
> +++ b/src/test/isolation/specscanner.l
> @@ -1,4 +1,4 @@
> -%{
> +%top{
> /*-------------------------------------------------------------------------
> *
> * specscanner.l
> @@ -9,7 +9,14 @@
> *
> *-------------------------------------------------------------------------
> */
> +#include "postgres_fe.h"

Miniscule nitpick: I think we typically leave an empty line between header and
first include.

> diff --git a/contrib/cube/cubedata.h b/contrib/cube/cubedata.h
> index dbe7d4f742..0b373048b5 100644
> --- a/contrib/cube/cubedata.h
> +++ b/contrib/cube/cubedata.h
> @@ -67,3 +67,7 @@ extern void cube_scanner_finish(void);
>
> /* in cubeparse.y */
> extern int cube_yyparse(NDBOX **result);
> +
> +/* All grammar constructs return strings */
> +#define YYSTYPE char *

Why does this need to be defined in a semi-public header? If we do this in
multiple files we'll end up with the danger of macro redefinition warnings.

> +extern int scanbuflen;

The code around scanbuflen seems pretty darn grotty. Allocating enough memory
for the entire list by allocating the entire string size... I don't know
anything about contrib/cube, but isn't that in effect O(inputlen^2) memory?

Greetings,

Andres Freund

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Ranier Vilela 2022-08-15 18:34:31 Re: [PATCH] Optimize json_lex_string by batching character copying
Previous Message Melih Mutlu 2022-08-15 17:03:36 Re: Allow logical replication to copy tables in binary format