Re: failing to build preproc.c on solaris with sun studio

From: John Naylor <john(dot)naylor(at)enterprisedb(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Noah Misch <noah(at)leadboat(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: failing to build preproc.c on solaris with sun studio
Date: 2022-08-07 07:47:36
Message-ID: CAFBsxsEwr13sCw_uig1YbxQ8RhYe9J9_ZJ4orGqfcS-=pyxRvQ@mail.gmail.com
Lists: pgsql-hackers

On Sun, Aug 7, 2022 at 7:05 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Even on a modern Linux:
>
> $ size src/backend/parser/gram.o
>     text    data     bss     dec     hex filename
>   656568       0       0  656568   a04b8 src/backend/parser/gram.o
> $ size src/interfaces/ecpg/preproc/preproc.o
>     text    data     bss     dec     hex filename
>   912005     188    7348  919541   e07f5 src/interfaces/ecpg/preproc/preproc.o
>
> So there's something pretty bloated there. It doesn't seem like
> ecpg's additional productions should justify a nigh 50% code
> size increase.

Comparing gram.o with preproc.o:

$ objdump -t src/backend/parser/gram.o | grep yy | grep -v UND |
    awk '{print $5, $6}' | sort -r | head -n3
000000000003a24a yytable
000000000003a24a yycheck
0000000000013672 base_yyparse

$ objdump -t src/interfaces/ecpg/preproc/preproc.o | grep yy | grep -v UND |
    awk '{print $5, $6}' | sort -r | head -n3
000000000004d8e2 yytable
000000000004d8e2 yycheck
000000000002841e base_yyparse
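
For anyone who would rather script this comparison, the pipeline above can
be approximated in Python. This is only a sketch: the function name
largest_yy_symbols is made up, it assumes the six-column `objdump -t` layout
that the awk expression relies on, and the sample addresses, the yyr2 size,
and the base_yylex line are invented for illustration. Only the three
largest sizes are taken from the gram.o output above.

```python
def largest_yy_symbols(objdump_text, count=3):
    """Return the `count` largest (size, name) pairs for symbols containing 'yy'.

    Mirrors: grep yy | grep -v UND | awk '{print $5, $6}' | sort -r | head
    except that sizes are compared as integers rather than as text.
    """
    symbols = []
    for line in objdump_text.splitlines():
        # Same filters as the two grep stages.
        if "yy" not in line or "UND" in line:
            continue
        fields = line.split()
        if len(fields) < 6:
            # Not the six-column layout this sketch assumes; skip it.
            continue
        size_hex, name = fields[4], fields[5]
        symbols.append((int(size_hex, 16), name))
    symbols.sort(reverse=True)
    return symbols[:count]


# Hypothetical input: addresses, yyr2's size and the UND line are invented.
SAMPLE = """\
0000000000000000 l     O .rodata 000000000003a24a yytable
0000000000000000 l     O .rodata 000000000003a24a yycheck
0000000000000000 l     O .rodata 0000000000000100 yyr2
0000000000000000 g     F .text 0000000000013672 base_yyparse
0000000000000000         *UND* 0000000000000000 base_yylex
"""

if __name__ == "__main__":
    for size, name in largest_yy_symbols(SAMPLE):
        print(f"{size:016x} {name}")
```

Sorting on integers instead of strings avoids depending on objdump padding
every size to the same width.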

The largest lookup tables are about a third bigger (other tables are
trivial in comparison), and the function base_yyparse is about double
the size, most of which is a giant switch statement with 2510 cases in
gram.o and 3912 in preproc.o. That difference does seem excessive. I've
long wondered if it would be possible / feasible to have a stricter
separation between C, ECPG commands, and SQL. That sounds like a huge
amount of work, though.
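
To make the size comparison concrete, here is a small sketch that converts
the hex sizes from the two objdump listings above and prints the growth
factors. The dictionary and function names are just for illustration; the
numbers are copied from this message.

```python
# Symbol sizes in bytes, copied from the objdump output quoted above.
GRAM = {"yytable": 0x3A24A, "yycheck": 0x3A24A, "base_yyparse": 0x13672}
PREPROC = {"yytable": 0x4D8E2, "yycheck": 0x4D8E2, "base_yyparse": 0x2841E}


def growth(symbol):
    """Size of `symbol` in preproc.o divided by its size in gram.o."""
    return PREPROC[symbol] / GRAM[symbol]


# Number of switch cases in base_yyparse: preproc.o relative to gram.o.
case_ratio = 3912 / 2510

if __name__ == "__main__":
    for symbol in GRAM:
        print(f"{symbol:13s} {GRAM[symbol]:7d} -> {PREPROC[symbol]:7d}"
              f"  x{growth(symbol):.2f}")
    print(f"switch cases  x{case_ratio:.2f}")
```

The tables come out at roughly 1.33x (equivalently, gram.o's tables are
about 25% smaller than preproc.o's), base_yyparse at roughly 2.07x, and the
case count at roughly 1.56x, so the function grows faster than the number
of cases alone would suggest.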

Playing around with the compiler flags on preproc.c, I get these
compile times, gcc memory usage as reported by /usr/bin/time -v, and
symbol sizes (non-debug build):

-O2:
time 8.0s
Maximum resident set size (kbytes): 255884

-O1:
time 6.3s
Maximum resident set size (kbytes): 170636
000000000004d8e2 yytable
000000000004d8e2 yycheck
00000000000292de base_yyparse

-O0:
time 2.9s
Maximum resident set size (kbytes): 153148
000000000004d8e2 yytable
000000000004d8e2 yycheck
000000000003585e base_yyparse

Note that -O0 bloats the binary, probably because it's not using a jump
table anymore. -O1 might be worth it just to reduce build times for
slower buildfarm animals, even if Noah reported this didn't help the
issue upthread. I suspect it wouldn't slow down production use much
since ecpg's output needs to be compiled anyway.
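
As a rough way to weigh the trade-off, the sketch below computes how much
compile time and peak memory each flag saves relative to -O2, using only
the measurements listed above. The names RUNS and saving are made up for
this example, and these are single runs on one machine, so treat the
percentages as indicative only.

```python
# (wall-clock seconds, maximum resident set size in kB) per optimization
# level, copied from the measurements in this message.
RUNS = {
    "-O2": (8.0, 255884),
    "-O1": (6.3, 170636),
    "-O0": (2.9, 153148),
}


def saving(flag, baseline="-O2"):
    """Fractional (time, memory) reduction of `flag` relative to `baseline`."""
    seconds, rss = RUNS[flag]
    base_seconds, base_rss = RUNS[baseline]
    return 1 - seconds / base_seconds, 1 - rss / base_rss


if __name__ == "__main__":
    for flag in ("-O1", "-O0"):
        time_saved, mem_saved = saving(flag)
        print(f"{flag}: {time_saved:.0%} less time, {mem_saved:.0%} less memory")
```

By this measure -O1 saves roughly 21% of the compile time and 33% of the
peak memory compared with -O2, while -O0 saves roughly 64% and 40%.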

--
John Naylor
EDB: http://www.enterprisedb.com
