Path: readme1.op.net!op.net!cezanne.op.net!op.net!newsfeed.direct.ca!news-peer.sprintlink.net!news-backup-west.sprintlink.net!news.sprintlink.net!208.31.42.33!iecc.com!iecc.com!not-for-mail From: Mark Sanvitale Newsgroups: comp.compilers Subject: why not inline all functions? Date: 9 Jun 1998 12:03:34 -0400 Organization: Teradyne, Incorporated Lines: 57 Sender: johnl@iecc.com Approved: compilers@ivan.iecc.com Message-ID: <98-06-032@comp.compilers> NNTP-Posting-Host: ivan.iecc.com Keywords: performance Xref: readme1.op.net comp.compilers:5583 Functions are great for making written code (C, C++, etc.) mode readable and structured, however, they do not seem to make much sense when you get down to the raw machine code which actually is executed by a processor. As far as my understanding of the matter goes, the most basic way to slow down a processor is to make it execute an instruction besides the one immediately following the current instruction, thus, why not make a compiler which turns every function into an inline function? This would save you the overhead inherent in a traditional function call (push everything defining the current state of the processor on the stack, make fresh copies of the parameters for the function, and, afterwards, pop things off the stack to return the processor to the pre-function state, not to mention losing the chance to take advantage of any instruction prefetching the processor might do). The output of such a compiler would be larger binary files (since every call to a function would expand to the entire function body) however the execution time for such a program should be improved (relative to a non-inlining compiler) by a factor proportional to the number of function calls in the program. Now, a "inline everything" scheme might run into some roadblocks when it comes to external functions which are resolved at link time and the notion of dynamic linking is not compatible with such a method. Still, I think compilers should try to inline every function it can without depending on the programmer to specify a function as "inline" (C++). Perhaps compilers already take advantage of the idea I have outlined or perhaps there are some problems with the idea which I don't know about (an old C++ book I have says, "Compiler limits prevent complicated functions from being inlined," but no further explanation is given. What do you all think? NOTE - During my quest for a CS degree I did not have the chance to take a compilers course (it was a tech elective which never fit in my schedule) so if my assertion is totally pointless please just point out the reason I am wrong and spare me the "Are you stupid/crazy/lost" comments. In the area of compilers I admit to being fairly ignorant (as far as professional programmers go). Mark S "computer geek and movie freak" [Well, if you ignore recursive and mutually recursive functions which can't be in-lined, the main problem is code bloat. Remember that since computers have fairly small caches and finite main memory, big code can run slower due to cache reloads and page faults, even though you aviod relatively slow procedure calls and returns. Aggressive optimizing compilers (usually for Fortran) do indeed do a lot of in-lining already, though. -John] -- Send compilers articles to compilers@iecc.com, meta-mail to compilers-request@iecc.com. Archives at http://www.iecc.com/compilers Path: readme1.op.net!op.net!cezanne.op.net!op.net!nntp-out.monmouth.com!newspeer.monmouth.com!news-peer-east.sprintlink.net!news-peer.sprintlink.net!news-backup-west.sprintlink.net!news.sprintlink.net!208.31.42.33!iecc.com!iecc.com!not-for-mail From: Joerg Schoen Newsgroups: comp.compilers Subject: Re: why not inline all functions? Date: 11 Jun 1998 16:11:47 -0400 Organization: University of Heidelberg, Germany Lines: 64 Sender: johnl@iecc.com Approved: compilers@ivan.iecc.com Message-ID: <98-06-050@comp.compilers> References: <98-06-032@comp.compilers> NNTP-Posting-Host: ivan.iecc.com Keywords: performance, practice Xref: readme1.op.net comp.compilers:5604 Mark Sanvitale wrote: : Functions are great for making written code (C, C++, etc.) mode : readable and structured, however, they do not seem to make much sense : when you get down to the raw machine code which actually is executed : by a processor. : As far as my understanding of the matter goes, the most basic way to : slow down a processor is to make it execute an instruction besides the : one immediately following the current instruction, thus, why not make : a compiler which turns every function into an inline function? This : would save you the overhead inherent in a traditional function call : (push everything defining the current state of the processor on the In my experience "pushing the current state on the stack" refers only to variables that reside in processor registers. If the function is small, it will be inlined (at sufficiently high optimization level) and no pushs are necessary. If the function is big and a function call is done, it is more likely that some time is spent in the function and the processors registers are used therein for performance. Thus it is reasonable to "free" them for reusage in the function by pushing away them. : stack, make fresh copies of the parameters for the function, and, : afterwards, pop things off the stack to return the processor to the : pre-function state, not to mention losing the chance to take advantage : of any instruction prefetching the processor might do). : The output of such a compiler would be larger binary files (since : every call to a function would expand to the entire function body) : however the execution time for such a program should be improved : (relative to a non-inlining compiler) by a factor proportional to the : number of function calls in the program. No, that's not true. Consider a long function or a short one with a loop that is executed a couple of times. You then can neglect the cost of calling the function versus the time spent in the function itself. : Now, a "inline everything" scheme might run into some roadblocks when : it comes to external functions which are resolved at link time and the : notion of dynamic linking is not compatible with such a method. I know that some compilers have an "ucode" format that is different to the usual object file format (which is used in libraries). As far as my understanding goes, compilers can do much more with the ucode format in the linking stage, I think they can also do inlining. : Still, I think compilers should try to inline every function it can : without depending on the programmer to specify a function as "inline" : (C++). As our moderator pointed out, you have to consider the cost of loading new instructions into the cache. If the function is a separate code, it will be in the cache after the first call and probably stay there. That improves performance compared to the case of inlined functions that consist of separate code blocks that have all to be loaded into the cache. Joerg Schoen E-mail: Joerg.Schoen AT tc DOT pci DOT uni-heidelberg DOT de Web-Page: http://www.pci.uni-heidelberg.de/tc/usr/joerg -- Send compilers articles to compilers@iecc.com, meta-mail to compilers-request@iecc.com. Archives at http://www.iecc.com/compilers Path: readme1.op.net!op.net!cezanne.op.net!op.net!nntp-out.monmouth.com!newspeer.monmouth.com!news-peer-east.sprintlink.net!news-peer.sprintlink.net!news-backup-west.sprintlink.net!news.sprintlink.net!208.31.42.33!iecc.com!iecc.com!not-for-mail From: Ben Elliston Newsgroups: comp.compilers Subject: Re: why not inline all functions? Date: 11 Jun 1998 16:13:27 -0400 Organization: Cygnus Solutions Lines: 52 Sender: johnl@iecc.com Approved: compilers@ivan.iecc.com Message-ID: <98-06-052@comp.compilers> References: <98-06-032@comp.compilers> NNTP-Posting-Host: ivan.iecc.com Keywords: optimize, practice Xref: readme1.op.net comp.compilers:5602 Mark Sanvitale writes: > Functions are great for making written code (C, C++, etc.) mode > readable and structured, however, they do not seem to make much sense > when you get down to the raw machine code which actually is executed > by a processor. There is still one purpose for retaining this structure in the executable. I'll get to this below. > The output of such a compiler would be larger binary files (since > every call to a function would expand to the entire function body) > however the execution time for such a program should be improved > (relative to a non-inlining compiler) by a factor proportional to the > number of function calls in the program. I remember reading somewhere that, in fact, by inlining functions like this, the potential for optimisations is greater. My intuition can see why. But code bloat would still be significant. > Perhaps compilers already take advantage of the idea I have outlined or > perhaps there are some problems with the idea which I don't know about > (an old C++ book I have says, "Compiler limits prevent complicated > functions from being inlined," but no further explanation is given. In my opinion, the major drawback to inline functions is that they are much harder to debug. Suppose you have code which frequently calls: my_sqrt(..) And you suspect a bug in your square root function. If my_sqrt() were inlined at every point that it was used, it would be impossible to set a breakpoint at the start of my_sqrt() and to examine an invocation of this function. It becomes a real headache. Furthermore, it helps a great deal to know the calling sequence at runtime in case, say, an assertion fails and you need to know how your function was called and what the arguments were. --- Ben Elliston bje@cygnus.com [That's not a very persuasive argument. There's no reason the debugger can't know all the places the inline function was called and put breakpoints on all of them. It has to do something similar in compilers that unwind loops already. -John] -- Send compilers articles to compilers@iecc.com, meta-mail to compilers-request@iecc.com. Archives at http://www.iecc.com/compilers Path: readme1.op.net!op.net!cezanne.op.net!op.net!nntp-out.monmouth.com!newspeer.monmouth.com!news-peer-east.sprintlink.net!news-peer.sprintlink.net!news-backup-west.sprintlink.net!news.sprintlink.net!208.31.42.33!iecc.com!iecc.com!not-for-mail From: Andy Ayers Newsgroups: comp.compilers Subject: Re: why not inline all functions? Date: 11 Jun 1998 16:15:13 -0400 Organization: MIT Laboratory for Computer Science Lines: 59 Sender: johnl@iecc.com Approved: compilers@ivan.iecc.com Message-ID: <98-06-055@comp.compilers> References: <98-06-032@comp.compilers> NNTP-Posting-Host: ivan.iecc.com Keywords: optimize, performance Xref: readme1.op.net comp.compilers:5603 Mark Sanvitale writes: > ... why not make > a compiler which turns every function into an inline function? This > would save you the overhead inherent in a traditional function call > (push everything defining the current state of the processor on the > stack, make fresh copies of the parameters for the function, and, > afterwards, pop things off the stack to return the processor to the > pre-function state, not to mention losing the chance to take advantage > of any instruction prefetching the processor might do). > ... > Perhaps compilers already take advantage of the idea I have outlined or > perhaps there are some problems with the idea which I don't know about > (an old C++ book I have says, "Compiler limits prevent complicated > functions from being inlined," but no further explanation is given. > > What do you all think? While working on high-level optimization at HP, some colleagues and I essentially pursued a similar idea. Our goal was not so much to remove call overhead as it was to open up larger expanses of code to aggressive intraprocedural optimization. Some of the results are summarized in a paper "Aggressive Inlining" which we presented at PLDI last year. HP-UX compilers from 9.x onwards are capable of aggressive, profile-driven, cross-module inlining. Interprocedural optimization was (and is) often hampered by the complexity and scale of the supporting interprocedural analyses and whole-program information is often required to get good results. Once you've done the analysis you often end up duplicating callee code to take advantage of some specialized calling context. Inlining does this duplication and specialization for you without requiring much heavyweight analysis. We saw very good results on most codes -- benchmarks, certainly, but also many user and production codes and some parts of the HPUX kernel. Fortran was actually trickier than C or C++ because the formal parameter aliasing properties are difficult to describe accurately once a subroutine is inlined. The only program I remember us actually "flattening" completely was the spec benchmark fpppp. On the flip side, getting good results with inlining requires profiling, but this is quickly becoming a must for any aggressive optimizing compiler. The code placement tool (ala Pettis & Hanson) needs to be inlining-aware. Code growth is not that big of a problem in many codes. Many very large codes have relatively small dynamic hot spots. Database codes are a notable exception. Another big downside is that aggressive cross-module inlining builds in cross-module dependences, so you lose the quick build-time benefits of separate compilation. Debugging runtime failures in heavily inlined code is a huge challenge. And finally, many user codes are not interprocedurally clean, making it hard to inline at some sites and preserve whatever twisted semantics were implied by the original call. -- Andy Ayers -- Send compilers articles to compilers@iecc.com, meta-mail to compilers-request@iecc.com. Archives at http://www.iecc.com/compilers Path: readme1.op.net!op.net!cezanne.op.net!op.net!newsfeed.direct.ca!news-peer.sprintlink.net!news-backup-east.sprintlink.net!news.sprintlink.net!208.31.42.33!iecc.com!iecc.com!not-for-mail From: mcdirmid@beaver.cs.washington.edu (Sean McDirmid) Newsgroups: comp.compilers Subject: Re: why not inline all functions? Date: 11 Jun 1998 16:59:26 -0400 Organization: Computer Science & Engineering, U of Washington, Seattle Lines: 46 Sender: johnl@iecc.com Approved: compilers@ivan.iecc.com Message-ID: <98-06-063@comp.compilers> References: <98-06-032@comp.compilers> NNTP-Posting-Host: ivan.iecc.com Keywords: optimize Xref: readme1.op.net comp.compilers:5607 Mark Sanvitale (sanvitam@std.teradyne.com) wrote: : Now, a "inline everything" scheme might run into some roadblocks when : it comes to external functions which are resolved at link time and the : notion of dynamic linking is not compatible with such a method. : Still, I think compilers should try to inline every function it can : without depending on the programmer to specify a function as "inline" : (C++). Hmm, you forgot to mention recursive functions and virtual methods in C++. In C++, sometimes you execute a method that can only be determined at runtime. That's why, if you "inline" a method that can be invoked virtually, the compiler will probably just compile it to a function for external accesses. Of course, there are tools like Vortex that are attempting to solve this problem through extreme global analysis. See: http://www.cs.washington.edu/research/projects/cecil/ : Perhaps compilers already take advantage of the idea I have outlined or : perhaps there are some problems with the idea which I don't know about : (an old C++ book I have says, "Compiler limits prevent complicated : functions from being inlined," but no further explanation is given. Over "inlining" leads to greater code bloat. This could adversely affect your instruction cache (or maybe not...). Whenever you optimize here, you might deoptimize somewhere else (isn't computer science fun!). : What do you all think? : NOTE - During my quest for a CS degree I did not have the chance to take : a compilers course (it was a tech elective which never fit in my : schedule) so if my assertion is totally pointless please just point out : the reason I am wrong and spare me the "Are you stupid/crazy/lost" : comments. In the area of compilers I admit to being fairly ignorant (as : far as professional programmers go). A compilers course probably would not have gone into these issues. Maybe an architecture course... Sean -- Send compilers articles to compilers@iecc.com, meta-mail to compilers-request@iecc.com. Archives at http://www.iecc.com/compilers Path: readme1.op.net!op.net!out2.nntp.cais.net!in1.nntp.cais.net!nntp.abs.net!news-peer-east.sprintlink.net!news-peer.sprintlink.net!news-backup-east.sprintlink.net!news.sprintlink.net!208.31.42.33!iecc.com!iecc.com!not-for-mail From: Thomas Niemann Newsgroups: comp.compilers Subject: Re: why not inline all functions? Date: 11 Jun 1998 17:00:10 -0400 Organization: Compilers Central Lines: 26 Sender: johnl@iecc.com Approved: compilers@ivan.iecc.com Message-ID: <98-06-065@comp.compilers> References: <98-06-032@comp.compilers> NNTP-Posting-Host: ivan.iecc.com Keywords: optimize, practice Xref: readme1.op.net comp.compilers:5613 I worked on compilers for Prime and Apollo in the 80's, and implemented procedure inlining for both. For testing, I tried to inline everything. I recall one program that seemingly "hung" the compiler. Being persistent, I let it compile overnight. It finally did, and executed properly. On inspection, the call graph for the code resembled a binary tree, and quickly resulted in a huge program that took hours to compile. The final released product tried to determine good candidates for inlining (another topic). As I recall, we didn't expand recursive routines, though we could have inlined a few times. Large programs not only take a long time to compile, but may incurr excessive paging, as there may be insufficient memory to hold the executable. The big win obtained from inlining is due to optimization. Inlining a procedure exposes the procedure's body to the surrounding code, allowing for classical optimizations to take place. One of the benchmarks in our test suite terminated with a divide-by-zero exception after inlining. A matrix-multiplication function was inlined. The optimizer could then determine that there was no use of the calculation, so the code was eliminated. Thus, it took zero time to do the calculation. The divide-by-zero error occurred when dividing elapsed time into a constant. At any rate, we had qualms about reporting benchmark figures for such a result. -- Send compilers articles to compilers@iecc.com, meta-mail to compilers-request@iecc.com. Archives at http://www.iecc.com/compilers