I just recently talked to someone whose Swift framework(s) were compiling at roughly 16 lines per second. Spurred on by Jonathan Blow's musings on compile times for Jai[1], I started tinkering with tcc a little. It compiled a generated 200KLOC file (lots of small functions) in around 200 ms.
Then there are Smalltalk and Lisp systems that are always up and pretty much only ever compile the current method.
We also used to have true separate compilation, but that appears to be falling out of favor.
Of course none of these are C++ and they also don't optimize as well etc. Yet, how much of the code really needs to be optimized that well? And how much of that code needs to be recompiled that much?
So we know how to solve this in principle, we just aren't putting the pieces together.
Are you serious? You want to make a product: a Web browser. What technology are you going to choose? The one that makes your browser fast but gives you more work, or the one that makes your browser slower but makes compilation less of a nuisance to you? It's mind-boggling that some would openly say that, hey, who cares about performance that much, these compilation times bother me, the developer. On a browser of all things! Would you be okay with your browser being 2x or 3x slower?
False dilemma. You can have both. It's okay if a fully optimized release mode binary takes a bit longer to compile, but compiling a few million lines of code for a debug build shouldn't take more than a second or two.
Also consider the leverage factor. Improvements to the compiler benefit all users of the programming language, so it's worthwhile to invest in high quality compilers.
Yes, we can use caching compilers (https://wiki.archlinux.org/index.php/ccache) to speed up builds with few changes. We can lower optimization levels (although that buys you only a modest gain in compile speed compared to the runtime speed you lose, and it makes your program behave slightly differently).
There's no slider from "pessimum" to "optimum". You need to do wildly different things to optimize past this point for compile speed. Erlang hot-reload and at-runtime-code-gen from other langs come to mind. But that will almost definitely slow down your program because of the new infrastructure your code has to deal with.
I have observed that there can be a nice balance with Java and the auto-reloading tools that are available for it. But I am unaware of their limitations and how a web browser might trigger those limitations.
D is a language where you can have both. C++ architected correctly can get much closer though. Much larger compilation units are a start. After that, realizing that modularity comes from data formats and protocols means you can start to think about minimal coupling between pieces. I think dynamic libraries for development are very underutilized.
Besides the ones people have already pointed out (and Haskell, yep, even the famously slow GHC can do that), there's no reason C++ couldn't have both (except for large templates).
Rust doesn't compile very quickly at the moment. Helpfully, there's a live thread about the matter on r/rust [1]. Broadly speaking, it's about the same as C++. Some aspects are faster, some slower. Points worth noting (from that thread and elsewhere):
* Everything up to and including typechecking (and borrowchecking) takes a third to a half of the time, with lowering from there to a binary taking the rest of the time; that means (a) you can get a 2-3x speedup if you only need to check the code is compilable, and (b) overall speed isn't likely to improve a lot unless LLVM gets a lot faster.
* Rust doesn't currently do good incremental compilation, so there are potential big wins for day-to-day use there.
* There is a mad plan to do debug builds (unoptimised, fast, for minute-to-minute development) using a different compiler backend, Cretonne [2]. If that ever happens, it could be much, much faster.
> You can have both. It's okay if a fully optimized release mode binary takes a bit longer to compile, but compiling a few million lines of code for a debug build shouldn't take more than a second or two.
It's not difficult to spit out machine code at a high pace. TCC is one example given by the grandparent, but it's certainly not the only fast compiler out there. Languages like Turbo Pascal were designed for rapid single-pass compilation, way back in the 80s.
A million lines of code represents an AST with a few million nodes in it, which compiles to a binary of a few megabytes. To do this we have computers with a dozen cores running at 4 GHz each, 100 GB of memory and blazingly fast SSDs.
It's easy to forget, but computers themselves aren't slow. The software we write is just inefficient.
What you are saying is that it is possible in theory, but no one has done it yet. So in some future where someone rewrites all C++ compilers not to be so slow, we won't need to compromise.
Most C++ devs have to work with tools that currently exist and so we are stuck with what the compiler devs give us. Believe it or not C++ compiler devs are pretty smart people and have largely optimized it as much as possible without a language redesign.
That language redesign is in the works with modules, but the dust hasn't settled yet, so that is also a discussion for the future. In the meantime no other language delivers the performance C++ does right now. So if I want to ship product right now the very real dilemma is a fast product with slow builds (and a bunch of tools for dealing with that) with C++, or some other language with a faster compiler and a slower product.
Then there is Rust, but that is another whole can of worms and not in use in most shops yet (Just switching to something has a huge cost).
C++ is a language that's incredibly hard to compile efficiently and incrementally, because it suffers from header file explosion (among other things) and as you mentioned it has no working module system.
C compiles a _lot_ faster than C++, so that's always an option. And as other people have pointed out, you can get C++ code to compile much more quickly by being very disciplined about what features you use and how your code is laid out.
So I agree that if you want to ship something right now all your options have significant downsides. I think software engineers as an industry don't take tooling nearly as seriously as they should. Tools are performance amplifiers and we currently waste a staggering number of manhours working with poorly designed, unreliable, poorly documented and agonizingly slow tools.
Single pass generally means one crack at each compilation unit. It's OK to keep a list of unresolved forward references and go back and inject (fix up) each address once it's known; that still counts as a single pass. If they're not in the same file, the linker does it.
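The fix-up scheme described above can be sketched in a few lines. This is a toy sketch (all names and the flat array of "code" slots are hypothetical, not from any real compiler): calls to unknown labels emit a placeholder and record their position; when the label's address is finally known, every recorded position is backpatched.

    (* Toy single-pass backpatching sketch, OCaml. *)
    let code = Array.make 16 0

    (* positions in [code] waiting for a not-yet-defined label *)
    let fixups : (string, int list) Hashtbl.t = Hashtbl.create 8
    let labels : (string, int) Hashtbl.t = Hashtbl.create 8

    let emit_call pos label =
      match Hashtbl.find_opt labels label with
      | Some addr -> code.(pos) <- addr   (* already known: resolve now *)
      | None ->                           (* unknown: placeholder, remember pos *)
        let old = Option.value ~default:[] (Hashtbl.find_opt fixups label) in
        Hashtbl.replace fixups label (pos :: old)

    let define_label label addr =
      Hashtbl.replace labels label addr;
      (* backpatch every waiting call site *)
      List.iter (fun pos -> code.(pos) <- addr)
        (Option.value ~default:[] (Hashtbl.find_opt fixups label));
      Hashtbl.remove fixups label

    let () =
      emit_call 0 "f";       (* forward reference: "f" not defined yet *)
      define_label "f" 42;   (* definition arrives; slot 0 gets patched *)
      assert (code.(0) = 42)

Unresolved entries still left in `fixups` at the end of the unit are exactly what ends up in the object file's relocation table for the linker to finish.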
Are we speaking in a general sense here? Because the root of this is headers needing to be compiled with (almost) every use in C++. We could get rid of that while maintaining the same functionality. It's not a very bold claim unless there's a requirement of "no significant language changes"
Where did I write "who cares about performance"? And why do you think any of what I said is going to cost 2x-3x performance? Performance has been either a major part of or simply my entire job for most of my career, and I usually make projects I run into at least an order of magnitude faster. For example by switching a project from pure C to Objective-C. Or ditching SQLite despite the fact that it's super-optimized. Or by turning a 12+ machine distributed system into a single JAR running on a single box.
The Web browser and WWW were invented on a NeXT with Objective-C. It wasn't just a browser, but also an editor. In ~5KLOC written in a couple of months by a single person. NCSA Mosaic took a team of 5 a year and was 100KLOC in C++. No editing. So pure code-size is also a problem. And of course these days code size has a significant performance impact all by itself, but also 20x the code in C++ is going to take a significantly longer time to compile.
In terms of performance, the myth that you need to use something like C++ for the entire project is just that: a myth. First, the entire codebase doesn't need to have the same performance levels; a lot of code is pretty cold and won't have measurable impact on performance, especially if you have good, high-performance components to interact with. See 97:3 and "The End of Optimizing Compilers". Or my "BookLightning" imposition program, which has its core imposition routine written in Objective-Smalltalk, probably one of the slowest languages currently in existence. Yet it beats Apple's CoreGraphics (written in C and heavily optimized) at the comparable task of n-up printing by orders of magnitude.
Second, time lost waiting for the compiler is not "convenience", it is productivity. If you get done more quickly, you have more time to spend on optimizing the parts of the program that really matter, and thoughtful optimization tends to have a much larger impact on performance than thoughtless optimization. The idea that this is purely a language thing is naive. See, for example, https://www.youtube.com/watch?v=kHG_zw75SjE
Third, you don't need to have C++ style compilers and features to have a language that has fast code, see for example Turbo Pascal mentioned in other comments. When TP came out, we had a Pascal compiler running on our PDP-11 that used something like 4-5 passes and took ages to compile code. TP was essentially instantaneous, so fast that our CS teacher just kept hitting that compile key just for the joy of watching it do its thing. It also produced really fast code.
Point taken, but my point was more about pragmatism. We know it's possible to have a fast compiler that generates fast code (indeed every time this is discussed someone brings up TP). But it's no use talking about a 30 year old compiler or about how the first WWW browser was superbly written. What I mean is I'm not talking about possible, I'm talking about feasible now. If I want to write a performance critical project right now what tool(s) should I use? The answer is most likely C++.
Furthermore, using a "slow" language for big parts of the code can make the whole project faster: size matters as an input into performance. A compact bytecode executed by a small interpreter thrashes the cache and memory hierarchy a lot less than tons and tons of ahead-of-time-compiled native code.
Use high-level interpreted languages to make your life easier, but also use them to make the page cache's life easier.
A good example of the architectures you are describing are Android and UWP.
Although the lower levels are written in a mix of C and C++, the OS Frameworks are explicitly designed for Java, C# and VB.NET.
Trying to use C or C++ for anything more than moving pixels or audio around is a world of pain.
The Android team even dropped the idea of using C++ APIs on Brillo and instead brought the Android stack, with ability to write user space drivers in Java (!).
> Would you be okay with your browser being 2x or 3x slower?
No, I will switch to Firefox or something. I'm a user, I don't care how hard it is for developers. I care about my workflow, which is using a browser on various machines, some of which are very slow.
Of the developers I've met recently, this wouldn't surprise me at all. The world revolves around them. Not the product. Not the user. Not the company. They are a "developer" or worse, an "engineer" and can do no wrong.
It's worthwhile to note that Blink (v. the whole of Chromium) has had its build time quintuple in the past four years or so, and its starting point was far slower than Presto (which may or may not qualify as "fast" in people's books, depending on what features you care about).
I didn't say I know how to do it. And I didn't say it's easy. Yes, currently we have fast browsers and slow compilation. Let's just not assume the situation can't be improved at all.
Straightforward integration with existing tooling was not a "mistake", it was a design point. There were plenty of competing runtimes even in the 80's that were better than C's linker model. C++ succeeded because it didn't create that friction.
Sure, but it wasn't a "mistake". Stroustrup absolutely wanted a C with classes, and that meant tight integration with C toolchains. Symbol mangling was the clever idea invented to implement that very deliberate choice, not a fortuitous happenstance.
I'm pretty sure @pjmlp is using the word "mistake" to say "a decision that turned out to be bad". English is not my native language, but judging by the ways I've seen it used and what dictionary definitions I can find, it seems quite acceptable.
Stroustrup made the decision on purpose and consciously, but it turned out to have disastrous effects.
OCaml uses C's linker model, and yet still manages to have working Modula-like modules (even with cross-module inlining). So there's an existence proof that it's possible to do it well.
1. OCaml generates additional information that it stores in .cmi/.cmx files.
2. OCaml does not allow for mutual dependencies between modules, even in the linking stage. Object files must be provided in topologically sorted order to the linker.
3. OCaml supports shared generics, which cuts down on the amount of code replication (at the expense of requiring additional boxing and tagged integers in order to have a uniform data representation).
> 1. OCaml generates additional information that it stores in .cmi/.cmx files.
On this point I'd say that it could probably embed the cmx file as "NOTE" sections in the ELF object files, but likely they didn't do it that way because it's easier to make it work cross-platform. Every "pre-compiled header" system I've seen generates some kind of extra file of compiled data which you have to manage, so I don't think this is a roadblock.
> 2. OCaml does not allow for mutual dependencies between modules, even in the linking stage. Object files must be provided in topologically sorted order to the linker.
I believe this is to do with the language rather than to do with modules? For safety reasons, OCaml doesn't allow uninitialized data to exist.
Although (and I say this as someone who likes OCaml) it does sometimes produce contortions where you have to split a natural module in order to satisfy the dependency requirement. I've long said that OCaml needs a better system for hierarchical modules and hiding submodules (better than functors, which are obscure for most programmers).
> 3. [...] at the expense of requiring additional boxing and tagged integers [...]
> I believe this is to do with the language rather than to do with modules?
Both, sort of. The problem is that mutually recursive modules are tricky. So, it's a limitation of the language, but one that is there for a reason.
> I think this is fixed by OCaml GADTs
No, GADTs solve a different problem. Essentially, normal ADTs lose type information (due to runtime polymorphism). GADTs give you compile time polymorphism, so the compiler can track which variant a given expression uses. Consider this:
    # type t = Int of int | String of string;;
    type t = Int of int | String of string
    # [ Int 1; String "x" ];;
    - : t list = [Int 1; String "x"]

    # type _ t = Int: int -> int t | String: string -> string t;;
    type _ t = Int : int -> int t | String : string -> string t
    # [ Int 1; String "x" ];;
    Error: This expression has type string t
           but an expression was expected of type int t
           Type string is not compatible with type int
The problem with functors (and also type parameters) is the following. Assume that you have a functor such as:
module F(S: sig type t val f: t -> t end) = struct ... end
To avoid code duplication, F has to pass arguments to S.f using the same stack layout, regardless of whether it's (say) a float, an int, or a list. This means that floats need to get boxed (so that they use the same memory layout) and integers have to be tagged (because the GC can't tell from the stack frame what the type of the value is).
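A concrete way to see the sharing (a minimal sketch with made-up module names, not from any real codebase): both instantiations below reuse the single compiled body of `twice`, which only works because every OCaml value fits the same word-sized, GC-scannable representation.

    (* One functor body, two instantiations sharing its compiled code. *)
    module F (S : sig type t val f : t -> t end) = struct
      let twice x = S.f (S.f x)
    end

    module IntOps   = F (struct type t = int   let f x = x + 1    end)
    module FloatOps = F (struct type t = float let f x = x *. 2.0 end)

    let () =
      (* The same machine code for [twice] handles both calls, so ints
         are tagged and floats boxed to share one calling convention. *)
      assert (IntOps.twice 1 = 3);
      assert (FloatOps.twice 1.5 = 6.0)

(With the flambda variant of the compiler, cross-module inlining can specialize some of this away, but the default representation still has to be uniform.)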
The cmx data could be converted to ELF note sections, but the whole thing has to work on Windows as well, so I guess they didn't want to depend on ELF.
In most projects, you can add this to your Makefile and forget about it:
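(The snippet seems to have been lost; presumably it was the standard `ocamldep` idiom, sketched here under that assumption:)

```makefile
# Regenerate OCaml module dependencies (.cmi/.cmx ordering) automatically.
.depend: $(wildcard *.ml *.mli)
	ocamldep *.ml *.mli > .depend

-include .depend
```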
My guess: The trick is not to have "template instantiation" but "module instantiation" (aka functors in OCaml). Now you can instantiate only once. For example, if the compiler encounters "List<Foo>", it would instantiate it into a "List$Foo.o" file, or skip the work if that file already exists. Java works similarly, except the files have the extension "class" instead of "o".
More generally speaking: The trick must be to not generate identical instantiations multiple times. So you must have a way to check, if you already generated it. Of course, the devil is in the details (e.g. is equivalence on the syntactic level enough?).
You are focusing only on generics and missing all the module metadata and related module type information.
In module based languages, the symbol table is expected to be stored on the binary, either to be directly used by tools or to generate human readable formats (.mli).
So if one uses the system C linker, it means being constrained to the file format used by such linker.
Modules would be very hard to make work in C++, there's way too much entanglement at all levels. Of course the rot actually started with the ANSI C committee when it introduced typedef and broke context-free parsing. C++ just compounds this kind of problem with template lookup stupidity. It's what you get when languages are designed by people with no understanding of basic computer science.
I used to be a C++ guy for 20 years, but won't go back unless absolutely necessary. I mean, it's tempting -- there are some cool new language features. I find that the abstractions are leaky, though, so you still have to understand all the hairy edge-cases. The language has gone insane. I'm 40 -- I want to get stuff done before I die, not play clever games to get around my language / system.
C++ has been my next loved language after Turbo Pascal; since then I learned and used countless languages, but C++ was always on the "if you can only pick 5" kind of list.
Since 2006 I am mostly a Java/.NET languages guy, but still keep C++ on that list.
Mostly because I won't use C unless obliged to do so, and all languages intended to be "a better C++" still haven't proved themselves on the type of work we do, thus decreasing our productivity.
Because in spite of Swift, Java, .NET and JavaScript, C++ is the best supported option from OS vendors SDKs.
I dream of the day I could have an OpenJDK with AOT compilation to native code with support for value types, or a .NET Native that can target any OS instead of UWP apps.
Until then C++ it is, but only for those little things requiring low level systems code.
I gave up on C++ 5 years ago, after 10 years of getting paid to develop in it. I found that the extra money that comes from a C++ job doesn't cover the gray hairs from trying to tame the language so you don't shoot yourself in the foot a dozen times every time you call a method.
Go is a good example of a language with fast compilation times. Of course, the optimizer needs improvements, but I believe they keep managing to make it faster as they improve the output rather than slower.
Go's compilation times are fast, but compile speed took a significant hit in 1.5, when they rewrote the compiler in Go.
They're slowly improving it to return to pre-1.5 performance, but last I checked, it wasn't there yet. The impact is insignificant on small projects, of course, but easily felt on larger (100Kloc+) ones.
While the recent optimizer improvements are great, my wish is for Go to switch to an architecture that uses LLVM as the backend, in order to leverage that project's considerable optimizer and code generator work. I don't know if this would be possible while at the same time retaining the current compilation speed, however.
The important point about Go in this case is that it's fundamentally more efficient because it has real modules and can do incremental compilation.
Sometimes people don't realize this because they always use `go build` which, as the result of a design flaw, discards the incremental objects. When you use `go install` (or `go build -i`) each subsequent build is super fast.
Huh, really? Why is that? Non-incremental builds shouldn't be needed at all, but besides that, just based on the names I'd expect `go build` to be the cheap one and `go install` to be the expensive one.
It's unfortunately not a well-known feature. The Go extension to VSCode was using "go build" (without "-i") for a long time, and if you're working on something big like Kubernetes, it's almost impossible to work with.
The annoying thing is that "go install" also installs binaries if you run it against a "main" package. I believe the only way to build incrementally for all cases without installing binaries is to use "-o /dev/null" in the main case.
Wasn't that Wirth's rule for Oberon and/or Pascal? Any optimization introduced has to have sufficient cost vs. benefit ratio that it makes the compiler faster compiling itself.
In Go's case, I think they're simply improving a lot of unrelated aspects of the compiler while adding new optimizations. I like the idea of that rule but there are definitely cases where I would want an optimization that could take a long time during compilation but provide an immense benefit later.
Ah yes, I remember that rule, I think it's incredibly clever. I believe he had another rule, language updates can only /strip/ features, so the core language will always get smaller and smaller.
Intuitively, that doesn't seem clever to me. (You're committing yourself to a less efficient, more clumsy language for the benefit of... what exactly?)
How does that law improve the language without falling into the trap mentioned elsewhere in this thread? (Optimizing for a pleasant "compile experience" at the cost of everything else)
Well, the rule is attributed to Niklaus Wirth, whose credits include Modula, Modula-2, Oberon, Oberon-2, and Oberon-07. The -* languages are extensions of their originals, so it seems his rule only applies within a single edition of the language. They can add new things, because they are new languages.
The justification, then, seems to be that if you legitimately need new features, then the language has failed and you should start over anyway. I think Python 3 is sort of an offshoot of this idea, except that many of the new features keep getting backported to 2.7 anyway.
Given that Oberon-07 is a subset of Oberon, reducing it to the essentials of a type-safe systems programming language, I wouldn't consider it an extension. :)
There is also Active Oberon and Component Pascal, but he wasn't directly involved.
My mistake, I'm not intimately familiar with it. From what I can tell, Modula-2 and Oberon-2 were both extensions, though, to be used as successors to the previous language.
This whole sub-thread went off on a tangent into "C++ is bad because its compilation causes this problem." Problems with C++ builds are well understood...
This specific issue probably impacts other batch workloads with lots of small tasks (processes). There's no reason this should be happening on a 24-core machine.
I'm not sure if the approach of Lisp and Jai (and also D) is the best one. They all have practically arbitrary code execution at compile time, so they can be arbitrarily slow to compile. In C++ template-programming is so hard that few people do it, but in those languages it is just as easy as normal code.
With mainstream languages, code generation is done by the build system which can avoid repetition. Caching generated code feels like a good idea to me. Doing it with compile time execution is (unnecessarily?) hard.
It seems that, in Jai, arbitrary code execution at compile time is exactly the point. The build file itself is just another Jai program. It's easy to wring one's hands about a novice programmer making the computer do unnecessary work, but Blow's philosophy seems to be to trust the programmer to understand the code they're writing. And if they don't, they should probably be using another language.
Compiling Common Lisp code isn't too slow. There are some quick compilers like Clozure CL.
What slows some Common Lisp native code compilers down is more advanced optimization: type inference, type propagation etc, lots of optimization rules, style checking, code inlining, etc.
Yes, and I mean "we" as in "this industry".
[1] https://www.youtube.com/watch?v=14zlJ98gJKA He mentioned that a decent size game should compile either instantly or in a couple of seconds