GCC Rust Approved by GCC Steering Committee (gcc.gnu.org)
436 points by edelsohn on July 11, 2022 | hide | past | favorite | 288 comments


This is great! GCC support opens up more platforms, more targets, and the opportunity to more easily integrate with the various common embedded toolchains built around GCC.

And it's free software, which matters to those who care about that.


The rust compiler can already use GCC as a backend. So I don't see how this opens more platforms than that.


Because that doesn't use the GCC frontend interface, requiring build tools and embedded toolchains to be modified to understand the rust compiler interface. By using GCC it's just another language that the existing toolchain can understand.


The Rust module system is radically different from C and C++ and other similar languages in the embedded space.

Every build system that has added support for Rust, of which there aren't many, had to be radically modified to achieve that.

None of these supports the GCC Rust frontend, but all of them support the Rust frontend.

So if you actually wanted to build any >100 LOC Rust project for embedded targets not supported by LLVM, doing it with the Rust frontend is as easy as just running 1 CLI command to pick its GCC backend.

Doing it with the GCC frontend, would require you to either port one of the build systems to support it, or... give the GCC frontend a CLI API that's 100% compatible with the Rust frontend.


Cargo support for gccrs is part of this project:

https://github.com/Rust-GCC/cargo-gccrs

Moreover, modules are less interesting to me in embedded development, for which I'm interested in access to Rust's borrow checker for gaining certainty of small portions of larger projects, which are written in other languages.


So it's not about platform support, but about toolchain integration? Who benefits from that, projects using C/C++ who want to use a rust library? Or is it about distro package maintainers?


The toolchains support helps embedded developers, mainly. For example, Xtensa and AVR toolchains are generally byzantine monstrosities of makefiles, Python, dialog, etc; so having them be given a low effort means to consume rust is a boon. Ideally, rust is just another source file in the srcdir soup.

That said, gcc supports more platforms than llvm; including esoteric and unpopular desktop configurations.

Personally, I plan to drop this into marsdev as soon as it releases. Writing 32X games in rust sounds like silly fun.


Incidentally, Xtensa has hired someone for the last… year or so? To make using Rust on their stuff work well.

AVR support is almost there in mainline rustc but has a codegen bug or two, last I heard.


I think you meant Espressif, Steve. They have a fork of LLVM with Xtensa support they’re looking to upstream (still a few things missing, like the DSP/AI instructions in ESP32-S3, and I think the codegen is better on GCC for now). And the folks at esp-rs, who work at Espressif along with outside contributors, maintain a Rust toolchain and standard library (based on ESP-IDF) which they also want to upstream. There’s also a baremetal target with a dedicated developer at Espressif; it’s pretty amazing. Although the ESP32-S3 is going to be the last Xtensa chip from them, they’re planning on moving wholesale to RISC-V, with all their products in the last year based on it. Ferrous Systems even designed a Rust-specific devkit based on their RISC-V ESP32-C3, to teach embedded Rust on.

I bet Cadence was ripping them off for the IP, which is a shame…

Any talks in Oxide about porting Hubris to RISC V? I hear getting your hands on Cortex-M*s in bulk is still pretty challenging these days.


Ahh whoops you’re right, lol. Embarrassing.

Hubris was designed to be easy to port to RISC-V, so yes! We didn’t end up doing it ourselves, but someone else did! https://github.com/oxidecomputer/hubris/discussions/365


> free software

it seems like this term is misunderstood in English to mean "no money" .. perhaps "Libre Software" is a better starting description here


The term has been around for decades now, and seems to be fairly well understood in the tech world. It doesn't take much to say "free as in speech, not beer" to the few people left who need to understand it.


I'm not so sure about that; I've certainly met plenty of people who did not understand it. HN is not representative of the wider tech community.

And "free as in speech, not beer" actually does little to clarify it IMHO; it only works well if you're already familiar with the concept, but if you're not it only adds to the confusion (free speech is about the freedom to say what you want, so free software is about being able to write whatever software you want?)

Now, "free software as in right to repair" would actually clarify it, but trying to get an understandable message out seems to be "unethical" so meh :-/


There are probably even fewer people that know what right to repair is.

Free software just isn’t enough anymore because the licenses limit your freedom too. (and even my choice of the word “limit” here will be controversial between the Apache/MIT and the *GPL crowds).

Don’t have an easy answer :-)


Right to repair has been all over the news, including mainstream news, and at least in tech circles it's fairly well known. And while there are undoubtedly many people who don't know what it means, the basic idea is quite obvious from just those three words: "if I bought something, I should have the ability to repair it".

The problem is that right to repair alone doesn't cover all of Free Software, so it will never catch on with the hard-core FSF crowd, who seem to think that the only possible step is one giant leap to a Free Software utopia and that any smaller step in that direction is "unethical".


Do many/most people really think they should have the ability to repair it? I'm not sure, because they keep buying products where it's expressly not possible to repair them yourself or through a competent 3rd party.

So I'm not really sure it has any more meaning than free software means no cost rather than "no one and everyone owns the software".


Before micro-USB every cell phone came with its own charging connector plug and everyone and their mother was complaining about it, but people still bought them because there wasn't all that much of a choice, or the sacrifices you had to make in other areas were just too large.

Price, functionality, availability, etc. all factor in. And the tricky thing with repairs is that it's a very "non-obvious" feature.


> Do many/most people really think they should have the ability to repair it?

“Ability” is the wrong word here, why did you use that word? It’s ambiguous and misleading, that’s why it’s called the right to repair.

Yes most people do assume they have the right to modify and fix something they “own”, whether it’s a car or a computer or anything else. Don’t you think it’s surprising and non-obvious that you might not be allowed to repair something you paid for and own, even when you know how to repair it? Whether they ever intend to repair something, and whether they have the knowledge to repair something, are both irrelevant to whether they have the legal right to repair something.

> they keep buying products where it’s expressly not possible to repair it yourself or by a competent 3rd party.

Most people have no intention of repairing their technical purchases themselves, and that’s perfectly fine. Most people also don’t care about right to repair laws even if it affects them. The point of right to repair laws is to establish the common sense standard that consumers are allowed to legally modify their purchases, and companies will no longer be allowed to go out of their way to prevent repairs or make them difficult (also and perhaps especially wrt 3rd parties), and also that companies and lawmakers will no longer be able to make customizing and repairing purchased products illegal as has been done in the past.


this question is exactly why I suggested we revisit that terminology in this thread!


>Right to repair has been all over the news, including mainstream news, and least in tech circles

Has been in the kind of news Joe Average doesn't give a fuck about and even if they watch, they forget in 10 minutes.

Techies spot mentions of it in the news because they already care for it and know the term.

That's not the same as news that reaches regular people (that would be more like some high-profile Hollywood divorce, some war, or gas prices).


Such hostile contempt... There are lots of things to concern yourself with, and not everyone is interested in everything, which is fine; but it's gotten quite a bit of coverage, and certainly a lot more people know about it now, never mind that laws are being discussed and actually passed. More importantly, explaining the concept is just a lot easier because it's significantly less abstract.


>Such hostile contempt...

No such thing. I'm fine with Joe Average, and I don't think everybody should know about everything, much less about the "right to repair" (there is much more important stuff for people to learn about, including politics and/or gas prices and other things that affect them more).

I'm merely stating a fact: the "right to repair" has not been covered to any degree that would make regular people know or care about it. Most of those who do are already familiar with the tech scene, if not as techies, then as gadget lovers and tinkerers.


>I'm not so sure about that; I've certainly met plenty of people who did not understand it. HN is not representative of the wider tech community.

Well, people understand it just fine.

It's just that the definitive allure of FOSS is not what the purists think it is.

purist: FOSS is about freedom (to work on the source, release changes as FOSS, etc)

real world: what I care about is 70% the code being available to use for free and 30% the ability for others (and me, in some cases) to edit it and release changes. Yeah, I do understand it's not the same as mere "freeware". But for most intents and purposes, that's the part I care about most; though I also like the FOSS aspect because it means I can get more community stuff for it, and I don't depend on a single vendor releasing it.


> HN is not representative of the wider tech community.

But we are on HN right now...


> free as in speech, not beer

I can never remember which way around this is. I think the "beer" is supposed to be free in the monetary sense and "speech" is supposed to be free in the rights/liberty sense, but the analogy doesn't actually convey this:

- Speech is normally both monetarily free and (in certain places) a right

- Beer is normally neither monetarily free or a right. But may make you "feel" free, which is another sense entirely.

Is this a cultural reference I'm missing?


It’s not as complex as you’re choosing to make it. When someone tells you, “There’s free beer at Oktoberfest,” is your first thought that the beer has no monetary cost, or that the beer is unencumbered by particular legal restrictions?

Now someone tells you, “There’s free speech at Oktoberfest.” Is your first thought that there’s no monetary cost to expression, or that the expression is unencumbered by particular legal restrictions?

(Happily, Oktoberfest has both.)

Think of them as the _slightly_ longer "Free as in 'free beer' " and "Free as in 'free speech' ". When beer is described as “free,” it’s understood to be gratis. When speech is described as free, it’s understood to be libre. These are used as examples for more ambiguous situations, since software can be one, the other, both, or neither.


How do I get free beer at Oktoberfest?


You don't, but the point is that if there was "free beer" at Oktoberfest the meaning would be unambiguous.


pssst - there are two different words in German for these concepts


what is free software?


Free as in beer, as when one refers to 'free beer' it's one that you don't pay for. When one says 'free speech' it's one that allows you to do what you want.

Former is gratis, latter is libre.


> Beer is normally neither monetarily free

TBH the main place I've seen the phrase "free beer" is a Simpsons joke: people stand in front of a sign saying "free beer", but after Homer drinks many cups, the sign is revealed in full to say "alcohol-free beer, $5/cup".

It's understood that beer normally costs money, so the setup of the joke is that "free beer" is understood without explanation as "beer without costing money".


Beer was once given away at taverns so people would stick around for the food, which had a price adjusted to make up for the beer. Freeware uses this model.


This also works in the inverse. The kimmelweck sandwich has its origins in the pubs of western New York. Some gave the sandwich away for free, and it was so salty that it created an incentive to add some pints to the tab.


The first amendment to the US Constitution provides very strong protection for the right to free speech. Presumably that is the origin of "free as in speech".



Do you think that the free software movement would have adopted "free as in speech" if the US was hostile to free speech?


Certainly, but it would have started somewhere that wasn't hostile to free speech. The Netherlands 20 years from now, maybe, or Korea when the computer is invented 150 years in the future in the early stages of the Industrial Revolution.


Right, but what about "free as in beer"?


"Free-as-in-speech, not free-as-in-beer" is just expecting you the hearer to understand that while event advertisements might prominently feature the words "free beer" as an exciting statement that beer is being given away for zero cost to drive interest, zero cost is not the sense in which "free" is being used in "free software"


If I buy you a beer, that's free beer for you.


Which, unless you're my friend, prompts the questions "what else is in this beer" and "what's in it for you." I believe that to be a deliberate choice in the "free as in beer" analogy.


Probably you're overthinking it...

"free as in speech, not beer":

"free speech", the right to say what you want without legal restrictions from the state

"free beer", beer you don't need to pay for

Nothing to do with beer making you "feel free" (?) or whatever

Then again, free speech is not about being able to modify speech, release your own version of another's speech, and so on (e.g. the US has "free speech" but MLK's dream speech is still copyrighted).

Perhaps a better slogan would be:

"Free as in free sex, not beer"

And if the slogan wasn't meant for corporations too, and wasn't coined in the puritan 80s/90s, it might have been...


Maybe something more comprehensible would be “free as in Postfix, not Gmail” in that anyone can freely implement an MTA or use existing FOSS, which is different from using Gmail, Facebook, the first 3 months of AOL, or the first 10 albums from Columbia House for free…

https://www.mentalfloss.com/article/28036/its-steal-how-colu...


The culture reference is to commercial goods (like beer) being given away at zero cost as a promotion. (“Beer” is probably not the best example but is conveniently terse; for Boomers, “toasters”—in reference to bank promotions—might be more culturally relevant, for younger generations “-to-play” in reference to F2P games might be, but neither rolls off the tongue as well.)


> The term [Free Software] has been around for decades now, and seems to be fairly well understood in the tech world.

For you and me, maybe. But no, Free Software is often misunderstood. I had to explain the difference on a group dedicated to Linux the other day.

It is worth pointing out, every time.


I have never understood why it was not called "Liberty Software" since the beginning. The confusion disappears, liberty exists in English as a sibling to "libre" from Spanish and French and everybody in the United States, where the term originally comes from, and the rest of the English-speaking world understands what it means. As for the grammar argument, you can--sort of--force `Liberty` to be an adjective (think "Liberty City").


IMO, Liberty Software sounds like a company or a brand, but free/libre are philosophical concepts. I think it's important that free/libre software be noted for its philosophy, not its affiliation with a specific organization.


"free software" is still misleading and "free as in speech" is hardly accurate.

They both fail to convey what it's really about.


GNU is pretty clear:

To understand the concept, you should think of “free” as in “free speech,” not as in “free beer.” We sometimes call it “libre software,” borrowing the French or Spanish word for “free” as in freedom, to show we do not mean the software is gratis.


It might seem clear to those of us who’ve heard the whole story, but free speech isn’t a great analogy at all, which means this will always require a lengthy explanation to someone new to the concept. Free speech is a constitutionally protected right in the US (and something else in other countries) that comes with a list of limitations and exceptions, while “free software” is not in general a legally protected right, and in GNU’s implementation requires a viral license that eliminates some kinds of freedom, e.g., it’s specifically and explicitly not public domain. It’s fine to clarify that “free” doesn’t mean money in this case, but that’s hardly a clear explanation of what it does mean.


You'd need to be working at Discord or something to not know that Free Software is a thing.


Maybe but "free" should be related to "freedom" more than "not having money." "You're free to use the park," "you're free to use the library." We need more freedom in that sense.


"Libre" is not a word in the English language. Pretty sure that's not going to be clearer.


feature not a bug? asking a question is the first step to a new understanding


You’re assuming genuine curiosity which probably isn’t there.


Maybe for Americans who cannot see further than their noses, or who don’t speak any other language.


American companies funded by leveraged debt are specifically, exactly paying that money in large amounts to remove any GPL'ish things in their stack. Some other hardware companies are simply failing to comply with the license; catch me if you can, I guess. If so much money is available as "someone else's money", and there is some vague notion that complete ownership is the mission of profit-seeking... then why not? But what does that do to the value of the work, and the value of the ecosystem, built as GPL?

GCC is an engineering marvel, and the basis of two generations of engineering by hundreds of people and companies. Rust joining the many languages supported by GCC and its stack is welcome here.


I have mixed feelings about the idea of a GCC frontend for Rust.

On one hand, having frontend diversity for a language helps bring new people into the language; think of the groups who can't use Rust because it doesn't support certain targets or can't integrate with their existing toolchain. There is a lot of talent that could be brought into Rust just by virtue of being able to be included, and the fewer barriers to entry, the better.

But on the other hand, I don't want to have to think of implementation specific behaviors, bugs, quirks, et cetera. Having an ecosystem built with thousands of people wrestling with that problem seems like a recipe for buggy software and burnout. People say that a spec would help with this aspect but I'm sceptical; a spec doesn't prevent divergence from happening, it just gives you a frame of reference for what is correct. You also don't need multiple implementations to have a spec either!

I fear that the way this is going to pan out is that we'll have multiple frontends for Rust, but the vast majority of engineers using the language will only ever think about "true" Rust: the current mainstream implementation. Issues in crates that primarily affect "alternative" Rusts will go unacknowledged, be tossed out, or have hacky patch jobs. We'll end up with either a) two separate, but high quality, ecosystems or b) one shared, lower quality ecosystem that is constantly fighting itself. I hold Rust in really high regard and it's my favorite language at the moment, so this thought is scary to me.

Though, there are a lot of smart people involved so maybe it doesn't have to be so doom and gloom. A couple of years ago I thought that cross-platform software was really messy and hard to get right but languages like Rust have the proper language features to make things like this much easier (though not perfect!). It could be that the emergence of multiple frontends could necessitate features and tooling for the Rust itself which makes all of that I said a non-issue. Maybe. Hopefully.



Pretty sure you have it backwards. GCC isn't going to be a front end for rust; rust is going to be a front end for GCC. As I understand it, the point of both GCC and LLVM are to handle those implementation specific behaviors so programming language developers don't have to handle that concern. I don't expect this to fracture the language, but simply to provide the ability to use the language in more places.


Sorry, to be clear, I'm referring to the C++ reimplementation of the Rust compiler that uses GCC as a backend (gccrs).

My concern with fragmentation is that this is essentially a reimplementation of Rust in another language (C++) targeting a different backend (GCC). It's only natural for there to be differences between the two, especially over a long period of time.

There is a separate initiative that may be more in line with what you're thinking which adds GCC backend support for the existing mainstream Rust compiler.


I think what will prove more important than standards will be a common suite of tests used by all implementations. Sun used to have TCK[1] that was used by a few implementations.

[1]: https://en.m.wikipedia.org/wiki/Technology_Compatibility_Kit


There are two GCC ports for Rust being developed at once: codegen_backend_gcc is a module in the reference implementation of rustc, while rust-gcc is a complete C++-based compiler: https://github.com/Rust-GCC/gccrs/wiki/Frequently-Asked-Ques...


Another related project is the libgccjit backend for rustc:

https://github.com/rust-lang/rustc_codegen_gcc


Lots of people are spending time with libgccjit for compiling Emacs modules to speed them up with the "native compiled" version of emacs. I've tried it with doom emacs and it definitely speeds up things and seems stable enough.


What is the benefit of having multiple compilers for programming languages? Is there a scenario where a GCC compiled rust program would do something that an LLVM one can't do?

Doesn't this cause fragmentation in the rust ecosystem?

P.S.:I understand that people can work on any project they want. And I don't have the right to tell them not to. I'm just curious about the technical reasons for having multiple compilers.


1. GCC has more backends than LLVM.

2. Competition is good in general.

3. I expect this will trigger inconsistencies between GCC and rustc, because Rust doesn't really have a specification; that will force both parties to discuss and resolve them.


Being on gcc, a long-lived platform, also helps ensure the survival of the language even if development of the current compiler (or LLVM) dies or withers.


Does it? GCC's Java frontend died and is no longer shipped, they need maintainers like any other compiler.


GCJ is no more? It was being used within recent memory for things, I thought.

I am out of the loop though, so if this is true, that's interesting and a bit weird.


Since 2009 actually.

Most contributors eventually moved into OpenJDK after it became available.

GCC folks left it around for a couple of years, because GCJ unit tests exercised parts of the compiler no one else did.

Eventually they decided it wasn't worth that maintenance cost to keep it around only for that purpose.


GCC support of Objective-C is very very poor.


It is at the level NeXT was forced to contribute back to upstream.


Reimplementing ObjC 2.0 would be very hard. You have to be precisely bug-compatible with Clang to implement ARC correctly.


and to back up your point... There should be at least 2 implementations for anything to be a spec/standard.


Thank you so much. The specification point is very important


> I expect this will trigger inconsistencies between GCC and rustc; because Rust doesn't really have a specification. Which will force both parties to discuss and solve them.

More likely that GCC has to follow all the bugs and quirks of rustc or no people will use GCC for Rust.


> 1. GCC has more backends than LLVM

Did you mean target platforms? If so, how is this not already addressed by the rustc gcc (Via libgccjit) backend?


One advantage is it forces the language to articulate standards instead of the implementation defining the feature set. Standards tend to give stability and longevity to the language, as well as making it possible to write new compilers and make it more portable.


Ah, a Lisp user. Common Lisp is Exhibit A for standardization. Every Lisp user claims it is great because of either the standardization of advanced features in the days when the Berlin Wall had barely fallen, or the mere existence of macros. No real first-party improvement to the language in almost three decades after ANSI standardization. Massive fragmentation in the compiler ecosystem; rarely do libraries work out of the box on non-SBCL tooling. Yes, I can definitely see the advantage of standardization now, very much so.


> No real first-party improvement to the language in almost three decades

That sounds like a benefit to me :)

However, I think that's due to the general lack of interest in Lisp. You can see the C++ community has a similar ANSI standard and updates it every few years.

> Massive fragmentation in the compiler ecosystem

I wouldn't call it massive. They are pretty consistent, up until things like POSIX and FFI APIs. Let's agree there is some fragmentation. Isn't this still a better situation than if nothing was guaranteed?


Yup I have had the same exact experience with Common Lisp and every time somebody talks about standardization being so great I think back on this.

Scheme is suffering from the same issues. Scheme is standardized, but implementations end up being incompatible with each other in subtle ways, and the level of fragmentation is very painful. Scheme does get some updates, unlike CL I guess, but all of the implementations either don't implement the modern standard, don't have useful extensions for real-world programming, or are simply immature and don't have enough people working on them to get them into a nice state. In practice it's very difficult to use Scheme for anything non-trivial because of these issues.

I would much rather have no standard at all, and a single high quality implementation that everyone targeted instead of the current mess for both CL and Scheme. Until we get a new dialect that solves these issues Lisp is going to be more or less dead and irrelevant.


> I would much rather have no standard at all, and a single high quality implementation that everyone targeted instead of the current mess for both CL and Scheme.

That would be Chez Scheme [0], maintained actively by Cisco, a company that you may have heard of - who also use the language extensively.

Racket is porting to using Chez, because it is the industry standard, it's performant, and rock solid.

The GNU alternative to Chez is Guile. Emacs can run with Guile, and Guix is built on it. It's got a fairly large community.

Outside of Chez and Guile, there are implementations and communities, but comparatively, they're tiny. Those two are the only big names you need. Like GCC and Clang for C. There are other C compilers. But you only need to know those two.

[0] https://www.scheme.com/


> Racket is porting to using Chez,

I think it already happened in 8.0


Thanks. I'm going to install guile and tinker around with it! So far it looks pretty good.


Racket?


Why do libraries not work across compilers if the compilers are implementing a standard? Are they perhaps not really implementing a standard?


Simply having a document called "a standard" doesn't mean that:

1. the standard covers everything you wished it would cover

2. every implementation implements the standard, with no bugs

3. the standard doesn't itself contain incoherent or contradictory things

Standards are a tool, not magic interoperability sauce.


Hence the existence of standard certification for compilers as business.


Like the Goldman Rep says in The Big Short, "If you offer us free money, we ARE going to take it..."


I bet Ferrocene will also happily take part of that money.


Language standards simply lack such strong requirements. C/C++ even have the specific "linkage" concept to abstract the binary details away from the source form. And as you may know, many libraries are distributed as binaries.

The standards that do imply binary compatibility rules are ABI (application binary interface) specifications, which usually depend on the ISA (instruction-set architecture) or the OS (if any) being used. You cannot have a single one once multiple ISAs/OSes are supported. Even if you only want to rely on some external "interchange" representation not tied to a specific ISA, there are already plenty of candidates: CLI, JVM, WebAssembly... Plus, more than one executable (and usually runtime-loadable) image format is in wide use (PE/COFF, ELF, Mach-O ...). There is no unique combination, and any attempt to make things "work across compilers" that way will most likely just add a new instance that isn't fully compatible with the existing ones, making the situation more fragile.


Standards are often incomplete, or full of "implementation specific" behaviour, since AIUI standards often end up catering to implementations instead of the other way around (for the C/C++ standards, for example, you can read a plethora of blog posts about people's experiences trying to contribute to them, and some of the hurdles are related to how strongly tied to existing implementations they are). That means you can often have two "standards compliant" compilers that are wildly different. Another reason is compiler extensions: sometimes a compiler is "standards compliant" but also implements a superset of the standard (sometimes by default, sometimes under a flag), which means code gets written for that superset instead of according to the standard (for example, the Linux kernel and GCC extensions to C).


Typically, some library features contain things that require cooperation with a specific compiler to work correctly. Consider something like std::is_standard_layout in C++, or java.lang.Object in Java, or std::panic::catch_unwind in Rust.


They typically do, unless they are wrapping C libraries or making platform specific API calls.


That sounds a little circular. The benefit of alternate compilers is that it makes making alternate compilers easier.

For stability, compilers already have a large incentive not to break old programs. For longevity, I don't really see how a standard affects it that much. And for portability, you do not need an entirely new compiler.


Being able to specify a language outside an implementation is extremely useful to prevent hidden logical inconsistencies between different parts of the language, and makes the language more robust.

It also allows people to design new backends (looking at CUDA LLVM backends) by finding the right abstractions to support performance. For example, implementing a C- or C++-compatible CUDA backend required the C++ committee to make changes to the memory model / consistency guarantees of C++ atomics. If C or C++ had depended only on compiler implementations for this, there would just have been different implementations with different guarantees, with no consistency between them and no single way to even define why they differed.


Actually it was the other way around, CUDA was fixed to follow C++11 memory model.

There are a couple of CppCon talks on the subject.


Partially correct, even now, SG1 in C++ fixes a lot of things in C++ specification to allow for CUDA like approaches.


The reason it isn't circular is that if there is one implementation, even with good documentation, there will inevitably be lots of corner cases where the implementation does something, but it isn't written down anywhere. Independent implementations will discover many of these issues and they get clarified as part of the standards process.

So you can't really produce a high quality standard with only one implementation. You'll miss important details.


Rust is "A language empowering everyone to build reliable and efficient software." (from the home page)

Reliability at the extremes (where it may even be a life-or-death situation) requires the developer to know exactly what the program (s)he is writing expresses in Rust.


>requires the developer knowing what the program (s)he is writing exactly expresses in Rust.

That just needs documentation. You don't need a standard for that.


Right now Rust often limits what it does to what is supported by LLVM. An example is the become statement. This is a reserved keyword which will eventually act as a jump to a function without saving a return address; the current stack frame becomes the new one. This is tricky since things like destructors still need to work. It is only recently that LLVM supported this well, and Clang did it first. Separate implementations, and having a standard or some other form of communication between implementers, can help with these delays.

EDIT: Clang's version is the attribute 'musttail,' if anyone is interested.
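As a sketch of what guaranteed tail calls would buy (the `become` line in the comment below is hypothetical future syntax, not something that compiles today):

```rust
// A sketch of why tail calls matter. `become` is only a reserved keyword
// today; the call below is an ordinary call that pushes a new stack frame.
fn countdown(n: u64) -> u64 {
    if n == 0 {
        0
    } else {
        // Hypothetical future syntax: `become countdown(n - 1)` would reuse
        // the current frame, guaranteeing O(1) stack for deep recursion.
        // (Destructors of locals would have to run *before* the jump,
        // which is the tricky part mentioned above.)
        countdown(n - 1)
    }
}

fn main() {
    // Works at modest depth; without guaranteed tail calls, a large
    // enough `n` can overflow the stack in unoptimized builds.
    println!("{}", countdown(10_000));
}
```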


There's also political/legal considerations: The modern GCC codebase is derived from the egcs codebase, which forked off the original GCC codebase because the developers of GCC at that time didn't want to prioritize faster development speed:

https://gcc.gnu.org/wiki/History

Thus, a new strain of development can attract people who want to do things differently, and reduce tensions all around.

On the legal front, it's unlikely, but sometimes there's legal problems with continuing to use a certain codebase.


Doesn't GCC support more architectures than LLVM? Wasn't that the issue a while back with the Rust dependency that a cryptography module for Python introduced?


Adding a GCC backend for Rust (rust-codegen-gcc) does this already. I do not personally see the point of writing another frontend in C++.


Dependency on an existing Rust compiler during GCC bootstrap process.


You already need an existing C++ compiler to bootstrap GCC, though, so I don't see how this is much different. Plus there is already mrustc, a C++ Rust implementation specifically designed for bootstrapping.


It is one more dependency, not written in C++, and you don't see the difference?


One big benefit in this case is that bootstrapping gcc is somewhat easier than rustc, and presumably gcc-rust can then be used to compile rustc if needed.


In addition to what others have said, since there is a plan to introduce Rust kernel modules into the Linux kernel, being able to compile with GCC helps avoid dependence on another toolchain, which was something I've seen mentioned as a concern w.r.t. Rust in the kernel.


To be clear, while it's a concern some people on the internet have expressed, it's not an actual problem for landing the current work to get Rust in as a framework for writing drivers.


> What is the benefit of having multiple compilers for programming languages?

I'll give you one, or two, depending on what you'd like to count. At some point in the conceivable future we'll be able to compile some meaningful Rust code base with both compilers and measure; a.) how long it takes to compile and b.) the performance of the compiled code.

Obviously that will induce what it always has: incentive to improve.


>What is the benefit of having multiple compilers for programming languages?

Rust will need a standard.

The main reason why I don't take it seriously is that code written 5 years ago will often not compile today. For a language that pretends to be a systems language that is a non-starter. If you can't guarantee a 40 year shelf life of your code then no one working on systems cares.

People working on systems in the wild don't have the brain power to learn a new tool chain every decade, let alone every year. They are solving real problems and not writing blog posts.


> code written 5 years ago will often not compile today.

citation needed. Yes, there's a few programs that relied on unsound things for which this is true, but that's a relatively small part of the overall amount of code.


Doesn’t every implementation of a new function on a standard type possibly break existing code?

For example if I have a trait Foo with a function bar, and I impl Foo for HashMap, and then a new version of std comes out that has named something HashMap::bar, now every call to my_map.bar() is ambiguous


In that specific case, the inherent impl is preferred, so there's no ambiguity. However, this can still cause a breaking change if the inherent impl has a different type signature than the trait.
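That resolution rule can be sketched as follows. Since we can't add inherent methods to HashMap ourselves, a local type stands in for "a std type that gained an inherent `bar` in a new release":

```rust
use std::collections::HashMap;

// The downstream crate's trait, implemented for a std type:
trait Foo {
    fn bar(&self) -> &'static str;
}

impl<K, V> Foo for HashMap<K, V> {
    fn bar(&self) -> &'static str {
        "trait impl"
    }
}

// Local stand-in for a std type that later grows an inherent `bar`:
struct MyMap;

impl Foo for MyMap {
    fn bar(&self) -> &'static str {
        "trait impl"
    }
}

impl MyMap {
    // Inherent method with the same name, as if added in a new release.
    fn bar(&self) -> &'static str {
        "inherent impl"
    }
}

fn main() {
    let m: HashMap<u8, u8> = HashMap::new();
    println!("{}", m.bar());          // only the trait impl exists here
    println!("{}", MyMap.bar());      // inherent impl wins over the trait
    println!("{}", Foo::bar(&MyMap)); // the trait impl is still reachable explicitly
}
```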

And yes, there are tons of things that can subtly break code. That's why the Rust project runs the entire open-source ecosystem's tests as part of the testing process for the compiler. It's not all of the code in existence, but it's pretty good at flushing out whether something is going to cause disruption or not.

In practice, the experience that the vast majority of users report to us is that they do not experience breakage when upgrading the compiler.


> In that specific case, the inherent impl is preferred, so there’s no ambiguity.

That is even worse. It means your code can silently start doing the wrong thing rather than erroring, if the inherent impl does something different from the trait.

> And yes, there are tons of things that can subtly break code.

This isn't some obscure bug in some deep edge case though, it's a completely normal and common way of using the language (implementing your own traits on foreign types) predictably leading to breakage in an obvious way. I am not sure why it should be called "subtle".

Anyway, given this issue, I think the meme that Rust is backwards-compatible is really oversold. It'd be more honest to frame it as "we hope releases are backwards-compatible, but we like adding new functions to the stdlib, so no promises" rather than marketing BC as a major selling point as is done now.

> In practice, the experience that the vast majority of users report to us is that they do not experience breakage when upgrading the compiler.

It's anecdotal for sure, but I'm personally aware of times when the exact situation I'm describing has happened and caused headaches for people.


>but that's a relatively small part of the overall amount of code.

Yes and?

Systems programming isn't front end JS work where breaking things doesn't matter. It's no surprise that Rust came out of the browser space. Only people who don't take their work seriously could ever think the above is a justification and not a red flag for never using it.

I welcome Rust becoming ossified in GCC so I can build 30 year old code without modification like I can in C. Until then, it's a toy for people with more time than responsibility.


We are still arguing whether to bump the C standard from 89 to 99 where I work. I kinda like the pacing with C.

Or as Wikipedia puts it: "C17 addresses defects in C11 without introducing new language features."


Hopefully you won't have K&R C by the time you adopt C23 then.


It instantly inherits support for all OS's and architectures GCC support plus a lot of optimizations and debug information.


Rust already has a GCC backend that can do all that.

This post is about a new front-end.


A language with one implementation can't really be said to have a specification.

It may have very detailed accompanying technical documentation of what the implementation is supposed to do, and this may be called a specification, but a specification only deserves the name once there are at least two implementations.


This is technically incorrect. A programming language can be designed with a specification in mind, even a formal one (e.g. SML). It is just that, unless it is formally verified, the specification is not likely to be effectively verified before more than one real implementation lands. (Anyway, verification by testing against existing implementations _is_ the fallback where people cannot afford the cost of formal methods.)


It helps ensure that the language standard is relevant, and it can help separate the standard committee from the compiler developers, helping bring in more interest groups to the table (since they're not entirely beholden to a single compiler dev team).


Having two nominal competitors makes everything good. Like Pepsi/Coke, Republicans/Democrats.


This is answered in various replies to other comments.


Typical desktop Linux is compiled with gcc and even if rust features in gcc lag behind rustc this is still a great step in the direction of one of the larger goals many of us want from rust which is to replace C as a memory safe systems language for OS kernel and embedded projects.

A big part of getting a sane environment to build rust Linux modules is having it easily integrate with the gcc toolchain.

There seems to be some contention around whether this is a good idea for Rust as a language itself, and my counter to that would be that there are billions of people on the planet, and the work integrating Rust into GCC doesn't detract in any way I can see from the continued development of Rust as a language; it just helps make it more viable for many use cases and therefore increases its footprint and overall support.


I'm not sure if this is such a good idea at this point in the evolution of Rust? I mean, Go has a comparatively "tame" pace of changes, but gccgo is still always several months behind gc (which refers to the most-used go compiler, not garbage collection). With the higher volume of changes in Rust, lots of features will be unavailable in gccrust for what will probably feel like ages to some...


I think having some diversity in the Rust compiler space is an excellent idea. The rustc monoculture is really bad in the long-term, having multiple compilers allows for creating a much more robust standard (so that implementation bugs doesn't get "baked in" to the language) and benchmarking compile times as well as resulting binaries (in addition to many other benefits).

You're right that GCC Rust will probably be behind rustc for a while, but the sooner you start, the sooner it'll get there. The language isn't gonna "settle down" in that way any time soon, might as well just get going as soon as possible.


> The language isn't gonna "settle down" in that way any time soon

Actually, I'd say the language HAS settled down. Most new releases are stabilizing APIs or general tooling improvements and not actual language changes.

Not to say there aren't outstanding tweaks, adjustments, or improvements to the language that are ongoing, but rather, the target isn't moving nearly as fast as it was when 1.0 or rust 2018 were released.


FWIW, it does appear that at least for now, the project is targeting a version of Rust that is substantially behind the current stable version:

> For some context, my current project plan brings us to November 2022 where we (unexpected events permitting) should be able to support valid Rust code targeting Rustc version ~1.40 and reuse libcore,

(current stable rustc is 1.62, 1.40 is from Dec. 19, 2019)

Which is in line with what you're saying about gcc likely being behind initially, of course.

I bet it's easier to go from version 1.40 to 1.6x than from version nil to 1.40, though - you have to start somewhere, and the later you start, the longer till you have a viable alternative.


First of all, let's hope Rust development slows down in the long term as it matures. I think it is already starting as dependence on nightly is less of a thing than it used to be.

Secondly, Rust has strong backwards compatibility guarantees and you can pin your code to a certain edition.
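For example, the edition is pinned per crate in Cargo.toml (the crate name here is made up); a newer compiler keeps building older editions indefinitely:

```toml
[package]
name = "example"    # hypothetical crate name
version = "0.1.0"
edition = "2018"    # stays on the 2018 edition rules even with a new rustc
```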

Thirdly, one big use case for gcc would be compiling the Linux kernel that might contain Rust in the future. So it would be enough to support the subset of Rust that gets actually used in the kernel. I would imagine they would be very conservative about which features get used and adopt at a much slower speed.


> So it would be enough to support the subset of Rust that gets actually used in the kernel. I would imagine they would be very conservative about which features get used and adopt at a much slower speed.

For now, it's the opposite: the kernel needs several Rust features which aren't even stable yet (https://github.com/Rust-for-Linux/linux/issues/2). But I agree that, after Rust has been in use in the kernel for a while (and it no longer needs any unstable features), the kernel developers are going to be somewhat conservative about which features are required (but not about which features are used; they probably will use a lot of conditional compilation, like they already do for gcc/clang, see the compiler-*.h files).


I think the difference is that Go is less active than Rust.

It might turn out to be a bad idea. But at this point, gcc needs to try something to stay relevant. This is the first piece of news about gcc that made me go "whoa" in the last couple years.


Does this mean we could have Rust for AVR? Without all the quirks and schrodinbugs of LLVM AVR on Harvard architectures.


In the past month or so, the latest version of rustc/LLVM seems to produce correct machine code. It was very rough for a couple of years, though.


Oh, I hadn’t seen that. That’s exciting.


Great news! Hopefully this means that Rust will now support GCC's supported architectures [0] in addition to their own quite impressive and continuously improving list [1] [2], including those that are less popular these days. I hope this will directly help issues experienced by the Debian Ports project [3] [4], Gentoo Linux [5] [6], and possibly Alpine Linux [7] and NetBSD [8].

[0] https://gcc.gnu.org/backends.html

[1] https://doc.rust-lang.org/rustc/target-tier-policy.html

[2] https://doc.rust-lang.org/nightly/rustc/platform-support.htm...

[3] https://web.archive.org/web/20220223133124/https://people.gn...

[4] https://lwn.net/Articles/771355/

[5] https://news.ycombinator.com/item?id=26097153

[6] https://news.ycombinator.com/item?id=26203853

[7] https://wiki.alpinelinux.org/wiki/Architecture

[8] https://www.netbsd.org/ports/#ports-by-cpu


That is actually going to happen for rustc itself via the rustc_codegen_gcc project (https://github.com/rust-lang/rustc_codegen_gcc); this post is about a different project that writes a from-scratch Rust frontend/compiler for the GCC project in C++. They both compile Rust code with GCC as a backend, but one uses the normal rustc/cargo tools, and the other is a separate compiler with a separate cargo shim.


The vote of confidence I was waiting for.


Rust newbie here; I've used part of my summer to get into Rust. Clarify a few points for me.

Does that mean that (once this is merged) I can use gcc to compile Rust code? What are the advantages of this versus using the normal rustc compiler?

All the crate/project management would still be done with cargo? It seems to be a tool very entwined with Rust itself, so I wonder if anyone uses Rust without cargo?


I think most people would just use the existing Rust infrastructure; rustc etc. The GCC frontend will be useful in situations where you cannot use that; for platforms that LLVM/rustc do not support, when you want to build rustc from source instead of using the rustc binaries etc.


Why would someone want this? (Honest question)


The main source of interest for GCC Rust is because GCC targets more platforms than LLVM. A partially-overlapping interest is using Rust in Linux: Linux targets non-LLVM platforms, but is also deeply tied to GCC and Rust-for-Linux would have a lower barrier to entry if it doesn't require a second toolchain.

Long-term, it's considered a strong signal for the health and viability of Rust if it's not strictly tied to one implementation.


Yup. We have microcontrollers that we'd like to write rust on, but we can't because llvm doesn't support the architecture.


But that's not an argument for GCC-RS; that's an argument for rustc_codegen_gcc. The purpose of GCC-RS is a) licensing and b) fracturing the community to stabilise the language spec, something I'm very doubtful about for at least a few more years.


A question out of curiosity, since I don't know much about LLVM or compiler architecture specifically. Does LLVM not support some architectures because there is some barrier, or a higher barrier than with GCC? Does GCC make it easier to support some architectures, or is it just as easy to add support to LLVM but nobody does because GCC already supports them?


The vendor sometimes does this and GCC is much more standard in the embedded space. And crucially, it's required to be open source.

For example, for arm-none-eabi, ARM provides a GCC toolchain (https://developer.arm.com/Tools%20and%20Software/GNU%20Toolc...) but makes you purchase the LLVM-based one (https://developer.arm.com/Tools%20and%20Software/Arm%20Compi...). Now I'm fairly certain there's an open-source way to target ARM microcontrollers with LLVM but you can't download it from ARM...


That's so strange. Rustup does ship with support for baremetal EABI ARM:

    $ rustup target list | grep arm | grep none
    armebv7r-none-eabi
    armebv7r-none-eabihf
    armv7a-none-eabi
    armv7r-none-eabi
    armv7r-none-eabihf
... which means the publicly-available open source version of LLVM also supports it.


Sure I bet it does, otherwise Rust wouldn't work on perhaps the most common microcontroller target, and I know it does. I don't know if there are secret sauce bits ARM adds to their version, or if there are differences in hardware support or anything like that. (I use gcc on these targets...have never looked into using Clang/LLVM on them).

edit: looking into this more, yes, upstream clang/llvm should work, though it doesn't include a C stdlib for this target (of little import to Rust, presumably), so you'll have to find one (e.g. by taking it from the GNU toolchain) if you want to use a C stdlib on embedded. Who knows what secret sauce armclang adds to clang...


Where's the "purchase" part there? I see a download link, and it seemed to download the toolchain? I don't have any projects lying around to actually try it out, but it looks complete?

I briefly looked at the license and it seems that if you're getting it for free, then its all good, you just get no support.

(I only write Rust on ARM and so am unfamiliar with the various toolchains they offer, honestly.)


Ah, ok I saw this table: https://i.imgur.com/5ofzjC3.png and that trying to download required an account (which is not true for the GCC toolset) and assumed they'd make you pay something. But maybe they just try to trick you into thinking you might have to... (or I am easily fooled).

edit: My legalese is not great, but isn't the license saying that you can't legally distribute software compiled with the LLVM-based ARM tools if you obtained them for free? Obviously the GCC-based tools can't have such restrictions.

    3.2 NON-COMMERCIAL USE AND FREE OF CHARGE LICENSES: 
     ...
    (b) if you are receiving a Non-Commercial Use License or version (as applicable) of the Arm Tools: 
    (i)  you and your Permitted Users may use the Arm Tools for internal use only; and 
    (ii)  you are not permitted to distribute or sub-license (A) any part of the Arm Tools, or (B) Your Software,
          Your Hardware, or Your Reports developed under this License using the Arm Tools. The Arm Tools shall be
          used only by you and your Permitted Users, and you shall not (except as otherwise authorised in writing by Arm) 
          allow any other third party whatsoever to use the Arm Tools. 

    For the avoidance of doubt, if you are receiving a Non-Commercial Use License and the license is provided to you free of
    charge, the restrictions in both sub-clauses (a) and (b) above will apply to your use of the Arm Tools.

additional edit:

Here's what happens when I try to run armclang:

     $ armclang
     armclang: error: Failed to check out a license.
     The license file could not be found. Check that ARMLMD_LICENSE_FILE is set correctly.
I did not bother to figure out if there's a way to get a license without paying (though I would likely qualify for a non-commercial license as an academic user, but it seems like a pain).


There's that, but there's also the "got it for free" thing above that; it seems distinct from a "noncommercial license."

But who knows, honestly they make this stuff as confusing as possible, it seems.


It's a mixture of problems.

GCC targets more platforms than LLVM, that much is true. But beyond that, it's pretty common in the microcontroller world for a vendor to release their own specially patched version of GCC blessed for a given platform. They simply aren't doing that for LLVM.

Those blessed GCCs aren't often seeing the hacks merged upstream.

Here's one such example:

https://www.ti.com/tool/MSP430-GCC-OPENSOURCE?keyMatch=gcc&a...


Note that for providing more platform support, there is also rustc_codegen_gcc, which just replaces LLVM with libgccjit but keeps the rustc frontend.


Competition.

GCC had laughably bad error messages before LLVM caught on. Even then, LLVM generated terrible code compared to GCC. There's a similar competition going on with open source linkers (which are finally going multi-threaded).

Both compiler toolchains currently blow pre-LLVM GCC out of the water (and GCC development has noticeably accelerated since LLVM came out).

I'm not sure which one is better at which C++ thing these days, but I'll bet GCC Rust will beat LLVM Rust at some important things in a few years (and then vice versa).


And now clang is lagging in C++20 support, and modules will come in 2 to 3 years as per the roadmap (right after C++23 is done), while VC++ already has them, and GCC head is catching up relatively fast.


Similar story to Vim/Nvim and maybe partly OpenSSL/LibreSSL. Forking (not all technically forking) is good sometimes.


Another angle is that GNU wants this, because they want GCC to stay relevant and competitive. Rising popularity of LLVM-only Rust gives advantage to their competitor.


No worries, it will stay relevant. Now that Apple and Google have apparently pulled their support from clang, and most other vendors that benefit from the MIT license aren't that keen on pushing C++ changes upstream, clang is getting a nice third place in C++20 support.


Apple Clang is already a completely different beast than upstream, right? At least the version numbers are nonsensical unnecessarily complicating feature support checks in my experience as someone who doesn't own any Macs but writes software that others insist on trying to compile on a Mac...


Yes, hence having its own column on cppreference, or a special flavour of bitcode for watchOS.

In what concerns Apple, I think they mostly care about the C++ support needed to keep LLVM going, the C++14 based dialect for Metal Shading Language, and the subset used across IO and DriverKit.

For everything else there is Objective-C and Swift.

Then Google apparently dropped off clang after the ABI-break votes didn't go the way they wanted, so they are now focusing on Abseil, and their style guide is anyway quite restrictive.

So now we are in this ironic situation that VC++, of all compilers, is the one with the best C++20 support, closely followed by GCC, and then there is clang and the other lesser-known ones still lagging in C++17 and earlier.


On the GCC side, there are no plans to stop supporting building applications using Clang and libstdc++ (that is, using the libstdc++ installed headers and the GCC-built libraries/shared objects). Therefore, a question mark next to libc++ doesn't necessarily threaten the long-term viability of Clang as a C++ compiler. (I assume your Abseil comment is actually about libc++.)


My Abseil comment is about where Google now rather sees their employees spending time on C++ libraries for them.

At least from the comments I sometimes see flying by on Reddit.


Yep, the perma-frozen ABI decision sounded the death-knell for C++. Basically the committee committed collective hara-kiri with that decision. Thou Shalt Not Progress!

It's so extraordinarily, mind-bogglingly stupid. But I guess it is time for the ageing C++ queen to be pushed off this mortal coil and let Princess Rust grab her long-overdue system crown.


Princess Rust has no proper throne on the kingdom of binary libraries.


Without ABI change C++ is trapped forever with bad decisions. In some cases these are decisions which were bad twenty years ago and are still bad today but should be fixed, in other cases these are decisions which were probably the right call twenty years ago but now you'd choose differently.

Did you know on MSVC a std::mutex is so huge it needs more than one cache line ? Obviously Microsoft aren't stupid, they know how to fix that... but it would change the ABI so they can't touch it.

And so the Zero Cost Abstraction promise "Don't pay more than it would cost if you did it yourself" becomes "Eh, just do it yourself" everywhere - C++ programmers learn to hand roll all the basic stuff they need, because ABI stability has ensured the standard library mechanisms are slow, or bloated, or both.


Doesn't matter, binary libraries are a big deal for many businesses, regardless of ABI constraints in C and C++ languages and compilers.

Either Rust wants to play on that field, or it doesn't.

Microsoft has broken their ABI plenty of times, and VS vNext might be when the next break will take place, which was initially planned for VS 2022.

Bashing C and C++ ABI issues on HN will do very little for those businesses to adopt source code distributions.


You can write and use binary libraries in Rust, you're just restricted to using the stable C ABI to interface across separately-built components. Spoiler: You'll want to do this anyway, because it's required for any sort of C-compatible FFI regardless of language.
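A minimal sketch of what "use the stable C ABI" looks like in practice (the function and type names here are illustrative, not from any real library):

```rust
// `extern "C"` fixes the calling convention; `#[no_mangle]` keeps the
// symbol name predictable for a C (or other-language) caller.
#[no_mangle]
pub extern "C" fn add(a: i32, b: i32) -> i32 {
    a + b
}

// `#[repr(C)]` gives the struct a C-compatible, stable field layout:
#[repr(C)]
pub struct Point {
    pub x: f64,
    pub y: f64,
}

#[no_mangle]
pub extern "C" fn point_norm(p: Point) -> f64 {
    (p.x * p.x + p.y * p.y).sqrt()
}

fn main() {
    // Callable from Rust too; a C caller would declare matching
    // prototypes in a header and link against the compiled library.
    assert_eq!(add(2, 3), 5);
    println!("{}", point_norm(Point { x: 3.0, y: 4.0 })); // prints 5
}
```

Built as a `cdylib` or `staticlib`, these symbols are usable from any language that can call C, which is what makes the C ABI the de-facto boundary for binary distribution.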


Nope, Windows C++ compilers have supported DLLs since Windows 16 bit days.

Some famous C++ frameworks that ship as binaries are OWL, VCL, FireMonkey, MFC, ATL, WinUI.

And WinRT has improved upon classical COM ABI to provide even more data types.

On Windows there is a component market selling libraries for those ecosystems.

Then we have middleware vendors for game consoles as well.

Maybe this isn't a market Rust community cares about and that is fine.


This is not the only case. It harms not only performance, but also conformance, e.g. https://developercommunity.visualstudio.com/t/unable-to-move....

Microsoft has been stuck with this ever since the first implementation decision that leaked into the ABI was made; there was no one preventing such bad decisions from shipping into production at the very beginning. Both libstdc++ and libc++ have more flexible rules for preventing ABI breakage.


I don't know, ABI changes are super annoying.


I didn't see this mentioned in any replies, but I think another reason why this is wanted is Rust in the Linux kernel. If Rust can compile using GCC tooling, it should be easier to integrate into the build process for the Linux kernel. As it stands now, if Linux kernel development allowed Rust in areas of the kernel outside of modules and drivers, you would need both the GCC toolchain and the LLVM toolchain to build the kernel. So it also meets a requirement, or at least greases the wheels, for getting Rust into the Linux kernel.


And while you can link a gcc-compiled kernel with llvm-compiled rust, apparently you lose some link-time optimization, and with that control-flow integrity [1]. That's a potential security concern for the kernel, in addition to the usability issues of using two toolchains.

1: https://www.cs.ucy.ac.cy/~elathan/papers/tops20.pdf


rustc_codegen_gcc will solve the same problem. I would expect it to be possible to do LTO and CFI across rustc_codegen_gcc and gcc-compiled C or C++ code.


GCC already has compiler front ends for Fortran, Ada, Go, and formerly Java. Adding another language, especially one as popular as Rust, couldn't hurt.


also D and Objective-C!


Huh, I didn't realize that


A common misconception is that GCC stands for GNU C Compiler, but it stands for GNU Compiler Collection. Perhaps why you did not realize ;)


GNU C Compiler was the original name. A lot of us probably never bothered updating our mental acronym tables.


You mean to tell me that the acronym settled in your head between March and December 1987 and you never reconsidered since then?


Nope. It settled in my head before it was renamed in 1999: https://gcc.gnu.org/wiki/History#Reunification


I didn't know that either! And I've even used it before, haha


But I assume the name of the gcc binary stands for GNU C Compiler?


Multiple implementations of a standard help shine light into dark corners.


Indeed, but I think they should first create a standard?


If not a standard, then at least a specification. This discussion is from 2020, so I'm not sure if it's still up to date:

https://users.rust-lang.org/t/where-is-the-rust-language-spe...

"For the most part though, rustc itself is the spec" - so, for implementing the GCC frontend, read the current LLVM frontend really carefully and do everything the same way? And try to keep up with all the changes which will inevitably happen in the LLVM frontend while you are implementing the GCC frontend?


> so, for implementing the GCC frontend, read the current LLVM frontend really carefully and do everything the same way?

At a high level, this is correct: https://github.com/Rust-GCC/gccrs/wiki/Frequently-Asked-Ques...

> If gccrs interprets a program differently from rustc, this is considered a bug.

I don't believe they are only reading the frontend, but using the reference first, then looking to the implementation second, and asking a lot of questions along the way.


Integration-style tests are a good tool here. Given a set of programs in the target language with certain expected behavior on the reference compiler, you check that compiling and running with the second compiler produces the same behavior.


There are various reference documents for Rust, but how do you know if they're complete and accurate without creating a working interoperable implementation?

ISO will stamp any fantasy like OOXML, but that doesn't mean the standard is useful.


Having another implementation first actually helps create a good standard, by highlighting things that are implementation-specific assumptions.


Rust already has stable standards. There's Rust editions 2015 (1.0), 2018 (1.31.0), and 2021 (1.56.0).


Unfortunately that's not quite the same thing as a standard in the sense that is meant here. A 'standard' is a written document that explains how the language should work. A standard that just says "the compiler is correct if it compiles code the way that rustc 1.xy does" is technically unambiguous in that it provides a procedure by which a compiler author can check their work, but leaves quite a bit on the table and is probably the wrong goal to aim at in the long term, even if it enables forward progress for now. (For example, we know rustc contains bugs; do we want gcc-rust to include all the bugs, or should it differ from rustc in that respect? Which behaviors are bugs, which are undesired but accepted quirks of the language, and which are intentional? For that matter, it's unreasonable to expect gcc-rust to produce binaries that are byte-for-byte identical, so what other mechanism should be used to determine equivalence?)

(btw, there's also https://doc.rust-lang.org/reference/)


Correct.

With an ideal standard document that describes the language, someone could build a compiler for the language from scratch and it should behave correctly. See https://en.wikipedia.org/wiki/Programming_language_specifica...

Some example standards:

C: https://www.open-std.org/jtc1/sc22/wg14/

ECMAScript (Javascript): https://tc39.es/ecma262/

C++: https://isocpp.org/std/the-standard

Scheme: https://schemers.org/Documents/Standards/

Ada: http://www.ada-auth.org/standards/ada12_w_tc1.html

For Rust, Ferrous Systems is working on the Ferrocene Language Specification to formally document the Rust subset that Ferrocene will use.

https://ferrous-systems.com/blog/ferrocene-language-specific...

https://ferrous-systems.com/ferrocene/


Another reason besides platform support, is bootstrapping and verifiable builds. rustc can only be built by a recent rustc, so it is a pain to bootstrap from source if you don't want to trust a big binary.


https://github.com/thepowersgang/mrustc is already providing that; it targets 1.54 right now, which is farther along than this project (though obviously that may not be always true in the future).


Ah, I didn't realize it could do 1.54 now, neat! Last time I looked at the state of bootstrapping rustc in Nix and Guix, mrustc could only do 1.20-something, and you had to build a dozen rustcs to get to the current version. Does mrustc keep up now though? With 1.63 in beta it's still quite a chain to build from 1.54.


A significant chunk of this drive is for the Rust/Linux kernel project.


Gcc has more stable hardware targets than llvm, for starters.


Note that there is a separate project to use the GCC backend with the Rustc frontend. At least in the short-term, that project will be the better approach to providing the benefits of more hardware support, since it will be more complete and compatible with the rustc/llvm compiler.

The main benefit of a GCC front-end is to move Rust beyond a single vendor, and shakeout ambiguities in the (informal) language specification.


It is also useful for building rustc from source without any rustc binaries, although mrustc (a Rust to C transpiler) was supposed to be good for that too.


alternate architectures; plurality


Isn't this already addressed by the GCC backend for rustc? (Via libgccjit)


This is very exciting for a large C/C++ project I work on. From what I understand, someone could reasonably pilot a single Rust module to see how things go without shaking up the entire build system.


Does having Rust support in GCC mean potentially faster Rust compile times compared to compiling on the LLVM platform?


I mean maybe, sure? I don't think that's their end goal; their main purpose is to have a GCC version for GCC folks, and whether it's faster or slower than LLVM isn't a goal. I think the first goal is to get it done and stable, with forces in place to keep up with the ever-updating Rust spec(s). It would also allow for a much wider number of platforms, as GCC has much higher coverage of platforms, especially legacy ones.


I doubt it, especially if you compare against the GCC rustc backend. I would expect most of the difference (if any) to come from the frontend. For example, if it doesn't do borrow checking, it might be able to shave off a bit of time. But at that point, it highly depends on the workload (i.e. the code you're compiling). I would expect the vast majority of Rust code to bottleneck on codegen, in which case there should be approximately zero difference. For code bottlenecked on borrow checking (which I would expect to be an outlier), gccrs should be faster while ignoring borrow checking. (Although my understanding is they plan to eventually use Polonius for borrow checking, a Rust library/project that originally intended to replace the current borrow checker in rustc but still hasn't; in that case, it would probably be faster for some borrow checking and slower for others.)


No. LLVM is faster than GCC, so most likely it'll be slower. If the way the frontend generates code for the backend is completely different, then there's a chance it won't be terrible and will have reasonable build speeds. There's also a chance it'll be inconsistent with LLVM output, but chances are no one will notice or care if it's not buggy.


Doubt it. But that's OK.


Just like Mozilla's browser might be eventually a footnote in history (I hope not), this too has a potential impact greater than Open Source Security's main project (grsecurity). Kudos.


At 3% market share it looks like that might eventually become true, and it is still my favourite browser.


The Rust borrow checker is under-specified (for a good reason). I can foresee lots of incompatibilities with two implementations.


It won't be. They plan to use the exact same code!

Rust is designed to compile fine without any borrow checking (it reduces it to the C level of safety, but valid programs generate valid code).

GCC will compile itself without a borrow checker. Then it will compile the existing borrow checker written in Rust, and then recompile itself with borrow checking.


> They plan to use the exact same code!

Did they announce a change of plans? Their website just says that they have no plans for a borrow checker (i.e. it's not required to actually implement rust).

From the website for the project:

> There are no immediate plans for a borrow checker as this is not required to compile rust code and is the last pass in the RustC compiler. This can be handled as a separate project when we get to that point.

Link here: https://rust-gcc.github.io/


The "separate project" you mention is integration of Polonius: https://lwn.net/Articles/871283/


I appreciate the link, thank you.


[flagged]


If my information was inaccurate or out of date, you could have corrected me. I find your antagonizing cheer hurtful, and such a mean comment is unhelpful. Please be kinder.


The Rust type system wonks are highly motivated to precisely specify the entire type system, including the borrow checker. A new Types Team was formed earlier this year with the goal of producing a specification against which new type system features can be prototyped for soundness before being approved (borne out of, among other things, frustration with the long-unstable and long-incomplete "specialization" feature, which nobody has yet found a sound formulation for).


Yeah, this would worry me too. It feels like the rustc policy on this is just to make the compiler smarter and smarter so it can accept more code that's actually valid (but that it couldn't prove valid in previous versions). That's a great goal, but it makes it really hard to nail down what is and isn't allowed in a particular version of Rust. And if we assume that gcc Rust will always be playing catch-up to rustc, they will probably have a hard time enumerating rustc's borrow checker's behavior precisely enough to replicate it exactly in gcc Rust. So that means some code that rustc is ok with, gcc Rust might not be, and vice versa (even though all the code might be fine).


What good reason would there be for under-specifying it, that would create problems? Like, I can see wanting to leave flexibility in the implementation, but then different implementations should be fine if they both match what spec there is. The only problem would be if one implementation is more strict than it "admits" to being (in its written spec), which is a problem anyways.


I think there are two things that could be specified here:

1. What the borrow checker protects against, and what constitutes valid code.

2. What the borrow checker can actually prove is valid code.

#2 tends to be much less than #1, because the compiler can only be so smart, and if it can't prove something correct (even if it might be correct), it takes the safe route and refuses to compile the code.

#1 is probably pretty well specified, or at least wouldn't be hard to write down if someone really wanted to. #2 is a moving target, because the borrow checker in rustc gets smarter with some releases, and compiles code that it used to reject (because it's been taught to understand that code better and can prove it correct). So it's a lot harder to specify exactly what kinds of code the borrow checker will accept and reject.
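A concrete instance of #2 moving (offered as an illustration): this snippet was always valid per #1, but the old lexical borrow checker rejected it, while non-lexical lifetimes (NLL, default since the 2018 edition) accept it because the checker got smart enough to see that the borrow ends before the mutation.

```rust
fn main() {
    let mut v = vec![1, 2, 3];
    let first = &v[0];           // shared borrow of `v`
    println!("first = {first}"); // last use of the borrow
    // Pre-NLL rustc: error, `v` is still borrowed here.
    // Under NLL: fine, the borrow ended at its last use above.
    v.push(4);
    println!("{v:?}");
}
```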

I could easily see a situation where gcc-rust claims to support all the language and stdlib features that a particular version of rustc supports, but has subtle differences in borrow checker behavior that causes it to reject some code that rustc accepts (or accept some code that rustc rejects). The behavior isn't incorrect, per se, but it would be frustrating for developers.

It's also possible there could be similar issues with lifetime analysis.


a naive question - would it be faster than the "regular" rust compiler?


I can't seem to find any benchmarks around it. I'm assuming as this gets further integrated into GCC, you'll start to see those crop up.


Hopefully this will slow down development of new Rust features to a reasonable pace. Instead of Rust devs writing code that requires a new compiler every 3 months, it'll be more like every year or two. Almost usable.


This is an odd take that I feel is rooted in a misunderstanding.

rust intentionally has a fairly rapid release schedule, so that you don't end up with a big release every year that introduces several possibly breaking changes.

Features aren't randomly scheduled for each release. Instead each release is a snapshot of what's been stabilized by that point.

If you prefer some arbitrary concept of stability, then you can quite easily lock to any version of rust this far and ignore any new versions till something catches your fancy.


This is the perspective of a developer. I am trying to explain the perspective of an open source desktop user who's run into backwards incompatibility in 4 out of 5 Rust tools I've attempted to compile for use.

The Rust language is fine. Great, even. Most Rust devs are bleeding edge types that always use the latest features. Hopefully this changes as Rust matures.


The rust compiler rarely breaks backwards compatibility, so I'd be curious to see what issues you actually hit.

Perhaps you mean forwards compatibility, in which case yes, perhaps that's an issue, but I'm not sure why you'd want a bigger, slower release cadence when it would likely make your problem worse: libs update, but updating your compiler potentially causes significant other changes, as is the case with C++ for example.


I feel like the set of people who

* Want to compile the software, not just install a pre-built binary

* But don't want to actually hack on the software, and so don't care about up-to-date tools

... is probably rather small. Even for them, though, surely just an old source version will work? Or is there some reason that doesn't help?


I'm not sure I fully understand the gripe you have here. Rust releases a new compiler every 6 weeks. Many releases just improve optimization and a handful of newly stabilized APIs, or existing APIs newly made `const`, or whatever. When a big feature is finished, it is put out in the next release.

There is no sense in which this increases the pace at which these new features are developed or released; if Rust released a new version yearly, we'd simply see extremely large releases some years (including, say, the new async/await system) and relatively small releases in others.

What it does mean is that backwards-compatible refinements and bugfixes can be quickly and easily added to the language, which I think is a good thing.

I'd be interested to know what you'd prefer, though!


What does a "front-end" mean exactly, in this context?


At a high level, GCC and LLVM are divided into two ends: the front and the back. The front end takes the source code and turns it into some form that's language agnostic (e.g. LLVM bitcode). The back end takes those structures and turns it into machine code. This allows the compiler to be modular.

Old compilers didn't use this concept, so a C->x86 compiler and a Fortran->x86 one couldn't share code as easily. And if you wanted to expand to, say, ARM, Alpha, etc. targets, it gets worse: you end up needing O(l*t) compilers[a] to cover everything. It gets even worse if you want to add optimization. With this model, the whole system is working on an AST, and therefore the optimizers are tailored to the individual compiler variant, so they end up being just as unportable.

However, with a modular design (front and back end), you can have C->IR and Fortran->IR front ends, then a single back end for each target: IR->x86 and IR->ARM. If you want to add, say, Ada support, you only need to write an Ada->IR module, and the back ends handle the rest. In the end, you only need O(l+t) modules. You also get the benefit of agnostic optimizers as they only need to support your internal IR.

[a]: 'l' is languages and 't' is targets
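The l+t economics can be sketched in a few lines (a toy, with all names invented for illustration): two tiny "frontends" lower different surface syntaxes into one shared IR, and a "backend" emits code from the IR alone, so adding a language or a target means writing one module, not l*t compilers.

```rust
// Shared IR: the only thing frontends and backends must agree on.
enum Ir {
    Const(i64),
    Add(Box<Ir>, Box<Ir>),
}

// Frontend #1: "1 + 2" surface syntax -> IR.
fn frontend_infix(src: &str) -> Ir {
    let (a, b) = src.split_once('+').expect("expected `a + b`");
    Ir::Add(
        Box::new(Ir::Const(a.trim().parse().unwrap())),
        Box::new(Ir::Const(b.trim().parse().unwrap())),
    )
}

// Frontend #2: "add 3 4" surface syntax -> the same IR.
fn frontend_prefix(src: &str) -> Ir {
    let mut it = src.split_whitespace();
    assert_eq!(it.next(), Some("add"));
    Ir::Add(
        Box::new(Ir::Const(it.next().unwrap().parse().unwrap())),
        Box::new(Ir::Const(it.next().unwrap().parse().unwrap())),
    )
}

// Backend: IR -> pseudo-assembly for an imaginary stack machine. A second
// target would be one more function like this, reused by every frontend.
fn backend_stack(ir: &Ir, out: &mut Vec<String>) {
    match ir {
        Ir::Const(n) => out.push(format!("push {n}")),
        Ir::Add(a, b) => {
            backend_stack(a, out);
            backend_stack(b, out);
            out.push("add".to_string());
        }
    }
}

fn main() {
    // Any frontend composes with any backend through the IR.
    for ir in [frontend_infix("1 + 2"), frontend_prefix("add 3 4")] {
        let mut asm = Vec::new();
        backend_stack(&ir, &mut asm);
        println!("{}", asm.join("; "));
    }
}
```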


Note that the language independence of the middle layer(s) is often incomplete. Language-level constructs sometimes get smuggled through the intermediate layer and get translated in code generation. Or the middle layer is theoretically generic but valid variants that haven't been used before are buggy or unimplemented.

I think Rust on LLVM has a major example of this? IIUC, it's a big problem to turn on strict aliasing because LLVM doesn't actually implement it correctly, it just works well enough for C/C++ etc.? (But I'm no expert.)

Still, the frontend/backend split gets you a lot closer to supporting all of the l*t pairs than the all-in-one approach. Porting is a matter of fixing bugs and filling in missing parts rather than starting from scratch.


The issue was that LLVM codegen produced buggy output with strict aliasing, but this wasn't really caught until Rust since Rust can use it several orders of magnitude more frequently than C / C++ frontends would.

AIUI, the solution wasn't that Rust smuggled the language-level construct through the intermediate layer, but that they simply held off on expressing these constraints until bugfixes could be merged upstream into LLVM. The only downside to disabling this was simply that compiled code would potentially be a bit less optimal than otherwise, so such tactics weren't really necessary.


> I think Rust on LLVM has a major example of this? IIUC, it's a big problem to turn on strict aliasing because LLVM doesn't actually implement it correctly, it just works well enough for C/C++ etc.? (But I'm no expert.)

It's noalias (C99's restrict, in essence) that you're thinking of, not C/C++ strict aliasing rules.


For sure, leaky abstractions will get through; Rust has had quite a few issues LLVM (you mention one), but it's still a lot better than the all-in-one system we had before.


Worse than not being completely "language independent" sometimes LLVM's Intermediate Representation is not well defined at all.

This is most common where C++ (as the main consumer of these semantics) doesn't define certain semantics, or where the semantics it standardises are just impossible to optimise, so nobody really delivers them (i.e. they don't always work in your C++ programs when you compile them with an actual modern C++ compiler).

As I understand it an example when it comes to aliasing would be what happens if somebody typed a bunch of ASCII into a prompt and then the program... uudecodes the ASCII to get an integer, and begins just using that as a pointer to rummage around in some data structure.

Is that... OK? Obviously you can't do this in safe Rust, but even unsafe Rust says er, no, that's definitely not allowed (it might actually work but it isn't OK). But the ISO C++ standard says that so long as the ASCII string happened, by some cosmic accident, to be a "valid" address for a pointer after this transformation this works fine even if it now magically aliases a pointer we otherwise had no reason to believe could be aliased.

If we allow this, many optimisation opportunities vanish. So, LLVM doesn't allow it. Our "correct" C++ pointer uudecoding program doesn't work. However, a blanket prohibition blows up real tricks that people actually do, such as hiding flag bits in address values (unlike uudecoding ASCII inputs to make pointers). So LLVM provides behaviour that's not formally standardised anywhere but can be thought of as something akin to PNVI-ae (Provenance Not Via Integers - Address Exposed). If the program "exposes" addresses from pointers, then the compiler assumes it could see those addresses "magically" appear from somewhere else (e.g. a uuencoded string) and so it must choose optimisations accordingly.

One day perhaps C++ will actually document PNVI-ae or some similar scheme in the ISO standard. That day is not today (nor next year; this will not happen in C++23), but meanwhile you've got the problem that in unsafe Rust you actually get whatever arbitrary semantics were delivered by LLVM. Since they're not just "this is what C++ does" but only "this is how C++ works in Clang", that's even less portable.

As I wrote, this only burns unsafe Rust. If you don't write unsafe and just depend on other people's stuff to use that as necessary (e.g. obviously the standard library is full of unsafety) then it's not your problem to fix this when it breaks. But it sure would be nice if the people writing unsafe code would have more certainty on this sort of topic.

Rust is experimenting with "Strict Provenance" rules to see what happens if, instead, it makes its own provenance rules and dispenses with waiting for C and C++ programmers to actually decide what the rules are in their languages. https://doc.rust-lang.org/std/ptr/index.html#strict-provenan...
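For a flavor of what the strict-provenance rules look like in practice, here's the classic flag-bits-in-the-address trick done without ever laundering the pointer through a bare integer cast (a sketch; these pointer APIs were nightly-only experiments at the time of this thread and have since been stabilized):

```rust
fn main() {
    let x: u32 = 42;
    let p: *const u32 = &x;
    // u32 is 4-byte aligned, so the low two address bits are free for tags.
    // map_addr changes the address while *keeping* the original provenance,
    // so the compiler always knows this pointer derives from `&x`.
    let tagged = p.map_addr(|a| a | 1);
    let flag = tagged.addr() & 1; // read the smuggled bit back out
    let untagged = tagged.map_addr(|a| a & !1);
    assert_eq!(flag, 1);
    assert_eq!(unsafe { *untagged }, 42); // provenance intact: still points at x
    println!("tag = {flag}, value = {}", unsafe { *untagged });
}
```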


This is incorrect. C and C++ have aliasing rules; the compiler is allowed to assume that your magic pointer does not alias an object that otherwise you had no reason to believe could be aliased. Google "strict aliasing rule" to learn the details.

Essentially, C and C++ already have a version of provenance for pointers and references, but it isn't identical to the Rust version.

Because some old codebases pull tricks of this kind (messing with pointers, doing casts without going through unions, etc), gcc (and clang) have a flag, -fno-strict-aliasing , to disable some optimizations that would be enabled by taking aliasing/provenance into account.


I understand the confusion. C and C++ do indeed have aliasing rules and those rules do, as you point out, forbid type punning tricks. But as you can see I wasn't talking about type punning. Our uuencoded value isn't a type pun, and our pointer isn't the wrong type, it's the correct type with, miraculously, the correct value - It's a magic trick!

The C++ Standard is OK with this magic trick. Executing this trick in the C++ abstract machine is no problem at all.

However in a real world compiler this is a huge problem - because the optimiser assumes I can't possibly have such a pointer. Where would I get it from? This is where provenance comes into the picture. Where did my pointer come from, that's what provenance means. The C++ standard has nothing to say about provenance but your compiler depends on it.


You are incorrect. The C++ standard says a whole lot about it, and has detailed aliasing rules and rules about which pointers are valid. Your "magic trick", depending on details, isn't a standard conforming program; the standard will say that it has undefined behavior. That's why, as you say, the optimizer assumes you can't possibly have such a pointer, but what you get wrong is that the C++ standard describes the rules, in detail.

I've been doing this stuff for a long time, going back to egcs and even before.


>> The C++ standard says a whole lot about it

Nope, it says nothing whatsoever on the subject of provenance. Periodically people make a run up and try to get this fixed. N2676 is currently in front of WG14 (i.e. the C standards committee) with the long term hope that if WG14 takes this fix, or something like it, WG21 (C++) could be persuaded to eventually take a similar fix - although lots of people don't like N2676 and want something else (the more vague your "something else", the more popular). If the standard had "a whole lot" to say about provenance, you'd be able to quote some of it; I suggest reading N2676 for an example of what is not yet standard.

http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2676.pdf

> has detailed aliasing rules and rules about which pointers are valid

Aliasing rules forbid some type punning shenanigans, but again, and I will repeat myself, that's not what's going on here.

DR260 (Defect Report number 260, about twenty years ago) asks WG14 what their standard says about such magic tricks, and their response basically says well, compilers are allowed to somehow know about provenance, so actually our Standard is correct and this is working as intended. What's working as intended? Well, whatever your compiler actually does.

What a great standard! Note that this response isn't incorporated into subsequent versions of the standard, it's just basically known in the industry, oh yeah, that's DR260, don't worry about it.

> I've been doing this stuff for a long time, going back to egcs and even before.

That's nice, but it's not terribly relevant here, except that it means you remember an era when people didn't even realise this was a problem. It still was a problem, they just didn't know it was a problem yet. And hey, you know about that experience too, because apparently you didn't know this was a problem after 2004 either.

Rust would like not to kick this can down the road for 18 years and counting, which is why Aria Beingessner's experiment is happening.


As a small point of order here (not saying you're wrong, just to elaborate a bit):

* Rust doesn't have strict aliasing/tbaa

* Rust does have "restrict" semantics (in C via the keyword and in C++ via common vendor extensions)

And yeah, Rust is considering a provenance model that's different than PNVI-ae.


Yes, understood. But in Rust (without going through unsafe code), you can't write a program that will produce a reference to a u16 that overlaps with half of a global u32. In C/C++ you could cast a pointer that way and turn it into a reference, but the compiler is allowed to assume, even though you wrote that, that the u16 (unsigned short) reference can't refer to some particular global u32 (unsigned int), and if it has it in a register it won't reload it. The program has undefined behavior (you could use unions and get implementation-defined behavior, which will depend on little-endian vs big-endian).
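In safe Rust the same bits are reachable, but only through explicit, well-defined copies rather than an overlapping reference (a sketch; like the union trick, the result depends on endianness):

```rust
fn main() {
    let x: u32 = 0x1234_5678;
    // No &u16 ever aliases half of `x`; the bytes are copied out explicitly,
    // so there's nothing for the optimizer's aliasing assumptions to break.
    let [b0, b1, _, _] = x.to_ne_bytes();
    let half = u16::from_ne_bytes([b0, b1]); // 0x5678 on little-endian targets
    println!("{half:#06x}");
    assert!(half == 0x5678 || half == 0x1234); // depends on endianness
}
```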


> without going through unsafe code

None of this matters in Rust without unsafe. Pointers technically exist outside unsafe, but you can't do very much with them (you cannot, for example, dereference a pointer). And yes, Rust says that the attempted type pun you describe, which you would need unsafe to create, is Undefined Behaviour.

> you could use unions and get implementation-defined behavior

Type punning with unions is also Undefined Behaviour in C++. The sanctioned way to perform type punning is to memcpy() the data from type A to type B. C++ 20 provides a built-in way to ask for this to be done std::bit_cast
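For comparison, the Rust analogue of that sanctioned route is the same bytewise-copy idea (a sketch):

```rust
fn main() {
    let f = 1.0f32;
    // Bytewise copy between same-sized types, like memcpy() into a u32...
    let bits = u32::from_ne_bytes(f.to_ne_bytes());
    // ...and std's built-in, playing the role of C++20's std::bit_cast.
    assert_eq!(bits, f.to_bits());
    println!("{bits:#010x}"); // prints 0x3f800000, the IEEE-754 encoding of 1.0
}
```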


Would it ever make sense to build a chip/vm (something like the JVM) designed to run the intermediate code directly?


Been there, done that!

https://en.wikipedia.org/wiki/Jazelle

Jazelle was an instruction set extension to allow natively running Java bytecode on Arm processors. Arm never seemed to give it the proper care and attention necessary for it to make any actual market impact; they didn't even publish the ABI.

Transmeta, that used to employ Linus Torvalds at one stage, had Code Morphing Software: https://en.wikipedia.org/wiki/Transmeta#Code_Morphing_Softwa...

It was built into the silicon and essentially operated as a full JIT for x86 software, translating to a back-end VLIW instruction set. Transmeta seemed heavy on the marketing and buzz (hiring Linus was very much a marketing move), light on the actual execution. Crusoe didn't even remotely live up to their own buzz, and it just about killed off their chances of success.


Generally not. There are cases where something akin to intermediate code is shipped (e.g., SPIR-V or PTX for GPU code), but even then, the intent is that it is converted to a binary machine code in the driver.

Intermediate code tends to have downsides that make it hard to work with in an actual hardware implementation. Infinite registers, for example, which makes encoding instructions challenging.


Generally, no. The IR will be tuned for compiler purposes, not execution purposes. It can be possible but it won't be optimal. For instance, the IR will probably retain type information that generally CPUs don't care about, since they live in a world of bits. The IR will have been designed knowing it is upstream of optimization code, so it won't be something that is already optimal, it'll be something designed to be easily made optimal in many different ways.

As Blackthorn mentioned, specifically it can be done. There was a chip that ran JVM bytecode directly. There was an entire Lisp Machine which was very influential... but not because of its high performance.

Wikipedia has a page on this: https://en.wikipedia.org/wiki/High-level_language_computer_a...

Many people decry the lack of innovation in computer architecture over the past 50-60 years. There is a legitimate reason for that, though: Any such innovation had to outrun the exponential increases conventional architecture was experiencing. We had it for so long we could take it for granted, but while computing wasn't the first and won't be the last to see periods of exponential growth, I can't think of anything else that had the length of run that computer chips had. (And even now, it's only slowed down. There's still multiplicative advances rather than additive ones every year. The base on the exponent is just smaller than it used to be.) There's been a number of architectures based on running IR (or an equivalent directly) that didn't pan out, but there's also some that did run OK, but they just couldn't keep up with the exponentially accelerating behemoth.

I think if silicon continues to plateau, there is some hope that this could change. However, a new challenger has appeared! Now it's not good enough to just outrun what a "conventional" architecture can do with CPU and RAM and peripherals... now you also need outrun what a GPU-based architecture can do! This sucks up a lot of the oxygen. For instance, right now you see some tentative steps towards special-purpose AI silicon, but it has to do something amazing to beat out a GPU, and you also have to be really, really sure that the AI technique you are committing to silicon will still be the state-of-the-art in the couple of years it'll take you to bring it to market. AI hasn't been moving as fast as, say, JS front-end frameworks, but it's still moving fast enough that I'd be nervous in investing in that vs. the same amount of effort poured into optimizing a GPU algorithm on GPUs that are still advancing a lot every year.


I don't think these limitations are ones that really impact hardware all that much. It wouldn't be impossible to create hardware with the concept of "infinite registers" that ultimately gets optimized away. (see: Mill CPU)

The bigger issue (IMO) is that the IR for compilers tends to evolve rapidly while hardware is stuck in the mud. Moving the problem of finalizing the IR into the hardware will effectively make it so you'd change your compiler to emit IR that targets old IR.

This is why, for example, x86 will get a new instruction which will effectively go unused for 5 to 10 years (except in extreme cases where the performance gains are worth the cost of writing code to detect and use those new instructions... For example, FMA). Those who compile code want it to be able to run most anywhere in not so surprising fashions, so they'll target the LCD.

Now, imagine someone was building a JVM bytecode chip today. The JVM is on a 6 month release schedule; it's incredibly hard to expect a hardware manufacturer to keep updating their chips at that rate. Further, even if they targeted "LTS" versions, those have moved to a 2 year cycle. Again, hard to really expect customers to want a new JVM chip every 2 years.

The likes of GCC and LLVM IR change and expand just as rapidly (if not more so) than the JVM.

There is the option of something like FPGA or programmable hardware leaking into more places, but IMO, the HDLs and their tools simply suck too bad to expect FPGA on an ASIC to really take off. We need a first mover here and after that happens, I expect at least 5 to 10 years before the tools get to a level that doesn't completely suck. Even then, while access to GPGPUs is now pretty much universal, GPGPU programming still feels like it is in the stone ages. Despite being a thing for over 10 years. So how can we expect HDLs to evolve at a faster pace?


" Even then, while access to GPGPUs is now pretty much universal, GPGPU programming still feels like it is in the stone ages. Despite being a thing for over 10 years."

I'd be interested in reading an article from someone who has been doing it for 10 years as to why that is the case. I have theories but nowhere near enough direct experience to evaluate.

(My hypothesis is that the extreme parallelism makes it so very tiny mistakes have catastrophic performance impact by introducing accidental serialism, and as a result, it is very difficult to create an "easy to use" framework that doesn't abstract too much away and make it trivially easy to introduce even a tiny such error and crash performance. We actually make this mistake all the time in conventional CPU code, it just generally just costs you small integer multiples of performance instead of large integer multiples of performance.)


My hypothesis is more cynical.

We have 4 major GPGPU manufacturers (Apple, Intel, AMD, nVidia) and none of them want to make creating an open standard easy. There is no reason why nVidia should support OpenCL the same way that AMD does; they want people to write CUDA. There's no reason for Apple to support OpenACC like Intel does; they want people to write Metal... etc. These 4 companies are trying to push everyone into their proprietary ecosystem, or into an open ecosystem where they have a large say.

I had hoped that SPIR-V would be a good inroad to start fixing some of these problems, but alas, it seems like Apple and nVidia aren't big fans. It's early still, though, so maybe that changes?


AMD and Intel don't care enough about OpenCL to actually provide the same tooling and libraries that CUDA does.

OpenCL 2.x was a failure, hence why OpenCL 3.0 is basically 1.2 with everything else from 2.x marked as optional.


> It wouldn't be impossible to create hardware with the concept of "infinite registers" that ultimately gets optimized away. (see: Mill CPU)

I'm not sure that the vaporware Mill CPU acts as an existence proof for anything, to be honest.


It was a PoC that didn't get funding because the advantages it touts aren't nearly useful enough compared to simpler/familiar architectures. That, and it was proprietary, which, come on guys, was never gonna fly (I still have my doubts about RISC-V going very far).

The problem of infinite registers is pretty easily solvable. Heck x87 already has that basic concept down with the stack registers. The only missing piece is moving overflow values onto and off of the stack. It's a fairly easy to solve problem.

That's effectively what Mill proposed in their docs. The "belt" notation was their cute way of doing that.


> The problem of infinite registers is pretty easily solvable. Heck x87 already has that basic concept down with the stack registers. The only missing piece is moving overflow values onto and off of the stack. It's a fairly easy to solve problem.

You still have a limited window of 8 registers. Need 9 simultaneous live values? Oops, too few registers! In order to encode infinite registers, you need to be able to support referring to an arbitrary number of registers at once, which means you need an arbitrary-length register number specifier... have fun handling that in hardware!


> The only missing piece is moving overflow values onto and off of the stack.

Already addressed and not really that hard of an issue to solve. Hardware already has all the logic embedded in it to be able to load or store memory into/from registers.

The logic would mirror the logic done by compilers when they are allocating registers and deciding what needs to be evicted to the stack.

The only missing piece is where that memory should be, but that's really as simple as having either the OS or the application logic allocate a special memory region for the purpose of register evictions and loads.
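To make the eviction idea concrete, here's a toy sketch in Python (purely illustrative, using a naive FIFO eviction policy; real allocators use live-range analysis, but the spill-when-full structure is the same):

```python
# Toy register allocator: K physical registers, spill a value to a
# memory slot whenever a new value needs a register and none are free.
# This loosely mirrors what a compiler's register allocator does.

K = 4  # pretend the machine has 4 registers

def allocate(values):
    """Assign each value a register, evicting the oldest when full.
    Returns a list of (value, action) pairs."""
    regs = []       # values currently in registers, oldest first
    actions = []
    for v in values:
        if len(regs) == K:
            victim = regs.pop(0)        # evict oldest (FIFO policy)
            actions.append((victim, "spill"))
        regs.append(v)
        actions.append((v, "reg"))
    return actions

# With 4 registers and 6 live values, the two oldest get spilled.
actions = allocate(list("abcdef"))
```

Hardware doing the same thing would just need a dedicated eviction region in memory, which is exactly the parent's point.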


And yet we are in the middle of adopting hardware memory tagging architectures, as a last resort to fix C and the languages copy-paste compatible with it. The irony.


You actually can - https://llvm.org/docs/CommandGuide/lli.html

LLVM's original purpose was something analogous to the JVM, but it's obviously evolved to be more of a compiler platform, debugging toolchain, etc these days

(No direct hardware support, though, as cool as that sounds)


Probably not a chip, since the intermediate language is optimized for optimizing and compiling, not for executing quickly.

Meanwhile normal instruction sets are just a bytecode optimized to run quickly.

LLVM-IR at least also isn't stable, so not a good thing to bake into a chip - but you could conceivably make a stable intermediate language.


At some point in history there was a JVM chip like that (Sun's picoJava, for example). It's long gone; not sure for what reason.


I would be worried about the intermediate code not being stable over time.


Unless one keeps it stable, like Apple did for watchOS.

However, they apparently don't want to keep chasing upstream changes, and it is now deprecated going forward.


For a chip, I think you would run into the same problems as Intel iAPX if you did that: https://en.wikipedia.org/wiki/Intel_iAPX_432


Something like a Low-Level Virtual Machine?

Short answer, it’s feasible in theory, but usually suboptimal.


How “old” are we talking about here? Multiple language front ends sharing a common IR is a concept that dates to the 50s: https://academic.oup.com/comjnl/article/22/3/226/408542 (click through to the pdf)


Being a concept and having papers written about it doesn't imply an actual implementation


Your link does not work


Thanks. Fixed. (It's not my fault young people broke direct linking on the web.)


Link is broken


Worked in the 50s though.


A hot link goes on a bun, dang it, and get off my lawn.


Very, very old compilers, like those from the early 70s, given that PL.8 and the Amsterdam Compiler Kit are two well-known examples of this design.


Compilers are often implemented as a front-end/back-end split.

The front-end compiles your input language (C, C++, Rust, etc) down to a low-level language called an Intermediate Representation. Then the back-end of the compiler optimizes the IR and compiles it into object code. A family of compilers will usually share the back-end.

This kind of split allows for deduplication across different compilers in the same family, and also makes it easier to design a new language without having to fully re-implement everything about the compiler yourself.

In GCC, these are bundled together as a common project afaik. In Clang/LLVM, these are split into the front-end (clang) and the back-end (LLVM).
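A toy sketch of that split (purely illustrative, nothing like how GCC or LLVM actually structure things internally): a "front-end" that parses arithmetic into three-address IR, and a "back-end" that lowers the IR to a made-up instruction set. Either half could be swapped out without touching the other, which is the whole point of the design.

```python
def frontend(src):
    """'Front-end': parse a left-to-right chain like '1 + 2 * 3'
    into three-address IR (precedence ignored -- this is a toy)."""
    tokens = src.split()
    ir, acc = [], tokens[0]
    for n, (op, rhs) in enumerate(zip(tokens[1::2], tokens[2::2]), start=1):
        ir.append((f"t{n}", op, acc, rhs))   # tN = acc op rhs
        acc = f"t{n}"
    return ir

def backend(ir):
    """'Back-end': lower the shared IR to a fake stack-machine ISA."""
    code = []
    for dest, op, a, b in ir:
        code += [f"PUSH {a}", f"PUSH {b}",
                 {"+": "ADD", "*": "MUL"}[op], f"STORE {dest}"]
    return code

ir = frontend("1 + 2 * 3")   # language-specific half
obj = backend(ir)            # target-specific half
```

A second front-end for a different source language would only need to emit the same IR tuples to reuse the back-end for free.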


In a stricter sense, the back-end is for target-dependent stuff like ISA-specific code generation. A great deal of the work in both GCC and LLVM happens in the so-called middle-end. Both have more than one IR in the pipeline after the front-end.


GCC also has front ends (which compile many languages into a common form, GIMPLE), a "middle end" (optimization), and many back ends (for different processor architectures).


Oh damn, THIS is what the "intermediate representation" I keep hearing about is.

I knew that Julia uses LLVM. I also knew that Julia has something called IR, but I didn't know what it was. So Julia's IR is probably LLVM's IR...


I suspect Julia has its own IR that is later lowered into LLVM IR. One of the aspects of multiple compilers is that it's basically a massive tower of IRs that get progressively lowered into lower- and lower-level IRs.


https://stackoverflow.com/a/43456211/1544203 is a little old, but still very much relevant. Julia has an untyped IR that it lowers into, then typed IR on which inference is performed, then it lowers to LLVM which lowers it to native. You can introspect the IR at these steps by taking a function call and doing `@code_lowered`, `@code_typed`, `@code_llvm`, and `@code_native`.

For some examples, the compiler plugins interface acts on typed IR, so things like Diffractor for automatic differentiation or JET static analysis and EscapeAnalysis.jl act there. Notably, Zygote automatic differentiation acted on the untyped IR. GPUCompiler.jl does some operations on the typed IR then lowers to the LLVM IR level and then intercepts the normal compilation process to choose alternative backends (e.g. compile to .ptx for compilation of native Julia to CUDA). Enzyme.jl uses the Enzyme LLVM-based automatic differentiation, so it does some actions on the typed IR before lowering to LLVM and then injecting the Enzyme LLVM pass before compiling via GPUCompiler.jl (now to the normal CPU backend).
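For folks without Julia handy, Python's `dis` module gives a loose analogy to `@code_lowered`: you can introspect the bytecode that CPython's front-end lowers a function into (just an analogy — CPython interprets this form rather than lowering it further to native code):

```python
import dis

def square(x):
    return x * x

# Dump the lowered form (CPython bytecode) of the function,
# roughly analogous to inspecting an early IR stage in Julia.
instrs = [i.opname for i in dis.get_instructions(square)]
# Expect LOAD_FAST-style loads of x and a BINARY_* multiply op
# somewhere in the stream (exact opcodes vary by Python version).
```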


Julia has an IR called Julia IR that is used before compiling to LLVM IR; it's a bit higher level than LLVM IR.


Rustc has HIR, MIR, and then finally LLVM IR.


The part of a compiler that handles tokenization, parsing, AST generation and IR emission, as opposed to the backend which is responsible for conversion of IR to a target-compatible executable.


GCC has an intermediate format (GIMPLE), which it then compiles to machine code. So the front-end is similar to the main Rust compiler, which compiles to LLVM IR.


[flagged]


Care to elaborate?


Maybe referring to the fact that GCCRS duplicates a ton of effort by reimplementing the frontend (parser, borrow checker, error messages etc.) for questionable benefit since you could keep the existing frontend and just swap out the LLVM backend for GCC (as done by the rustc_codegen_gcc project).


This ignores the benefits of having multiple implementations, e.g. you can use places where two implementations disagree with each other to track down bugs in one or both of them.


[flagged]


What are you quoting? And making an account just for a comment goes against HN guidelines:

> Throwaway accounts are ok for sensitive information, but please don't create accounts routinely. HN is a community—users should have an identity that others can relate to.



