Forth doesn't interest me as a language any more. I thoroughly recommend learning it as a pathway into various other languages. And indeed, I consider JONESFORTH to be one of the highlights of the coding world. But these days I don't find a use for the language per se.
I have to say, however, that I would love to see an interpreter for FORTH which does not actually maintain a second stack (the data stack) at runtime, but instead compiles the code down to something equivalent, e.g. by using the LLVM JIT.
Much of what is done with the second stack is simply reordering bits of data or duplicating bits of data. This could all be done at "compile time". Each stack location would only exist at compile time. At runtime the values would be stored in ordinary variables and references. There'd be no need to duplicate data or reorder it at runtime. Instead it would just pull the data from the correct variable or memory location.
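To make that concrete, here is a minimal sketch in Python of what "the stack only exists at compile time" could look like. It is not taken from any real Forth; the function name `compile_forth`, the `in0`/`t0` variable naming, and the tiny word set are all made up for illustration, and it only handles straight-line code with no control flow. The shuffle words touch nothing but a compile-time list of names, so they emit no runtime code at all.

```python
def compile_forth(words, n_inputs):
    """Translate straight-line Forth words into simple assignments.

    Hypothetical sketch: `stack` holds variable NAMES at compile time,
    so DUP/SWAP/DROP/OVER/ROT only rearrange names and emit nothing.
    """
    stack = [f"in{i}" for i in range(n_inputs)]  # names of the incoming values
    code = []                                    # emitted runtime statements
    counter = 0                                  # next temporary number
    binops = {"+": "+", "-": "-", "*": "*"}
    for w in words:
        if w == "DUP":
            stack.append(stack[-1])
        elif w == "SWAP":
            stack[-1], stack[-2] = stack[-2], stack[-1]
        elif w == "DROP":
            stack.pop()
        elif w == "OVER":
            stack.append(stack[-2])
        elif w == "ROT":                         # ( a b c -- b c a )
            stack.append(stack.pop(-3))
        elif w in binops:
            b = stack.pop()
            a = stack.pop()
            t = f"t{counter}"
            counter += 1
            code.append(f"{t} = {a} {binops[w]} {b}")
            stack.append(t)
        else:
            stack.append(str(int(w)))            # integer literal; anything else raises
    return code, stack


# Sum of squares of the two inputs: ( a b -- a*a+b*b )
code, stack = compile_forth("DUP * SWAP DUP * +".split(), 2)
print("\n".join(code))
print("result in:", stack[-1])
```

The SWAP and the two DUPs in that example vanish completely; what is left is three arithmetic statements reading directly from the right variables. Loops, conditionals and words with a variable stack effect are where this stops being easy, since the name list has to agree on every path into a join point.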
Naturally this defeats all the principles of FORTH compilers, etc, blah, blah, blah. But it would still be a fun (and easy) thing to do.
One Forth I've seen, FreeForth, compiles subroutine-threaded machine code rather than traditional direct/indirect threaded code, and it does register renaming at compile time to hide SWAPs, generating a real SWAP at the end only when necessary. You do need a stack though, since there are only so many registers. Even C has a stack; it just mixes data and return addresses on the same one, which means it needs two extra allocation mechanisms (function return value and malloc) for data that must outlive the function call that created it.
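A rough sketch of the renaming trick, in Python emitting x86-style text. This is not FreeForth's actual code generator; the function `compile_word`, the choice of `eax`/`edx` for the top two stack items, and the handful of supported words are assumptions made purely for illustration. SWAP just flips which register the compiler regards as top of stack, and a real exchange is emitted only if the word ends with the registers in the non-canonical order.

```python
def compile_word(words):
    """Emit x86-style text for words that work on the top two stack items.

    Hypothetical sketch: the canonical layout is top-of-stack in eax and
    second item in edx. SWAP only renames; nothing is emitted for it.
    """
    tos, nos = "eax", "edx"      # compile-time view of where the items live
    out = []
    for w in words:
        if w == "SWAP":
            tos, nos = nos, tos  # rename, no instruction emitted
        elif w == "1+":
            out.append(f"inc {tos}")
        elif w == "NEGATE":
            out.append(f"neg {tos}")
        elif w == "2*":
            out.append(f"add {tos}, {tos}")
        else:
            raise ValueError(f"unsupported word: {w}")
    if tos != "eax":             # callers expect the canonical layout
        out.append("xchg eax, edx")
    return out


print(compile_word("SWAP 1+ SWAP".split()))  # the two SWAPs cancel out
print(compile_word("SWAP NEGATE".split()))   # one real exchange at the end
```

In the first call the increment is simply applied to `edx` and no exchange is generated; in the second, the odd SWAP has to be paid for once at the end of the word.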
Around 5% of the instructions in a C program are register reordering code introduced by the register allocator (assuming x86 and an optimizing compiler). I do not know the percentage of the register spilling code, which comes on top of this. Effectively, an LLVM JIT also produces shuffle code, but the programmer can use variable names instead of stack reorder operations.
You might want to consider a language based on combinators, such as Fexl (see http://fexl.com). It lets you write functions with or without intermediate symbols, to whatever extent you see fit. In other words, you can go "point-free" or "point-full" as you like.
I believe Factor's optimizing compiler doesn't actually implement a stack, but just uses the stack structure as a handy shortcut to data-flow analysis.