Fed up with dynamic scripting languages, I decided to give a statically-typed embeddable scripting language a shot. I call it Ri.
Ri is inspired by Lisp implementations that are very tiny while still staying true to Lisp's expressiveness. I feel the need for a disclaimer: the language is experimental and may not turn out to be a good idea at all.
Dynamic versus static
tl;dr: Dynamic scripting languages are convenient but slow.
- JIT does help, but if you care about performance you enter the magical world of alchemy specific to the current version of the compiler.
- Dynamic types require a type check somewhere in the interpreter for every use of a value.
- Lack of static checking makes your life insufferable. You have to bring in good test coverage, and a compiler (like TypeScript). So your development is slow, your code is slow, and you have to deal with types.
Hence, na-ah!
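To make the per-use check concrete, here is a minimal sketch in C. The `Value` type, its tags, and both functions are hypothetical and only illustrate the general idea; they are not taken from any particular interpreter.

```c
#include <stdint.h>

/* Hypothetical tagged value, as a dynamic interpreter might represent it. */
typedef enum { TAG_INT, TAG_FLOAT } Tag;

typedef struct {
    Tag tag;
    union { int64_t i; double f; } as;
} Value;

/* Dynamic add: must branch on the tags before doing any arithmetic.
   Returns 0 on success, -1 on a type mismatch. */
int dyn_add(Value a, Value b, Value *out) {
    if (a.tag == TAG_INT && b.tag == TAG_INT) {
        out->tag = TAG_INT;
        out->as.i = a.as.i + b.as.i;
        return 0;
    }
    if (a.tag == TAG_FLOAT && b.tag == TAG_FLOAT) {
        out->tag = TAG_FLOAT;
        out->as.f = a.as.f + b.as.f;
        return 0;
    }
    return -1; /* mixed types: this sketch simply rejects them */
}

/* Static add: the types are known at compile time, so no check is needed. */
int64_t static_add(int64_t a, int64_t b) {
    return a + b;
}
```

The branches in `dyn_add` run on every call, while `static_add` usually compiles down to a single add instruction.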
tl;dr: Statically typed languages tend to be less convenient, but are generally much faster and more predictable in performance.
- The output is already optimized and checked ahead of time. Sure, you can have rather nasty bugs, but from my experience there's not much difference between using `void*` and dynamic types.
- They translate to a form that hardware understands perfectly and is fast with. So the performance gets under your control, and is reliable enough so you can reason about it.
- Types might be a slight inconvenience, but the value they bring to the game is immense. Moreover, type inference is here to ease the pain significantly. Some languages tend to look indistinguishable from dynamic ones.
Backend
tl;dr: Ri compiles to RiVM bytecode and C.
Nearly all new languages go for an LLVM backend by default. Ri does not. LLVM is an insanely large dependency, and often a pain to deal with. With the current architecture the path to an LLVM backend is open, but it's just not that interesting.
Ri is designed so that its final run-time form is compatible with C. This is required so that little to no glue code is needed for interop with a host application written in C.
Ri's meta-programming is inspired by Jai. Code is shared between compile time and run time. This is done by executing the functions that need to run during compilation in a virtual machine. Additionally, Ri can be meta-programmed via C in embedded scenarios.
Ri compiles to RiVM:
- RiVM is the primary target for embedded scenarios.
- RiVM is currently a 2-register stack machine with an instruction set inspired by Q3VM (which is inspired by LCC's bytecode target).
- RiVM is not sandboxed.
- RiVM bytecode is translated to x64 machine code ahead of time, for performance at run time.
- Ri's compiler depends on RiVM to run code at compile time.
- Ri can be used for building and packaging the application too, because of the compile time execution.
- Compile time and run time share the same code.
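For readers unfamiliar with stack machines, here is a minimal sketch of a bytecode interpreter loop in C. The opcodes and the `vm_run` function are hypothetical and chosen for illustration only; this is not RiVM's actual instruction set or register layout.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical opcodes for illustration; not RiVM's actual instruction set. */
enum { OP_PUSH, OP_ADD, OP_MUL, OP_HALT };

/* Runs `code` and returns the value on top of the stack at OP_HALT.
   No bounds or validity checks are performed, so this sketch is not
   sandboxed either. */
int32_t vm_run(const int32_t *code) {
    int32_t stack[64];
    size_t sp = 0;   /* stack pointer: index of the next free slot */
    size_t pc = 0;   /* program counter */
    for (;;) {
        switch (code[pc++]) {
        case OP_PUSH: stack[sp++] = code[pc++]; break;
        case OP_ADD:  sp--; stack[sp - 1] += stack[sp]; break;
        case OP_MUL:  sp--; stack[sp - 1] *= stack[sp]; break;
        case OP_HALT: return stack[sp - 1];
        default:      return 0; /* unknown opcode: stop */
        }
    }
}
```

A program such as `{ OP_PUSH, 2, OP_PUSH, 3, OP_ADD, OP_PUSH, 4, OP_MUL, OP_HALT }` evaluates `(2 + 3) * 4`. A loop like this can serve both a compiler and an embedding host, which is one plausible way for compile time and run time to share the same code.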
Ri compiles to C:
- The generated code is easy to read and easy to interface with.
- The generated code can be used by a C application without any other dependency.
- The generator also produces all the glue code needed to run Ri at run time.
- Glue code can also be written manually.
This should allow entire applications to be written in Ri and its compile-time meta-programming, compiled with the top compilers, and interfacing with embedded Ri without any glue code.
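As a purely hypothetical illustration of what "compatible with C" could mean in practice, the sketch below shows C that a generator might emit for a Ri composite such as `struct { x: Float32 {}; y: Float32 {} }`. The names `Vector2` and `Vector2_dot` and the mapping of `Float32` to `float` are assumptions, not the actual output of Ri's generator.

```c
/* Hypothetical generated code; names and layout are assumptions. */
typedef struct Vector2 {
    float x;   /* assumed mapping of Float32 */
    float y;
} Vector2;

/* A plain C function: the host can call it directly, with no VM handle
   or marshalling layer in between. */
float Vector2_dot(Vector2 a, Vector2 b) {
    return a.x * b.x + a.y * b.y;
}
```

If the output looks like this, a host application only needs to include the generated header and link the object file, the same as for any other C code.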
Syntax
Ri's syntax is based on three types of expression:
- Identifier is a string. Identifiers are keys in a map of expressions. There are two types: alphanumeric identifiers like `for` or `Vector2`, and punctuation identifiers called operator identifiers, like `:`.
- Call is a function call in the format `<function> {}`. The `<function>` can be an expression that resolves to a function value at compile time. Functions always have one argument: a block.
- Block is the primary data structure, denoted by `{...}`. In its rawest state, a block is an array of expressions, like `{ a == b; b = 2 }` or `{ a: 3; print {"a is %d"; a} }`.
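The three expression types map naturally onto a small tagged tree. Below is a minimal sketch in C of such a tree. The `Node` layout and `node_count` helper are hypothetical, the real Ri AST may look different, and literals such as numbers and strings are left out for brevity.

```c
#include <stddef.h>

/* Hypothetical node layout; the actual Ri AST may differ. */
typedef enum { NODE_IDENTIFIER, NODE_CALL, NODE_BLOCK } NodeKind;

typedef struct Node Node;
struct Node {
    NodeKind kind;
    union {
        struct { const char *name; } identifier;          /* for, Vector2 */
        struct { Node *function; Node *argument; } call;  /* <function> {} */
        struct { Node **items; size_t count; } block;     /* { a; b; c } */
    } as;
};

/* Counts all nodes in a tree, to show how the three kinds nest. */
size_t node_count(const Node *n) {
    size_t total = 1;
    switch (n->kind) {
    case NODE_IDENTIFIER:
        break;
    case NODE_CALL:
        total += node_count(n->as.call.function);
        total += node_count(n->as.call.argument);
        break;
    case NODE_BLOCK:
        for (size_t i = 0; i < n->as.block.count; i++)
            total += node_count(n->as.block.items[i]);
        break;
    }
    return total;
}
```

Under this layout, an expression like `print { a }` is a call node whose function is the identifier `print` and whose single argument is a block containing the identifier `a`: four nodes in total.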
Probably the most interesting part of the syntax is the block, which fills several roles:
- Static arrays – `{ 1; 2; 3; 4 }`
- Static dictionaries – `{ x: 1; y: 2 }`
- Function arguments – `print {"hello %s"; "polly"}`
- Code blocks – `while {true; { print {"eternal flame"} }}`
- Composite definitions – `struct { x: Float32 {}; y: Float32 {} }`
- Enum definitions – `enum { True: 1; False: 0; FileNotFound: 2 }`
Ri's parser doesn't recognize any keywords. Keywords are identifiers mapped to compile-time functions. A compile-time function is responsible for interpreting the arguments, checking types, and returning a node that replaces the call node.
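One way to picture keywords as ordinary identifiers is a lookup table from names to functions that rewrite the call node. The sketch below is in C. The `Node` type, the `keywords` table, and `expand_call` are hypothetical stand-ins, and the argument interpretation and type checking a real compile-time function would do are omitted.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical minimal node; only what this sketch needs. */
typedef struct Node {
    const char *kind;   /* e.g. "call", "while-loop", "struct-type" */
    struct Node *arg;   /* the block passed to the call, if any */
} Node;

/* A compile-time function takes the call node and returns its replacement. */
typedef Node *(*CompileFn)(Node *call);

static Node *compile_while(Node *call)  { call->kind = "while-loop";  return call; }
static Node *compile_struct(Node *call) { call->kind = "struct-type"; return call; }

/* Keywords are ordinary identifiers looked up in a table, not parser rules. */
static const struct { const char *name; CompileFn fn; } keywords[] = {
    { "while",  compile_while  },
    { "struct", compile_struct },
};

/* Resolves `name` and lets its compile-time function rewrite the call node.
   Returns NULL if the identifier is not bound to a compile-time function. */
Node *expand_call(const char *name, Node *call) {
    for (size_t i = 0; i < sizeof keywords / sizeof keywords[0]; i++) {
        if (strcmp(keywords[i].name, name) == 0)
            return keywords[i].fn(call);
    }
    return NULL;
}
```

With this shape, adding a new "keyword" means adding a table entry rather than touching the parser.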
Ri does recognize arithmetic, assignment, binary and boolean operators. Their syntax and semantics are so far designed to be compatible with C. I have not yet decided whether and how I'll be dealing with function/operator overloading.
There are of course some "negative" implications of all this:
- There's a difference between `a: Int` and `a: Int {}`. The former declares `a` as an alias of `Int` (a `typedef`) and the latter declares `a` as an uninitialized `Int` value.
- The semicolon is the dominant separator, which is a little hard to get used to (there might be some instances of me using a comma incorrectly even in this article).
- Because of compile-time code execution, it is possible that compilation will never terminate. You also need to take care of your memory allocations, as there's no garbage collector. I decided not to care about this at all, but it's something that might come as a surprise.