Hey hey hey good evening! Tonight a quick note on wastrel, a new
WebAssembly implementation.
a wasm-to-native compiler that goes through c
Wastrel compiles Wasm modules to standalone binaries. It does so by
emitting C and then compiling that C.
Compiling Wasm to C isn’t new: Ben Smith wrote wasm2c back in the day
and these days most people in this space use Bastien Müller’s w2c2.
These are great projects!
Wastrel has two or three minor differences from these projects. Let’s
lead with the most important one, despite the fact that it’s as yet
vaporware: Wastrel aims to support automatic memory management via
WasmGC, by embedding the Whippet garbage collection library. (For the
wingolog faithful, you can think of Wastrel as a Whiffle for Wasm.)
This is the whole point! But let’s come back to it.
The other differences are minor. Firstly, the CLI is more like
wasmtime: instead of privileging the production of C, which you then
incorporate into your project, Wastrel also compiles the C (by
default), and even runs it, like wasmtime run.
Unlike wasm2c (but like w2c2), Wastrel implements WASI. Specifically,
WASI 0.1, sometimes known as “WASI preview 1”. It’s nice to be able to
take the wasi-sdk’s C compiler, compile your program to a binary that
uses WASI imports, and then run it directly.
In a past life, I once took a week-long sailing course on a 12-meter
yacht. One thing that comes back to me often is the way the instructor
would insist on taking in the bumpers immediately as we left port, that
to sail with them was no muy marinero, not very seamanlike. Well one
thing about Wastrel is that it emits nice C: nice in the sense that it
avoids many useless temporaries. It does so with a lightweight effects
analysis, in which, as temporaries are produced, they record which bits of the
world they depend on, in a coarse way: one bit for the contents of all
global state (memories, tables, globals), and one bit for each local.
When compiling an operation that writes to state, we flush all
temporaries that read from that state (but only that state). It’s a
small thing, and I am sure it has very little or zero impact after SROA
turns locals into
SSA values, but we are vessels of the divine, and it is important for
vessels to be C worthy.
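
To make the flush rule concrete, here is a minimal sketch of that kind of effects tracking. Every name in it (`temp_stack`, `push_temp`, `flush_for_write`, the bit layout) is made up for illustration and is not taken from Wastrel's source; it also assumes fewer than 63 locals so that everything fits in one 64-bit mask.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit layout: bit 0 covers all global state (memories,
   tables, globals); bit 1+i covers local i. */
enum { EFFECT_GLOBAL_STATE = 1 };

static inline uint64_t effect_local(unsigned idx) {
  return 1ull << (1 + idx);
}

#define MAX_TEMPS 16

struct temp {
  uint64_t reads;  /* which bits of the world this expression reads */
  int flushed;     /* nonzero once spilled to a named C variable */
};

struct temp_stack {
  struct temp temps[MAX_TEMPS];
  size_t count;
};

/* Record a pending (not yet emitted) temporary. Returns its index,
   or -1 if the stack is full. */
static int push_temp(struct temp_stack *s, uint64_t reads) {
  if (s->count == MAX_TEMPS) return -1;
  s->temps[s->count].reads = reads;
  s->temps[s->count].flushed = 0;
  return (int)s->count++;
}

/* Call before emitting an operation that writes `writes`: flush every
   pending temporary that reads any of that state, and leave the rest
   alone. Returns how many temporaries were flushed. */
static size_t flush_for_write(struct temp_stack *s, uint64_t writes) {
  size_t n = 0;
  for (size_t i = 0; i < s->count; i++) {
    struct temp *t = &s->temps[i];
    if (!t->flushed && (t->reads & writes) != 0) {
      t->flushed = 1;  /* a compiler would emit `T tmpN = <expr>;` here */
      n++;
    }
  }
  return n;
}
```

In this sketch a `local.set 0` would call `flush_for_write(&s, effect_local(0))`, and a memory store would pass `EFFECT_GLOBAL_STATE`; temporaries that only read other locals stay inline in the emitted expression.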
Finally, w2c2 at least is built in such a way that you can instantiate a
module multiple times. Wastrel doesn’t do that: the Wasm instance is
statically allocated, once. It’s a restriction, but that’s the use case
I’m going for.
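
As a rough illustration of the difference, and not a claim about what either tool actually emits, the two styles look something like this in C. The `instance` struct and both `bump_*` functions are hypothetical.

```c
#include <stdint.h>

/* Instantiable style: module state lives in a struct, and every
   compiled function takes a pointer to the instance it runs in. */
struct instance {
  uint32_t counter;
};

static uint32_t bump_instance(struct instance *inst) {
  return ++inst->counter;
}

/* Statically allocated style: module state is a file-scope variable,
   so there is exactly one instance per binary and no pointer to pass. */
static uint32_t counter;

static uint32_t bump_static(void) {
  return ++counter;
}
```

The static style gives up multiple instantiation, but the state's address is a link-time constant and no instance pointer needs threading through every call.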
on performance
Oh buddy, who knows?!? What is real anyway? I would love to have
proper perf tests, but in the meantime, I compiled coremark using my
GCC on x86-64 (-O2, no other options), then also compiled it with the
current wasi-sdk
and then ran with w2c2, wastrel, and wasmtime. I am well aware of the
many pitfalls of benchmarking, and so I should not say anything because
it is irresponsible to make conclusions from useless microbenchmarks.
However, we’re all friends here, and I am a dude with hubris who also
believes blogs are better out than in, and so I will give some small
indications. Please obtain your own salt.
So on coremark, Wastrel is some 2-5% slower than native, and
w2c2 is some 2-5% slower than that. Wasmtime is 30-40% slower than GCC.
Voilà.
My conclusion is, Wastrel provides state-of-the-art performance. Like
w2c2. It’s no wonder, these are simple translators that use industrial
compilers underneath. But it’s neat to see that performance is close to
native.
on wasi
OK this is going to sound incredibly arrogant but here it is: writing
Wastrel was easy. I have worked on Wasm for a while, and on Firefox’s
baseline compiler,
and Wastrel is kinda like a baseline compiler in shape: it just has to
avoid emitting boneheaded code, and can leave the serious work to
someone else (Ion in the case of Firefox, GCC in the case of Wastrel).
I just had to use the Wasm libraries I already had and make it emit
some C for each instruction. It took 2 days.
WASI, though, took two and a half weeks of agony. Three reasons: One,
you can be sloppy when implementing just wasm, but when you do WASI you
have to implement an ABI using sticks and glue, but you have no glue,
it’s all just i32. Truly excruciating, it makes you doubt everything,
and I had to refactor Wastrel to use C’s meager type system to the max.
(Basically, structs-as-values to avoid type confusion, but via inline
functions to avoid overhead.)
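
For the curious, the structs-as-values trick looks roughly like the following. This is a sketch under my own naming (`wasm_ptr`, `wasi_fd`, `wasm_load_u32` are hypothetical, not Wastrel's identifiers): each wrapper holds a single i32, but C treats the wrappers as incompatible types, so mixing up a file descriptor and a memory offset stops compiling.

```c
#include <stdint.h>

/* Hypothetical wrappers: each is just an i32 at run time, but the C
   compiler treats them as distinct, incompatible types. */
struct wasm_ptr { uint32_t v; };  /* offset into linear memory */
struct wasi_fd  { uint32_t v; };  /* file descriptor number */

static inline struct wasm_ptr wasm_ptr_from_i32(uint32_t raw) {
  return (struct wasm_ptr){ raw };
}

static inline struct wasi_fd wasi_fd_from_i32(uint32_t raw) {
  return (struct wasi_fd){ raw };
}

/* Bounds-checked little-endian load from linear memory. Because the
   parameter is a struct wasm_ptr, passing a struct wasi_fd or a bare
   uint32_t is a compile-time error. Returns 1 on success, 0 if the
   access would fall outside `mem_size` bytes. */
static inline int wasm_load_u32(const uint8_t *mem, uint32_t mem_size,
                                struct wasm_ptr p, uint32_t *out) {
  if (mem_size < 4 || p.v > mem_size - 4) return 0;
  *out = (uint32_t)mem[p.v]
       | (uint32_t)mem[p.v + 1] << 8
       | (uint32_t)mem[p.v + 2] << 16
       | (uint32_t)mem[p.v + 3] << 24;
  return 1;
}
```

Since everything is a `static inline` function over a one-word struct, an optimizing compiler should pass the value in a register just as it would a plain `uint32_t`.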
Two, WASI is not huge but not tiny either. Implementing poll_oneoff is
annoying. And so on. Wastrel’s WASI implementation is thin but it’s
still a couple thousand lines of code.
Three, WASI is underspecified, and in practice what is “conforming” is a
function of what the Rust and C toolchains produce. I used
wasi-testsuite to burn down most of the issues, but it was a slog. I
neglected email and
important things but now things pass so it was worth it maybe? Maybe?
on wasi’s filesystem sandboxing
WASI preview 1 has this “rights” interface that associates capabilities
with file descriptors. I think
it was an attempt at replacing and expanding file permissions with a
capabilities-oriented security approach to sandboxing, but it was only a
veneer. In practice most WASI implementations effectively implement the
sandbox via a permissions layer: for example the process has
capabilities to access the parents of preopened directories via “..”,
but the WASI implementation has to actively prevent this capability from
leaking to the compiled module via run-time checks.
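
To give a flavor of the kind of run-time check meant here, this is a deliberately simplified sketch, not code from any particular WASI implementation. The helper name `path_stays_in_preopen` is hypothetical, and the check is purely lexical: a real implementation also has to worry about symlinks, which this cannot see.

```c
#include <stddef.h>
#include <string.h>

/* Returns 1 if `path`, interpreted relative to a preopened directory,
   never climbs above that directory; 0 otherwise. Lexical only:
   symlinks are not considered. */
static int path_stays_in_preopen(const char *path) {
  if (path[0] == '/') return 0;        /* absolute paths escape outright */
  int depth = 0;
  const char *p = path;
  while (*p) {
    size_t len = strcspn(p, "/");      /* length of the next component */
    if (len == 2 && p[0] == '.' && p[1] == '.') {
      if (--depth < 0) return 0;       /* would step above the preopen */
    } else if (len > 0 && !(len == 1 && p[0] == '.')) {
      depth++;                         /* ordinary component descends */
    }
    p += len;
    if (*p == '/') p++;                /* skip the separator */
  }
  return 1;
}
```

Every path-taking call has to go through a check like this, and getting any one of them wrong leaks the host filesystem.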
Wastrel takes a different approach, which is to use Linux’s filesystem
namespaces to build a tree in which only the exposed files are
accessible. No run-time checks are necessary; the system is secure by
construction. He says. It’s very hard to be categorical in this domain
but a true capabilities-based approach is the only way I can have any
confidence in the results, and that’s what I did.
The upshot is that Wastrel is only for Linux. And honestly, if you are
on macOS or Windows, what are you doing with your life? I get that it’s
important to meet users where they are but it’s just gross to build on a
corporate-controlled platform.
The current versions of WASI keep a vestigial capabilities-based API,
but given that the goal is to compile POSIX programs, I would prefer if
wasi-filesystem leaned into the approach of WASI just having access to
a filesystem instead of a small set of descriptors plus scoped openat,
linkat, and so on APIs. The security properties would be the same,
except with fewer bug
possibilities and with a more conventional interface.
on wtf
So Wastrel is Wasm to native via C, but with an as-yet-unbuilt GC aim.
Why?
This is hard to explain and I am still workshopping it.
Firstly I am annoyed at the WASI working group’s focus on shared-nothing
architectures as a principle of composition. Yes, it works, but garbage
collection also works; we could be building different, simpler systems
if we leaned in to a more capable virtual machine. Many of the problems
that WASI is currently addressing are ownership-related, and would be
comprehensively avoided with automatic memory management. Nobody is
really pushing for GC in this space and I would like for people to be
able to build out counterfactuals to the shared-nothing orthodoxy.
Secondly there are quite a number of languages that are targeting
WasmGC these days, and it would be nice for them to have a good run-time
outside the browser. I know that Wasmtime is working on GC, but it
needs competition :)
Finally, and selfishly, I have a GC library! I would love to spend more
time on it. One way that can happen is for it to prove itself useful,
and maybe a Wasm implementation is a way to do that. Could Wastrel on
wasm_of_ocaml output beat ocamlopt? I don’t know but it would be worth
it to find out! And I would love to get Guile programs compiled to
native, and perhaps with Hoot and Whippet and Wastrel that is a
possibility.
Welp, there we go, blog out, dude to bed. Hack at y’all later and
wonderful wasming to you all!