      Allan Day: Recent GNOME design work

      news.movim.eu / PlanetGnome • 9 January, 2024 • 5 minutes

    The GNOME 46 development cycle started around October last year, and it has been a busy one for my GNOME user experience design work (as they all are). I wanted to share some details of what I’ve been working on, both to provide some insight into what I get up to day to day, and because some of the design work might be interesting to the wider community. This is by no means everything that I’ve been involved with, but rather covers the bigger chunks of work that I’ve spent time on.

    Videos

    GNOME’s video player has yet to be ported to GTK 4, and it’s been a long time since it received major UX attention. This development cycle I worked on a set of designs for what a refreshed default GNOME video player might look like. These built on previous work from Tobias Bernard and myself.

    The new Videos designs don’t have a particular development effort in mind, and are instead intended to provide inspiration and guidance for anyone who might want to work on modernising GNOME’s video playback experience.

    A mockup of a video player app, with a video playing in the background and playback controls overlaid on top

    The designs themselves aim to be clean and unobtrusive, while retaining the essential features you need from a video player. There’s a familial resemblance to GNOME’s new image viewer and camera apps, particularly with regards to the minimal window chrome.

    Two mockups of the videos app, showing the window at different sizes and aspect ratios

    One feature of the design that I’m particularly happy with is how it manages to scale to different form factors. On a large display the playback controls are constrained, which avoids long pointer travel on super wide displays. When the window size is reduced, the layout updates to optimize for the smaller space. That this is possible is of course thanks to the amazing breakpoints work in libadwaita last cycle.

    These designs aren’t 100% complete and we’d need to talk through some issues as part of the development process, but they provide enough guidance for development work to begin.

    System Monitor

    Another app modernisation effort that I’ve been working on this cycle is for GNOME’s System Monitor app. This was recently ported to GTK 4, which meant that it was a good time to think about where to take the user experience next.

    It’s true that there are other resource monitoring apps out there, like Usage, Mission Center, or Resources. However, I thought that it was important for the existing core app to have input from the design team. I also thought that it was important to put time into considering what a modern GNOME resource monitor might look like from a design perspective.

    While the designs were created in conversation with the system monitor developers (thank you Robert and Harry!) and I’d love to take them forward in that context, the ideas in the mockups are free for anyone to use and it would be great if any of the other available apps wanted to pick them up.

    A mockup of the system monitor app, showing CPU usage figures and a list of apps

    One of the tricky aspects of the system monitor design is how to accommodate different types of usage. Many users just need a simple way to track down and stop runaway apps and processes. At the same time, the system monitor can also be used by developers in very specific or nuanced ways, such as to look in close detail at a particular process, or to examine multithreading behaviour.

    A mockup of the system monitor app, showing CPU usage figures and a list of processes

    Rather than designing several different apps, the design attempts to reconcile these differing requirements by using disclosure. It starts off simply by default, with a series of small graphs that give a high-level overview and allow quickly drilling down to a problem app. However, if you want more fine-grained information, it isn’t hard to get to. For example, to keep a close eye on a particular type of resource, you can expand its chart to get a big view with more detail, or to see how multi-threading is working in a particular process, you can switch to the process view.

    Settings

    A gallery of mockups for the Settings app, including app settings, power settings, keyboard settings, and mouse & touchpad settings

    If my work on Videos and System Monitor has largely been speculative, my time on Settings has been anything but. As Felipe recently reported, there has been a lot of great activity around Settings recently, and I’ve been kept busy supporting that work from the design side. A lot of that has involved reviewing merge requests and responding to design questions from developers. However, I’ve also been active in developing and updating various settings designs. This has included:

    • Keyboard settings:
    • Region and language settings:
      • Updated the panel mockups
      • Modernised language dialog design (#202)
    • Apps settings:
      • Designed banners for when an app isn’t sandboxed (done)
      • Reorganised some of the list rows (#2829)
      • Designs for how to handle the flatpak-spawn permission (!949)
    • Mouse & touchpad settings:
    • Power settings:
      • Updated the style of the charge history chart (#1419)
      • Reorganised the battery charge threshold setting (#2553)
      • Prettier battery level display (#2707)

    Another settings area where I particularly concentrated this cycle was location services. This was prompted by a collection of issues that I discovered, where people’s location was being determined incorrectly. I was also keen to ensure that location discovery is a good fit for devices that don’t have many ways to detect the location (say, a desktop machine with no Wi-Fi).

    A mockup of the Settings app, showing the location settings with an embedded map

    This led to a round of design which proposed various things, such as adding a location preview to the panel (#2815) and portal dialog (#115), and some other polish fixes (#2816, #2817). As part of these changes, we’re also moving to rename “Location Services” to “Automatic Device Location”. I’d be interested to hear if anyone has any opinions on that, one way or another.

    Conclusion

    I hope this post has provided some insight into the kind of work that happens in GNOME design. It needs to be stressed that many of the designs that I’ve shared here are not being actively worked on, and may even never be implemented. That is part of what we do in GNOME design – we chart potential directions which the community may or may not decide to travel down. However, if you would like to help make any of these designs a reality, get in touch – I’d love to talk to you!

      Richard Hughes: Looking for LogoFAIL on your local system

      news.movim.eu / PlanetGnome • 9 January, 2024 • 4 minutes

    A couple of months ago, Binarly announced LogoFAIL, which is a pretty serious firmware security problem. There is lots of complexity that Alex explains much better than I might, but essentially the basics are that 99% of system firmware running right now is vulnerable: the horribly insecure parsing in the firmware allows the user to supply a corrupted OEM logo (the one normally shown as the system boots) to run whatever code they want, providing a really useful primitive for doing basically anything the attacker wants while running in a super-privileged boot state.

    Vendors have to release new firmware versions to address this, and OEMs using the LVFS have pumped out millions of updates over the last few weeks.

    So, what can we do to check that your system firmware has been patched [correctly] by the OEM? The only real way we can detect this is by dumping the BIOS in userspace, decompressing the various sections and looking at the EFI binary responsible for loading the image. In an ideal world we’d be able to look at the embedded SBoM entry for the specific DXE, but that’s not a universe we live in yet — although it is something I’m pushing the IBVs really hard to do. What we can do right now is token matching (or control flow analysis) to detect the broken and fixed image loader versions.

    The four words “decompressing the various sections” hide how complicated it actually is to take an Intel Flash Descriptor image and break it into EFI binaries. There are many levels of Matryoshka doll stacking involving hideous custom LZ77 and Huffman decompressors, and of course vendor-specific section types. It’s taken several programmer-months spread over the last few years to figure it all out. Programs like UEFITool do a very good job, but we need to do something super-lightweight (and paranoid) at every system boot as part of the HSI tests. We only really want to stream a few kB of SPI contents, not MB, as it’s actually quite slow and we only need a few hundred bytes to analyze.

    In Fedora 40 all the kernel parts are in place to actually get the image from userspace in a sane way. It’s a 100% read-only interface, so don’t panic about bricking your system. This is currently Intel-only — AMD wasn’t super-keen on allowing userspace read access to the SPI, even as root — even though it’s the same data you can get with a $2 SPI programmer and 30 seconds with a Pomona clip.

    Intel laptops and servers should both have an Intel PCI SPI controller — but some OEMs manually hide it for dubious reasons — and if that’s the case there’s nothing we can do, I’m afraid.

    You can help the fwupd project by contributing test firmware we can use to verify we parse it correctly, and to prevent regressions in the future. Please follow these steps only if:

    1. You have an Intel CPU laptop, desktop or server machine
    2. You’re running Fedora 39 (no idea about other distros, but you’ll need at least CONFIG_MTD_SPI_NOR, CONFIG_SPI_INTEL_PCI and CONFIG_SPI_MEM enabled in the kernel)
    3. You’re comfortable installing and removing a kernel on the command line
    4. There’s not already a test image for the same model provided by someone else
    5. You are okay with uploading your SPI contents to the internet
    6. You’re running the OEM-provided firmware, and not something like coreboot
    7. You’re aware that the firmware image we generate may have an encrypted version of your BIOS supervisor password (if set) and also all of the EFI attribute keys you’ve manually set, or that have been set by the various crash reporting programs.
    8. The machine is not a secure production system or a machine you don’t actually own.

    Okay, let’s get started:

    sudo dnf update kernel --releasever 40
    

    Then reboot into the new kernel, manually selecting the fc40 entry on the grub menu if required. We can check that the Intel SPI controller is visible.

    $ cat /sys/class/mtd/mtd0/name 
    BIOS
    

    Assuming it’s indeed BIOS and not some other random system MTD device, let’s continue.

    $ sudo cat /dev/mtd0 > lenovo-p1-gen4.bin
    

    The filename should be lowercase, have no spaces, and identify the machine you’re using — using the SKU if that’s easier.

    Then we want to compress it (as it will have a lot of 0xFF padding bytes) and encrypt it (otherwise GitHub will get most upset that you’re attaching something containing “binary code”):

    zip lenovo-p1-gen4.zip lenovo-p1-gen4.bin -e
    Enter password: fwupd
    Verify password: fwupd
    

    It’s easier if you use the password of “fwupd” (lowercase, no quotes) but if you’d rather send the image with a custom password just get the password to me somehow. Email, mastodon DM, carrier pigeon, whatever.

    If you’re happy sharing the image, please create an issue, attach the zip file, and then wait for me to download the file and close the issue. I also promise that I’m only using the provided images for testing fwupd IFD parsing, rather than anything more scary.

    Thanks!

      Andy Wingo: missing the point of webassembly

      news.movim.eu / PlanetGnome • 8 January, 2024 • 9 minutes

    I find most descriptions of WebAssembly to be uninspiring: if you start with a phrase like “assembly-like language” or a “virtual machine”, we have already lost the plot. That’s not to say that these descriptions are incorrect, but it’s like explaining what a dog is by starting with its circulatory system. You’re not wrong, but you should probably lead with the bark.

    I have a different preferred starting point which is less descriptive but more operational: WebAssembly is a new fundamental abstraction boundary . WebAssembly is a new way of dividing computing systems into pieces and of composing systems from parts.

    This all may sound high-falutin’, but it’s for real: this is the actually interesting thing about Wasm.

    fundamental & abstract

    It’s probably easiest to explain what I mean by example. Consider the Linux ABI: Linux doesn’t care what code it’s running; Linux just handles system calls and schedules process time. Programs that run against the x86-64 Linux ABI don’t care whether they are in a container or a virtual machine or “bare metal” or whether the processor is AMD or Intel or even a Mac M3 with Docker and Rosetta 2. The Linux ABI interface is fundamental in the sense that either side can implement any logic, subject to the restrictions of the interface, and abstract in the sense that the universe of possible behaviors has been simplified to a limited language, in this case that of system calls.

    Or take HTTP: when you visit wingolog.org, you don’t have to know (but surely would be delighted to learn) that it’s Scheme code that handles the request. I don’t have to care if the other side of the line is curl or Firefox or Wolvic. HTTP is such a successful fundamental abstraction boundary that at this point it is the default for network endpoints; whether you are a database or a golang microservice, if you don’t know that you need a custom protocol, you use HTTP.

    Or, to rotate our metaphorical compound microscope to high-power magnification, consider the System V amd64 C ABI: almost every programming language supports some form of extern C {} to access external libraries, and the best language implementations can produce artifacts that implement the C ABI as well. The standard C ABI splits programs into parts, and allows work from separate teams to be composed into a whole. Indeed, one litmus test of a fundamental abstraction boundary is, could I reasonably define an interface and have an implementation of it be in Scheme or OCaml or what-not: if the answer is yes, we are in business.

    It is in this sense that WebAssembly is a new fundamental abstraction boundary.

    WebAssembly shares many of the concrete characteristics of other abstractions. Like the Linux syscall interface, WebAssembly defines an interface language in which programs rely on host capabilities to access system features. Like the C ABI, calling into WebAssembly code has a predictable low cost. Like HTTP, you can arrange for WebAssembly code to have no shared state with its host, by construction.

    But WebAssembly is a new point in this space. Unlike the Linux ABI, there is no fixed set of syscalls: WebAssembly imports are named, typed, and without pre-defined meaning, more like the C ABI. Unlike the C ABI, WebAssembly modules have only the shared state that they are given; neither side has a license to access all of the memory in the “process”. And unlike HTTP, WebAssembly modules are “in the room” with their hosts: close enough that hosts can allow themselves the luxury of synchronous function calls, and to allow WebAssembly modules to synchronously call back into their hosts.

    applied teleology

    At this point, you are probably nodding along, but also asking yourself, what is it for? If you arrive at this question from the “WebAssembly is a virtual machine” perspective, I don’t think you’re well-equipped to answer. But starting as we did with the interface, I think we are better positioned to appreciate how WebAssembly fits into the computing landscape: the narrative is generative, in that you can explore potential niches by identifying existing abstraction boundaries.

    Again, let’s take a few examples. Say you ship some “smart cities” IoT device, consisting of a microcontroller that runs some non-Linux operating system. The system doesn’t have an MMU, so you don’t have hardware memory protections, but you would like to be able to enforce some invariants on the software that this device runs; and you would also like to be able to update that software over the air. WebAssembly is getting used in these environments; I wish I had a list of deployments at hand, but perhaps we can at least take this article from a WebAssembly IoT vendor last year as proof of commercial interest.

    Or, say you run a function-as-a-service cloud, meaning that you run customer code in response to individual API requests. You need to limit the allowable set of behaviors from the guest code, so you choose some abstraction boundary. You could use virtual machines, but that would be quite expensive in terms of memory. You could use containers, but you would like more control over the guest code. You could have these functions written in JavaScript, but that means that your abstraction is no longer fundamental; you limit your applicability. WebAssembly fills an interesting niche here, and there are a number of products in this space, for example Fastly Compute or Fermyon Spin.

    Or to go smaller, consider extensible software, like the GIMP image editor or VS Code: in the past you would use loadable plug-in modules via the C ABI, which can be quite gnarly, or lean into a particular scripting language, which can be slow and inexpressive, and limits the set of developers that can write extensions. It’s not a silver bullet, but WebAssembly can have a role here. For example, the Harfbuzz text shaping library supports fonts with an embedded (em-behdad?) WebAssembly extension to control how strings of characters are mapped to positioned glyphs.

    aside: what boundaries do

    They say that good fences make good neighbors, and though I am not quite sure it is true—since my neighbor put up a fence a few months ago, our kids don’t play together any more—boundaries certainly facilitate separation of functionality. Conway’s law is sometimes applied as a descriptive observation—ha-ha, isn’t that funny, they just shipped their org chart—but this again misses the point, in that boundaries facilitate separation, but also composition: if I know that I can fearlessly allow a font to run code because I have an appropriate abstraction boundary between host application and extension, I have gained in power. I no longer need to be responsible for every part of the product, and my software can scale up to solve harder problems by composing work from multiple teams.

    There is little point in using WebAssembly if you control both sides of a boundary, just as (unless you have chickens) there is little point in putting up a fence that runs through the middle of your garden. But where you want to compose work from separate teams, the boundaries imposed by WebAssembly can be a useful tool.

    narrative generation

    WebAssembly is enjoying a tail-wind of hype, so I think it’s fair to say that wherever you find a fundamental abstraction boundary, someone is going to try to implement it with WebAssembly.

    Again, some examples: back in 2022 I speculated that someone would “compile” Docker containers to WebAssembly modules, and now that is a thing.

    I think at some point someone will attempt to replace eBPF with Wasm in the Linux kernel; eBPF is just not as good a language as Wasm, and the toolchains that produce it are worse. eBPF has clunky calling conventions about what registers are saved and spilled at call sites, a decision that can be made more efficiently for the program and architecture at hand when register-allocating WebAssembly locals. (Sometimes people lean on the provably-terminating aspect of eBPF as its virtue, but that could apply just as well to Wasm if you prohibit the loop opcode (and the tail-call instructions) at verification-time.) And why don’t people write whole device drivers in eBPF? Or rather, targeting eBPF from C or what-have-you. It’s because eBPF is just not good enough. WebAssembly is, though! Anyway I think Linux people are too chauvinistic to pick this idea up but I bet Microsoft could do it.

    I was thinking today, you know, it actually makes sense to run a WebAssembly operating system, one which runs WebAssembly binaries. If the operating system includes the Wasm run-time, it can interpose itself at syscall boundaries, sometimes allowing it to avoid context switches. You could start with something like the Linux ABI, perhaps via WALI, but for a subset of guest processes that conform to particular conventions, you could build purpose-built composition that can allocate multiple WebAssembly modules to a single process, eliding inter-process context switches and data copies for streaming computations. Or, focussing on more restricted use-cases, you could make a microkernel; googling around I found this article from a couple days ago where someone is giving this a go.

    wwwhat about the wwweb

    But let’s go back to the web, where you are reading this. In one sense, WebAssembly is a massive web success, being deployed to literally billions of user agents. In another, it is marginal: people do not write web front-ends in WebAssembly. Partly this is because the kind of abstraction supported by linear-memory WebAssembly 1.0 isn’t a good match for the garbage-collected DOM API exposed by web browsers. As a corollary, languages that are most happy targeting this linear-memory model (C, Rust, and so on) aren’t good for writing DOM applications either. WebAssembly is used in auxiliary modules where you want to run legacy C++ code on user devices, or to speed up a hot leaf function, but isn’t a huge success.

    This will change with the recent addition of managed data types to WebAssembly, but not necessarily in the way that you might think. Like, now that it will be cheaper and more natural to pass data back and forth with JavaScript, are we likely to see Wasm/GC progressively occupying more space in web applications? For me, I doubt that progressive is the word. In the same way that you wouldn’t run a fence through the middle of your front lawn, you wouldn’t want to divide your front-end team into JavaScript and WebAssembly sub-teams. Instead I think that we will see more phase transitions, in which whole web applications switch from JavaScript to Wasm/GC, compiled from Dart or Elm or what have you. The natural fundamental abstraction boundary in a web browser is between the user agent and the site’s code, not within the site’s code itself.

    conclusion

    So, friends, if you are talking to a compiler engineer, by all means: keep describing WebAssembly as a virtual machine. It will keep them interested. But for everyone else, the value of WebAssembly is what it does, which is to be a different way of breaking a system into pieces. Armed with this observation, we can look at current WebAssembly uses to understand the nature of those boundaries, and look at new boundaries to see if WebAssembly can have a niche there. Happy hacking, and may your components always compose!

      Andy Wingo: scheme modules vs whole-program compilation: fight

      news.movim.eu / PlanetGnome • 5 January, 2024 • 8 minutes

    In a recent dispatch, I explained the whole-program compilation strategy used in Whiffle and Hoot. Today’s note explores what a correct solution might look like.

    being explicit

    Consider a module that exports an increment-this-integer procedure. We’ll use syntax from the R6RS standard:

    (library (inc)
      (export inc)
      (import (rnrs))
      (define (inc n) (+ n 1)))
    

    If we then have a program:

    (import (rnrs) (inc))
    (inc 42)
    

    Then the meaning of this program is clear: it reduces to (+ 42 1), then to 43. Fine enough. But how do we get there? How does the compiler compose the program with the modules that it uses (transitively), to produce a single output?

    In Whiffle (and Hoot), the answer is, sloppily. There is a standard prelude that initially has a number of bindings from the host compiler, Guile. One of these is +, exposed under the name %+, where the % in this case is just a warning to the reader that this is a weird primitive binding. Using this primitive, the prelude defines a wrapper:

    ...
    (define (+ x y) (%+ x y))
    ...
    

    At compilation-time, Guile’s compiler recognizes %+ as special, and therefore compiles the body of + as consisting of a primitive call (primcall), in this case to the addition primitive. The Whiffle (and Hoot, and native Guile) back-ends then avoid referencing an imported binding when compiling %+, and instead produce backend-specific code: %+ disappears. Most uses of the + wrapper get inlined, so %+ ends up generating code all over the program.

    The prelude is lexically splatted into the compilation unit via a pre-expansion phase, so you end up with something like:

    (let () ; establish lexical binding contour
      ...
      (define (+ x y) (%+ x y))
      ...
      (let () ; new nested contour
        (define (inc n) (+ n 1))
        (inc 42)))
    

    This program will probably optimize (via partial evaluation) to just 43. (What about let and define? Well. Perhaps we’ll get to that.)

    But again, here I have taken a short-cut, which is about modules. Hoot and Whiffle don’t really do modules, yet anyway. I keep telling Spritely colleagues that it’s complicated, and rightfully they keep asking why, so this article gets into it.

    is it really a big letrec?

    Firstly you have to ask, what is the compilation unit anyway? I mean, given a set of modules A, B, C and so on, you could choose to compile them separately, relying on the dynamic linker to compose them at run-time, or all together, letting the compiler gnaw on them all at once. Or, just A and B, and so on. One good-enough answer to this problem is the library-group form, which explicitly defines a set of topologically-sorted modules that should be compiled together. In our case, to treat the (inc) module together with our example program as one compilation unit, we would have:

    (library-group
      ;; start with sequence of libraries
      ;; to include in compilation unit...
      (library (inc) ...)
    
      ;; then the tail is the program that
      ;; might use the libraries
      (import (rnrs) (inc))
      (inc 42))
    

    In this example, the (rnrs) base library is not part of the compilation unit. Presumably it will be linked in, either as a build step or dynamically at run-time. For Hoot we would want the whole prelude to be included, because we don’t want any run-time dependencies. Anyway hopefully this would expand out to something like the set of nested define forms inside nested let lexical contours.

    And that was my instinct: somehow we are going to smash all these modules together into a big nested letrec , and the compiler will go to town. And this would work, for a “normal” programming language.

    But with Scheme, there is a problem: macros. Scheme is a “programmable programming language” that allows users to extend its syntax as well as its semantics. R6RS defines a procedural syntax transformer (“macro”) facility, in which the user can define functions that run on code at compile-time (specifically, during syntax expansion). Scheme macros manage to compose lexical scope from the macro definition with the scope at the macro instantiation site, by annotating these expressions with source location and scope information, and making syntax transformers mostly preserve those annotations.
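
    To see what “composing lexical scope” buys you, here is a textbook hygiene example (not from the article, just standard R6RS syntax-case): the temporary t introduced by the transformer cannot capture a t at the macro’s use site, because the expander keeps the two scopes apart.

    (define-syntax swap!
      (lambda (stx)
        (syntax-case stx ()
          ((_ a b)
           ;; t lives in the macro definition's scope; any t at the use
           ;; site is a different identifier as far as the expander cares
           #'(let ((t a))
               (set! a b)
               (set! b t))))))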

    “Macros are great!”, you say: well yes, of course. But they are a problem too. Consider this incomplete library:

    (library (ctinc)
      (import (rnrs) (inc))
      (export ctinc)
      (define-syntax ctinc
        (lambda (stx)
          ...)) ;; ***
    

    The idea is to define a version of inc, but at compile-time: a (ctinc 42) form should expand directly to 43, not a call to inc (or even +, or %+). We define syntax transformers with define-syntax instead of define. The right-hand-side of the definition ((lambda (stx) ...)) should be a procedure of one argument, which returns one value: so far so good. Or is it? How do we actually evaluate what (lambda (stx) ...) means? What should we fill in for ...? When evaluating the transformer value, what definitions are in scope? What does lambda even mean in this context?

    Well... here we butt up against the phasing wars of the mid-2000s. R6RS defines a whole system to explicitly declare what bindings are available when, then carves out a huge exception to allow for so-called implicit phasing, in which the compiler figures it out on its own. In this example we imported (rnrs) for the default phase, and this is the module that defines lambda (and indeed define and define-syntax). The standard defines that (rnrs) makes its bindings available both at run-time and expansion-time (compilation-time), so lambda means what we expect it to. Whew! Let’s just assume implicit phasing, going forward.

    The operand to the syntax transformer is a syntax object: an expression annotated with source and scope information. To pick it apart, R6RS defines a pattern-matching helper, syntax-case. In our case ctinc is unary, so we can begin to flesh out the syntax transformer:

    (library (ctinc)
      (import (rnrs) (inc))
      (export ctinc)
      (define-syntax ctinc
        (lambda (stx)
          (syntax-case stx ()
            ((ctinc n)
             (inc n)))))) ;; ***
    

    But here there’s a detail, which is that when syntax-case destructures stx to its parts, those parts themselves are syntax objects which carry the scope and source location annotations. To strip those annotations, we call the syntax->datum procedure, exported by (rnrs).

    (library (ctinc)
      (import (rnrs) (inc))
      (export ctinc)
      (define-syntax ctinc
        (lambda (stx)
          (syntax-case stx ()
            ((ctinc n)
             (inc (syntax->datum #'n)))))))
    

    And with this, voilà our program:

    (library-group
      (library (inc) ...)
      (library (ctinc) ...)
      (import (rnrs) (ctinc))
      (ctinc 42))
    

    This program should pre-expand to something like:

    (let ()
      (define (inc n) (+ n 1))
      (let ()
        (define-syntax ctinc
          (lambda (stx)
            (syntax-case stx ()
              ((ctinc n)
               (inc (syntax->datum #'n))))))
        (ctinc 42)))
    

    And then expansion should transform (ctinc 42) to 43. However, our naïve pre-expansion is not good enough for this to be possible. If you ran this in Guile you would get an error:

    Syntax error:
    unknown file:8:12: reference to identifier outside its scope in form inc
    

    Which is to say, inc is not available as a value within the definition of ctinc. ctinc could residualize an expression that refers to inc, but it can’t use it to produce the output.
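
    To make the distinction concrete, here is a minimal sketch of that residualizing variant, using only forms already shown above: instead of calling inc while expanding, the transformer emits a call to inc into the expanded program, which is always allowed.

    (library (ctinc)
      (import (rnrs) (inc))
      (export ctinc)
      (define-syntax ctinc
        (lambda (stx)
          (syntax-case stx ()
            ((ctinc n)
             ;; residualize: emit (inc n) into the output program,
             ;; rather than computing the result at expand-time
             #'(inc n))))))

    Of course, this version no longer does its work at compile-time, which is exactly the trade-off the error message above is pointing at.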

    modules are not expressible with local lexical binding

    This brings us to the heart of the issue: with procedural macros, modules impose a phasing discipline on the expansion process. Definitions from any given module must be available both at expand-time and at run-time. In our example, ctinc needs inc at expand-time, which is an early part of the compiler that is unrelated to any later partial evaluation by the optimizer. We can’t make inc available at expand-time just using let/letrec bindings.

    This is an annoying result! What do other languages do? Well, mostly they aren’t programmable, in the sense that they don’t have macros. There are some ways to get programmability using e.g. eval in JavaScript, but these systems are not very amenable to “offline” analysis of the kind needed by an ahead-of-time compiler.

    For those declarative languages with macros, Scheme included, I understand the state of the art is to expand module-by-module and then stitch together the results of expansion later, using a kind of link-time optimization. You visit a module’s definitions twice: once to evaluate them while expanding, resulting in live definitions that can be used by further syntax expanders, and once to residualize an abstract syntax tree, which will eventually be spliced into the compilation unit.

    Note that in general the expansion-time and the residual definitions don’t need to be the same, and indeed during cross-compilation they are often different. If you are compiling with Guile as host and Hoot as target, you might implement cons one way in Guile and another way in Hoot, choosing between them with cond-expand.
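
    As a rough illustration of that idea (a sketch only: the feature identifiers hoot and guile are assumptions here, not necessarily what either implementation actually advertises), cond-expand lets a single binding have different definitions depending on which implementation is expanding the code:

    ;; hypothetical sketch: choose a definition per implementation
    (cond-expand
      (hoot                               ; expanding for the Wasm target
       (define (current-platform) 'hoot))
      (guile                              ; expanding on the host compiler
       (define (current-platform) 'guile))
      (else
       (define (current-platform) 'unknown)))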

    lexical scope regained?

    What is to be done? Glad you asked, Vladimir. But, I don’t really know. The compiler wants a big blob of letrec, but the expander wants a pearl-string of modules. Perhaps we try to satisfy them both? The library-group paper suggests that modules should be expanded one by one, then stitched into a letrec by AST transformations. It’s not that lexical scope is incompatible with modules and whole-program compilation; the problems arise when you add in macros. So by expanding first, in units of modules, we reduce high-level Scheme to a lower-level language without syntax transformers, but still on the level of letrec.
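
    For the running example, the stitched output would be roughly this (a sketch that glosses over the details of the AST transformation): the ctinc transformer has already run during per-module expansion and been discarded, so only the run-time definitions get spliced into the letrec.

    (letrec ((inc (lambda (n) (+ n 1))))
      ;; (ctinc 42) was rewritten to 43 at expand-time
      43)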

    I was unreasonably pleased by the effectiveness of the “just splat in a prelude” approach, and I will miss it. I even pled for a kind of stop-gap fat-fingered solution to sloppily parse module forms and keep on splatting things together, but colleagues helpfully talked me away from the edge. So good-bye, sloppy: I repent my ways and will make amends, with 40 hail-maries and an alpha renaming thrice daily and more often if in moral distress. Further bulletins as events warrant. Until then, happy scheming!