Movim • Christian Hergert: mi2-glib

news.movim.eu / PlanetGnome • 3 October • 7 minutes

At Red Hat we are expected to set, and meet, goals each quarter if we want our “full” bonus. One of those is around introducing AI into our daily work. You’ve probably seen various Red Hat employees talking about AI because there is financial incentive to do so.

Astute students of behavioral science know that humans work harder to not lose something than to gain something new. So perhaps for many it’s more about not losing that revenue stream. Arguably it’s only a “mandate to use AI” if you are entitled to the revenue, so coming for your bonus is a convenient way to both “not be a mandate” and take advantage of the human tendency to avoid losing something.

Conveniently, I got pretty fed up with Debug Adapter Protocol (DAP for short) this week. So I wondered how hard it would be to just talk the MI2 protocol like Builder does (using the GPLv3+ gdbwire) but with a libdex-oriented library. It could potentially make integrating GDB into Foundry much easier.

Red Hat has so generously given us “ephemeral credits” at Cursor.com to play with so I fired up the cursor-agent command line tool and got started.

My goal here is just to make a minimal API that speaks the protocol and exposes it as DexFuture with a very shallow object model. The result of that is mi2-glib. This post is mostly about how to go about getting positive results from the bias machine.

Writing a fully featured library in a couple hours is certainly impressive, regardless of how you feel about AI’s contributions to it.

Before I continue, I want to mention that this is not used in Foundry and the GDB support which landed today is using DAP. This was just an experiment to explore alternatives.

Create Project Scaffolding

To get the project started I used Foundry’s shared library template. It creates a C-based project with pkg-config, GObject Introspection support, Vala *.vapi generation, versioning with ABI mechanics, and gi-docgen based documentation. Basically all the tricky things just work out of the box.

You could probably teach the agent to call this command but I chose to do it manually.

foundry template create library
Location[/home/christian]: .

The name for your project which should not contain spaces
Project Name: mi2-glib

The namespace for the library such as "Mylib"
Namespace: Mi2
 1: No License
 2: AGPL 3.0 or later
 3: Apache 2.0
 4: EUPL 1.2
 5: GPL 2.0 or later
 6: GPL 3.0 or later
 7: LGPL 2.1 or later
 8: LGPL 3.0 or later
 9: MIT
10: MPL 2.0
License[7]:
Version Control[yes]: 
mi2-glib/meson.options
mi2-glib/meson.build
mi2-glib/lib/meson.build
mi2-glib/testsuite/meson.build
mi2-glib/lib/mi2-glib.h
mi2-glib/lib/mi2-version.h.in
mi2-glib/lib/mi2-version-macros.h
mi2-glib/testsuite/test-mi2-glib.c
mi2-glib/doc/mi2-glib.toml.in
mi2-glib/doc/urlmap.js
mi2-glib/doc/overview.md
mi2-glib/doc/meson.build
mi2-glib/README.md
mi2-glib/LICENSE
mi2-glib/.foundry/.gitignore
mi2-glib/.foundry/project/settings.keyfile
mi2-glib/.git/objects
mi2-glib/.git/refs/heads
mi2-glib/.git/HEAD

Building Context

After that I added an AGENTS.md file that described how I want code to be written. I gave it some examples of my C style (basically opinionated GTK styling). That means things like preferring autoptr over manual memory management, how to build, how to test, etc. Consider this the “taste-making” phase.

The next thing I needed to do was to ensure that it has enough examples to be able to write libdex-based GObject code. So I copied over the markdown docs from libdex and some headers. Just enough for the agent to scan and find examples to prime how it should be used. Update your AGENTS.md to note where that documentation can be found and what it can expect to find there. That way the agent knows to look for more details.

Just as a matter of practicality, it would be great if we could have gi-docgen generate markdown-formatted descriptions of APIs rather than just HTML.

Now that is all good and fun, but this is about mi2 now isn’t it? So we need to teach the agent about what mi2 is and how it works. Only then will it be able to write something useful for us.

GDB conveniently has a single-file version of the documentation so we need not piss off any network admins with recursive wget.

That left me with a big HTML document with lots of things that don’t matter. I wanted something more compact that will better fit into the context window of the agent. So I asked the agent to look over the HTML file and amalgamate a single gdb.md containing the documentation in a semi-structured format with markdown. That is of course still a lot of information.

Next, I went another level deeper where I asked it to extract from gdb.md information about the mi2 protocol. Any command in the mi2 protocol or related command in GDB is important and therefore should be included in a new mi2.md document. This will serve as the basis of our contextual knowledge.

Knowing the Problem Domain

Using an agent is no replacement for knowing the problem domain. Thankfully, I’ve written dozens of socket clients/servers using GIO primitives. So I can follow my normal technique.

First we create a client class which will manage our GIOStream. That may be a GSocketConnection or perhaps even a GSimpleIOStream containing memory streams or file streams to enable unit testing.

Then we move on to a GDataInputStream which can read the format. We ask the agent to subclass appropriately and provide a DexFuture based wrapper which can read the next message. It should have enough context at this point to know what needs to be read. Start simple, just reading the next message as a bag of bytes. We parse it into structured messages later.
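Once a line has been read as a bag of bytes, the first useful step is usually deciding what kind of record it is. In the GDB/MI output syntax each record may start with an optional numeric token, followed by a single character that identifies the record type, and a "(gdb)" line ends each group of records. The following is a minimal sketch of that classification in plain C. The names Mi2RecordKind and mi2_classify_line are hypothetical and chosen only for illustration; they are not taken from mi2-glib.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Hypothetical names for illustration only; not the mi2-glib API. */
typedef enum {
  MI2_RECORD_UNKNOWN,
  MI2_RECORD_RESULT,        /* ^done, ^running, ^error, ... */
  MI2_RECORD_EXEC_ASYNC,    /* *stopped, *running */
  MI2_RECORD_STATUS_ASYNC,  /* +... */
  MI2_RECORD_NOTIFY_ASYNC,  /* =thread-created, ... */
  MI2_RECORD_CONSOLE,       /* ~"..." */
  MI2_RECORD_TARGET,        /* @"..." */
  MI2_RECORD_LOG,           /* &"..." */
  MI2_RECORD_PROMPT         /* (gdb) */
} Mi2RecordKind;

/* Classify one line of GDB/MI output by its leading character. */
Mi2RecordKind
mi2_classify_line (const char *line)
{
  assert (line != NULL);

  /* The prompt line terminates a group of records. */
  if (strncmp (line, "(gdb)", 5) == 0)
    return MI2_RECORD_PROMPT;

  /* Skip the optional numeric token that pairs a reply with a command.
   * This sketch is lenient: it also accepts digits before stream
   * records, which the grammar does not allow. */
  while (isdigit ((unsigned char) *line))
    line++;

  switch (*line)
    {
    case '^': return MI2_RECORD_RESULT;
    case '*': return MI2_RECORD_EXEC_ASYNC;
    case '+': return MI2_RECORD_STATUS_ASYNC;
    case '=': return MI2_RECORD_NOTIFY_ASYNC;
    case '~': return MI2_RECORD_CONSOLE;
    case '@': return MI2_RECORD_TARGET;
    case '&': return MI2_RECORD_LOG;
    default:  return MI2_RECORD_UNKNOWN;
    }
}
```

A read loop could call this on every line and use the result to decide whether the line completes an in-flight command (a result record) or is an unsolicited event.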

After that do the same for a GDataOutputStream subclass. Once we have that we can ask the agent to wrap the input/output streams inside the client. All pretty easy stuff for it to get right.

Where I found it much easier to write things myself was dealing with the mechanics of a read-loop that will complete in-flight operations. Small price to pay to avoid debugging the slop machine.

After these mechanics are in place we can start on the message type. I knew from my previous debugger work in Builder that there would be a recursive result type and a toplevel container for them. I decided to name these Message and Result and then let the agent generate them. Even the first try produced something “usable”, though I would not call it very tasteful. It kept trying to do manual memory management, and it left lots of “TODO” comments in place rather than actually doing the thing.
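To make the “recursive result type” concrete: in the GDB/MI grammar a value is either a constant string, a tuple of named results, or a list, and tuples and lists can nest. The sketch below shows one possible shape for such a tree in plain C. All names (Mi2Value, mi2_value_new, and so on) are hypothetical and do not come from mi2-glib. It also uses manual malloc/free only to stay free of dependencies; a GObject-based library would more likely use reference counting and g_autoptr.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical types for illustration only; not the mi2-glib API. */
typedef enum { MI2_VALUE_CONST, MI2_VALUE_TUPLE, MI2_VALUE_LIST } Mi2ValueKind;

typedef struct Mi2Value Mi2Value;
struct Mi2Value {
  Mi2ValueKind kind;
  char *name;           /* variable name for "name=value", or NULL */
  char *text;           /* payload when kind == MI2_VALUE_CONST */
  Mi2Value **children;  /* members of a tuple or list */
  size_t n_children;
};

/* Copy a string, aborting on allocation failure; NULL stays NULL. */
static char *
dup_string (const char *s)
{
  size_t len;
  char *copy;

  if (s == NULL)
    return NULL;
  len = strlen (s) + 1;
  copy = malloc (len);
  if (copy == NULL)
    abort ();
  memcpy (copy, s, len);
  return copy;
}

Mi2Value *
mi2_value_new (Mi2ValueKind kind, const char *name, const char *text)
{
  Mi2Value *v = calloc (1, sizeof *v);
  if (v == NULL)
    abort ();
  v->kind = kind;
  v->name = dup_string (name);
  v->text = dup_string (text);
  return v;
}

/* Append child to a tuple or list; parent takes ownership of child. */
void
mi2_value_append (Mi2Value *parent, Mi2Value *child)
{
  Mi2Value **grown;

  assert (parent != NULL && parent->kind != MI2_VALUE_CONST);
  assert (child != NULL);
  grown = realloc (parent->children,
                   (parent->n_children + 1) * sizeof *grown);
  if (grown == NULL)
    abort ();
  parent->children = grown;
  parent->children[parent->n_children++] = child;
}

/* Find a direct child by name; returns NULL when absent. */
const Mi2Value *
mi2_value_lookup (const Mi2Value *parent, const char *name)
{
  for (size_t i = 0; i < parent->n_children; i++)
    {
      const Mi2Value *child = parent->children[i];
      if (child->name != NULL && strcmp (child->name, name) == 0)
        return child;
    }
  return NULL;
}

/* Free a value and, recursively, everything it owns. */
void
mi2_value_free (Mi2Value *v)
{
  if (v == NULL)
    return;
  for (size_t i = 0; i < v->n_children; i++)
    mi2_value_free (v->children[i]);
  free (v->children);
  free (v->name);
  free (v->text);
  free (v);
}
```

With this shape, a reply such as ^done,bkpt={number="1",type="breakpoint"} becomes a tuple named "bkpt" with two constant children, and a toplevel message only needs to hold the result class plus a list of such values.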

Instead of trying to describe to the agent how its implementation was ugly and incomplete I realized my time would be better spent letting it figure it out. So I shifted my focus towards a test suite.

Setting up a Test Suite

I saw the biggest improvement from the agent after I set up a test suite. This allowed the agent to run ninja test to check the results. That way it wasn’t just about satisfying GCC compiler warnings but also about the code actually being functional.

I would start this as early as you can because it pays dividends. Make sure you provide both good and bad inputs to help guide it towards a better implementation.

The biggest win was creating tests for the input stream. Create different files with good and bad input. Since the stream wraps a base stream it is pretty easy to mock that up with a GFileInputStream or GMemoryInputStream and verify the results. If the tests provide enough context through debug messages the agent can feed that back into why the error occurred. This made the iteration loop remarkably fast.
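The same good-input/bad-input idea can be shown in miniature with a table-driven check. The sketch below validates a deliberately simplified version of an MI c-string (a double-quoted string with backslash escapes) and prints a message for every case that does not match its expectation, which is the kind of context an agent can feed back into its next attempt. The names mi2_cstring_is_valid and mi2_run_cases are hypothetical, and the validator does not check which escape sequences are legal; it uses plain C rather than GLib’s test framework only to stay self-contained.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: true if s is one complete double-quoted string.
 * Simplified: any character may follow a backslash. */
bool
mi2_cstring_is_valid (const char *s)
{
  if (s == NULL || *s != '"')
    return false;

  s++;
  while (*s != '\0')
    {
      if (*s == '\\')
        {
          /* A backslash must be followed by another character. */
          if (s[1] == '\0')
            return false;
          s += 2;
          continue;
        }
      /* An unescaped quote must be the final character. */
      if (*s == '"')
        return s[1] == '\0';
      s++;
    }

  return false; /* ran out of input before the closing quote */
}

typedef struct {
  const char *input;
  bool expect_valid;
} Mi2Case;

static const Mi2Case cases[] = {
  /* good inputs */
  { "\"\"",            true  },
  { "\"hello\"",       true  },
  { "\"a\\\"b\"",      true  },
  /* bad inputs */
  { "",                false },
  { "hello",           false },
  { "\"unterminated",  false },
  { "\"trailing\\\"",  false },
  { "\"x\"y",          false },
};

/* Run every case; returns the number of mismatches and logs each one. */
size_t
mi2_run_cases (void)
{
  size_t failures = 0;

  for (size_t i = 0; i < sizeof cases / sizeof cases[0]; i++)
    {
      bool got = mi2_cstring_is_valid (cases[i].input);

      if (got != cases[i].expect_valid)
        {
          fprintf (stderr, "case %zu: input <%s> expected %s, got %s\n",
                   i, cases[i].input,
                   cases[i].expect_valid ? "valid" : "invalid",
                   got ? "valid" : "invalid");
          failures++;
        }
    }

  return failures;
}
```

Keeping the cases in a table makes it cheap to add a new bad input whenever the agent’s implementation turns out to accept something it should not.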

Another extremely valuable thing to do is to set up some type of sanitizer in your build configuration. I often build with meson configure -Db_sanitize=address to help catch memory errors. It also helps to assert object finalization in your testsuite so that leaks don’t go unnoticed.

Anyway, writing this post took about as long as designing the core parts of the library and that, I guess in some small way, is pretty neat.