Software for myself

Thanks to the coding agents, it’s easier than ever to create small pieces of software.

I’ve been creating small apps, tools, and toys for myself. Last week I posted about Goblin Mode, my tool for spinning up development VMs. Let’s take a look at what else I’ve made!

Small web apps

I made a photo gallery for sharing my photos online. The app is called kuvasivu (Finnish for “picture page”) and it’s implemented in Rust. I’m self-hosting it on a Hetzner virtual server.

I’ve been meaning to put more of my photos online, but there hasn’t been an obvious place for them. There are dozens of decent photo gallery projects and services, but it felt easier to just prompt one into existence and self-host it.¹

I also made a web app for scheduling meetings with my friends. You create a poll with a few dates as options and everyone fills in their availability. If anyone still remembers Doodle, it’s like that but it does not have ads. It’s called beet-scheduler and it, too, is implemented in Rust.

Again, there are plenty of alternatives out there. You could use Tapaaminen.net (a service) for example. It’s probably AI-free.

Also, I made a dashboard that shows how often I climb and run. I record my runs on SmashRun and my climbs in a spreadsheet. The dashboard downloads the data directly from them. I like to use it to check that I’m exercising the Goldilocks amount.

Small games

The coding agents are pretty good at one-shotting small games. I like to try out new models by asking them to create a snake game.

I tried to create a QWOP-like game for paddling a kayak. Back in 2023, I did a nine-day kayaking trip on Lake Saimaa. It was an emotional rollercoaster, type 2 fun, and I wanted to make a game to commemorate it.

Unfortunately I didn’t get the LLM to understand how kayaks work quickly enough, and I couldn’t get the controls to click. The game was certainly frustrating but not especially fun. Type 3 fun maybe?

Implementing research

A while ago I heard about a data structure called the Bw^e tree. It’s an evolution of the B+ tree that is optimized for current storage solutions. You could use it to implement a database index, for example.

The paper was published by Alibaba and they’re using Bw^e trees in production in some of their database services, but they haven’t open-sourced their implementation.

I thought, why not simply prompt an implementation into existence? A couple of days with Claude Code and Opus 4.5 (this was in January) and I had some 10k lines of Rust. It even has YCSB-based benchmarks where it beats RocksDB, just like the paper said it would!

However… I don’t know how to verify the implementation. I have skimmed the paper, but I don’t have an in-depth understanding of it, and 10k lines is a lot of code. For example, the structural modification operations like page split look subtle, and they’re crucial for correct concurrent operation.

According to the paper, Alibaba’s C++ implementation is 33k lines, which makes the 10k number suspiciously low.

I’m unlikely to publish the code as I don’t know what’s in there. If you need it, spend two days with Claude, or implement it by hand. You’ll do a better job.

Speeding up compression algorithms

pi-autoresearch is an autoresearch plugin for the Pi agent. It prompts the agent to autonomously experiment to improve some numeric metric about the codebase.

That seems cool, so I tried it on floatbungler. Last year, I was studying lightweight float compression algorithms like Gorilla. I implemented a few of them (by hand! I wanted to understand them!) and put them into a Python library.
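
For a rough idea of what Gorilla-style compression builds on, here is a minimal sketch of its XOR step. This is not floatbungler’s code: the function names are made up for illustration, and the actual bit-packing is left out. The idea is that consecutive floats in a time series often share most of their bits, so XORing each value with the previous one leaves long runs of zeros that are cheap to encode.

```python
import struct


def float_to_bits(x: float) -> int:
    """Reinterpret a 64-bit float as an unsigned 64-bit integer."""
    return struct.unpack(">Q", struct.pack(">d", x))[0]


def xor_deltas(values):
    """Return (xor, leading_zeros, trailing_zeros) for each value after the first.

    Hypothetical helper for illustration. A real encoder would go on to
    write only the meaningful bits between the two runs of zeros.
    """
    prev = float_to_bits(values[0])
    out = []
    for v in values[1:]:
        cur = float_to_bits(v)
        x = prev ^ cur
        if x == 0:
            # Identical value: by convention here, all 64 bits count as zeros.
            out.append((0, 64, 64))
        else:
            leading = 64 - x.bit_length()
            # x & -x isolates the lowest set bit.
            trailing = (x & -x).bit_length() - 1
            out.append((x, leading, trailing))
        prev = cur
    return out
```

A repeated value XORs to zero, which Gorilla can store in a single bit. For other values, only the 64 - leading - trailing bits in the middle need to be written.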

I didn’t consider the performance at all when I wrote them, so I figured I could unleash pi-autoresearch on the codebase and it would find ways to speed them up. It sort of worked. You can see an example of what it did for Chimp128. It turned the code into a mess, but the runtime benchmark did improve almost 3x.

There’s no need to make floatbungler faster, so I’m not going to merge the changes. Better keep it simple and understandable in case I want to look at it again.

Playing around with pi-autoresearch revealed some flaws in the implementations and the test suite, so I did fix those. The coding agents seem to be terrible at debugging the library, possibly because I haven’t built any debugging tools.

I did not notice attempts to cheat. The agent did try some changes that broke the algorithm, but after the tests failed, it rolled them back.
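
The keep-or-roll-back behaviour described above can be sketched roughly like this. It is a toy model under my own assumptions, not pi-autoresearch’s actual API or implementation: `keep_or_roll_back`, `passes_tests`, and `cost` are hypothetical names, and the candidates are plain values standing in for code changes.

```python
def keep_or_roll_back(baseline, candidates, passes_tests, cost):
    """Try each candidate and keep it only if the tests pass and the metric improves.

    Hypothetical sketch: `passes_tests(candidate)` returns a bool and
    `cost(candidate)` returns a number where lower is better.
    """
    best = baseline
    best_cost = cost(best)
    for candidate in candidates:
        if not passes_tests(candidate):
            # The change broke something: discard it, keep the current best.
            continue
        candidate_cost = cost(candidate)
        if candidate_cost < best_cost:
            best, best_cost = candidate, candidate_cost
    return best


# Toy usage: the numbers stand in for benchmark runtimes, and the
# candidate with cost 3 is the one that "breaks the tests".
winner = keep_or_roll_back(
    baseline=10,
    candidates=[7, 3, 8],
    passes_tests=lambda c: c != 3,
    cost=lambda c: c,
)
print(winner)  # 7
```

The test suite is the only guard here, so the loop is only as trustworthy as the tests it runs.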

In conclusion

Crafting great products remains hard, but there’s a lot of fun in making software that’s good enough for exactly one person: yourself.

¹ I’ve worked with software engineers who seem to have an urge to rewrite all the code they work on into their own style. Maybe I’m becoming one of them.