
Building in Public: The Architecture of a Solo Rust Project

11 min read
Personal · Engineering

I'm a solo developer building an open source Rust project, and I want to talk about what that actually looks like. Not the polished "launched on Product Hunt and got 500 stars" version, but the real one — the architecture decisions made at midnight, the bugs that took days, and the strange irony of using AI to build AI tooling.

Drengr started as a research question: can I give an AI agent a phone? No venture capital, no team, no timeline pressure. Just curiosity and a problem that felt important enough to spend months on. Building in public means sharing the journey honestly, including the parts that don't look impressive.

Why I'm Building This Alone

The honest answer is that this project started before I knew it was a project. I was frustrated that Claude could write mobile code but couldn't interact with mobile devices. I hacked together a Python script that captured screenshots and sent them to the API with action instructions. It worked badly, but it worked. That script became a prototype, the prototype became an architecture, and the architecture demanded Rust.

At no point did I sit down and say "I'm going to build a product." I kept solving the next problem. The next problem kept being interesting. Six months later, I have about 6,300 lines of Rust, a working MCP server, and a tool that other developers are starting to use.

Solo development has real trade-offs. I don't have anyone to review my code. I don't have anyone to challenge my architectural decisions. When I make a mistake, there's no one to catch it until a user reports a bug. The upside is speed — I can refactor the entire transport layer on a Saturday without scheduling a meeting.

The Architecture

Drengr's architecture is built around one core abstraction: the transport layer.

Transport Trait

A single Rust trait defines what it means to "talk to a device":

trait Transport {
    fn capture_screen(&self) -> Result<Screenshot>;
    fn get_ui_tree(&self) -> Result<Vec<UiElement>>;
    fn execute_action(&self, action: Action) -> Result<()>;
    fn query_state(&self, query: Query) -> Result<StateResponse>;
}

Three implementations exist: ADB for Android devices and emulators, simctl for iOS simulators, and Appium for cloud device farms. Each speaks a completely different protocol: ADB mixes shell commands with binary protocols, simctl wraps Apple's own command-line tooling, and Appium speaks HTTP/WebDriver.

The rest of the codebase doesn't know or care which one is active. The MCP handler, the OODA loop, the screen annotation system — they all work through the trait. Adding a new platform means implementing four methods.

MCP Handler

The MCP server reads JSON-RPC from stdin and writes responses to stdout. This sounds simple until you realize that the device interactions also write to stdout (ADB commands, for instance, produce output). One of my earliest architectural decisions was redirecting child process I/O to avoid polluting the MCP channel.

The handler routes incoming tool calls to one of three paths: drengr_look triggers a screen capture and UI tree extraction, drengr_do dispatches an action to the transport layer, and drengr_query reads state without side effects.

Screen Annotation

When the agent calls drengr_look, it doesn't just get a raw screenshot. Drengr extracts the UI hierarchy, identifies interactive elements, assigns each a number, and returns both the annotated screenshot and the element metadata. The agent can then say "tap element 7" instead of "tap at coordinates (342, 891)."

This annotation system is more important than it might seem. It bridges the gap between how the AI perceives the screen (as a visual field) and how the device accepts input (as structured commands). Without it, every interaction requires the agent to estimate pixel coordinates from visual inspection, which is unreliable.

What 6,300 Lines of Rust Taught Me

The compiler is your strictest code reviewer. I've lost count of the number of times the borrow checker rejected code that I was confident was correct, only to realize on reflection that it was catching a real problem. Not always a bug — sometimes a design issue. "You can't hold a mutable reference to the transport while also iterating over its UI tree results" is the compiler's way of saying "your data flow is tangled."

If it compiles, it probably works. This cliché has limits (logic errors still exist, and integration tests still matter), but the density of runtime bugs per line of code is lower than anything I've experienced in other languages. When I do hit a bug, it's almost always in my logic, not in my memory management, error handling, or concurrency model.

Ownership semantics forced better architecture. In Python or JavaScript, I'd have passed the transport connection around freely, probably storing references in three different places. Rust forced me to think about who owns the connection and who borrows it. That constraint produced a cleaner architecture than I would have designed voluntarily.

The Hardest Bug

MCP over stdio means Drengr reads JSON-RPC requests from stdin and writes responses to stdout. Simple enough — until you spawn an ADB shell command that also writes to stdout.

The first time this happened, the MCP client received a response that started with a valid JSON-RPC frame, continued with "List of devices attached," and then had another JSON-RPC frame. The client understandably choked.

The fix required redirecting all child process stdout to /dev/null or to a captured buffer, using os::unix::io and dup2 to manage file descriptors at the system call level. It's about 30 lines of code. It took me two full days to debug, because the symptoms were intermittent — ADB only writes to stdout under certain conditions, so the MCP corruption was sporadic.

This is the kind of bug that doesn't exist in simpler architectures. If Drengr were an HTTP server instead of a stdio server, the problem would never have arisen. But MCP over stdio is the standard for local tool servers, and for good reason — it's simpler for the client, requires no port management, and works inside sandboxed environments. The complexity is justified; the bug was the price of admission.

The Irony of Using AI to Build AI Tooling

I use Claude Code daily to work on Drengr. Claude helps me write the code that teaches Claude to use phones. The recursion is not lost on me.

It's genuinely productive. Claude is good at Rust — it understands ownership patterns, suggests idiomatic approaches, and catches issues I miss. When I was implementing the situation engine, Claude helped me think through the state comparison logic. When I was wrestling with async trait objects, Claude explained the Pin<Box<dyn Future>> pattern in a way that finally clicked.

The irony runs deeper, though. Every improvement I make to Drengr makes Claude slightly better at interacting with mobile devices. A better screen annotation system means Claude gets better information. A better situation engine means Claude makes fewer mistakes. I'm building a tool that improves the capability of the AI that helps me build the tool.

I don't think this is unique to my project. Every developer using AI to build AI tools is in this feedback loop. But working on it daily makes the loop very tangible.

What's on the Roadmap

Three things I'm actively working on:

  • Dashboard. A web interface for visualizing test runs, reviewing agent decisions, and correlating UI actions with network traffic. The technical spec is written; implementation is next.
  • Real-time steering. The ability to watch an agent run and redirect it mid-session. "Stop exploring settings, go test the checkout flow instead." This requires a WebSocket connection between the dashboard and the running Drengr process.
  • Network monitoring. An SDK that apps can integrate to capture network traffic during Drengr sessions. This lets the dashboard show what API calls happened alongside each UI action — invaluable for debugging integration issues.

How to Get Involved

Drengr is on GitHub at github.com/SharminSirajudeen/drengr. If you're interested in contributing, the areas where I'd most appreciate help are:

  • iOS transport layer. The simctl implementation is functional but less mature than the ADB transport. Anyone with deep iOS tooling experience would dramatically accelerate this.
  • Testing on diverse devices. I develop on a limited set of emulator configurations. Reports of how Drengr behaves on different Android versions, screen sizes, and manufacturer overlays are extremely valuable.
  • Prompt engineering for test scenarios. The quality of Drengr's autonomous testing depends heavily on how the goal is expressed. I'm collecting effective prompts and would love contributions.

Or just try it. curl -fsSL https://drengr.dev/install.sh | bash. Connect a device. Point Claude at it. Tell me what happens.

The best feedback isn't "great project." It's "I tried this and it broke." That's how the tool gets better.

Building in public means accepting that people will see the rough edges. I'm okay with that. The rough edges are where the interesting problems live.