Shipping a macOS App with AI-Governed Development
Actions: From Gap Identification to Production Product
24 beta releases in 18 days. Architecture migrations, security audits, refactoring passes, a freemium business model, and Apple code signing — all built by one person with an AI agent, governed by documents that make the speed possible.
The Short Version
I identified a gap in developer tooling and shipped a product into it.
Terminal commands live in shell history, scattered notes, and memory. Existing tools — Alfred, Raycast, Keyboard Maestro — are either too broad, too complex, or not terminal-native. So I designed and shipped a macOS menu bar app with a freemium business model, Apple code signing and notarization, auto-updates, encrypted secrets, and a marketing site.
I built it entirely in Claude Code using a documentation-first methodology: the architecture documents are thorough enough that the AI usually generates a correct implementation on the first attempt. The governance documents aren't project management overhead. They're the mechanism that keeps the AI agent productive across sessions, across refactors, and across architectural migrations.
This isn't a side project. It's a shipped product with a business model, a distribution pipeline, and a codebase maintained with the same discipline I'd apply to an enterprise system.
The Problem
Every developer's commands live in exactly three broken places.
Shell history
Ephemeral, unsearchable, lost when you clear it or switch machines. The command you need is somewhere between yesterday's typos and something you ran three weeks ago.
Scattered notes
A Notion page, a text file, a Slack message to yourself. You have commands documented in at least three places and can't remember which one has the deploy command.
Memory
Which fails at exactly the moment you need it. The flags, the paths, the arguments — you've run it a hundred times but right now you can't remember the exact syntax.
The existing tools don't solve this for terminal-focused developers.
Alfred and Raycast are general-purpose launchers where terminal workflow is buried under layers of functionality. Keyboard Maestro is powerful but complex — macro-oriented, not command-oriented. Shell aliases break down when you need variables, categorization, scheduling, or output capture. Apple Shortcuts has no real terminal integration.
The gap: a focused, terminal-native tool that gives your most-used commands a permanent home — without the overhead of a general-purpose automation platform.
Architecture Decisions
Every technical choice had a reason.
Electron, not native Swift.
The instinct for a macOS app is Swift and AppKit. I chose Electron for three reasons. First, Claude Code generates React/TypeScript implementations with higher first-pass accuracy than Swift. Second, the feature surface (floating output windows, ANSI color, CodeMirror editors, rich modal UIs) is readily available in a web stack. Third, a Windows port becomes feasible rather than a rewrite.
The tradeoff is app size and memory footprint. For a solo product built under AI governance, the technology that produces correct implementations fastest wins.
IPC as a security boundary.
Every piece of communication between main process and renderer goes through a preload bridge. The renderer never sees Node.js APIs. IPC handlers validate inputs — accent colors checked against whitelists, file paths sanitized, URLs validated before opening.
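A minimal sketch of the kind of input validation described above, not the app's actual code. The function names, the color whitelist, the allowed URL protocols, and the file-name rule are all assumptions chosen for illustration. In an Electron app, checks like these would typically run inside an `ipcMain.handle` callback before the main process acts on a renderer-supplied payload.

```typescript
// Hypothetical validators for renderer-supplied IPC payloads.
// Everything arriving over IPC is typed `unknown` until it passes a check.

// Assumed whitelist; the real list of accent colors is not given in the text.
const ALLOWED_ACCENT_COLORS = ["blue", "purple", "green", "orange"] as const;
type AccentColor = (typeof ALLOWED_ACCENT_COLORS)[number];

function parseAccentColor(input: unknown): AccentColor | null {
  if (typeof input !== "string") return null;
  const match = ALLOWED_ACCENT_COLORS.find((color) => color === input);
  return match ?? null;
}

// Assumed policy: only these protocols may be handed to the OS to open.
const ALLOWED_URL_PROTOCOLS = ["https:", "mailto:"];

function parseExternalUrl(input: unknown): string | null {
  if (typeof input !== "string") return null;
  try {
    const url = new URL(input);
    return ALLOWED_URL_PROTOCOLS.indexOf(url.protocol) !== -1 ? url.href : null;
  } catch {
    return null; // not a parseable absolute URL
  }
}

// Assumed rule: accept a bare file name only, so a payload can never
// point outside the directory the main process chooses.
function parseOutputFileName(input: unknown): string | null {
  if (typeof input !== "string") return null;
  const looksSafe = /^[A-Za-z0-9][A-Za-z0-9._-]{0,127}$/.test(input);
  return looksSafe && input.indexOf("..") === -1 ? input : null;
}
```

Each validator returns `null` instead of throwing, so a handler can reject a bad payload with a single check and never pass unvalidated data further into the main process.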
This pattern — strict boundary enforcement between trust zones — is the same pattern that makes enterprise systems secure. The scale is different. The principle is identical.
JSON persistence, not a database.
The store is a single JSON file with debounced writes (300ms), an automatic backup before every write, and automatic recovery from corruption. Run outputs are isolated: each is stored as an individual flat file rather than embedded in the main data.
The store isn't clever. It's resilient. A user's data survives crashes, corrupt writes, and unexpected shutdowns because the architecture assumes those things will happen.
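The backup, recovery, and debounce behavior can be sketched as follows. This is an illustration under stated assumptions, not the app's store: `MemoryFiles` is a hypothetical in-memory stand-in for the file system, the `.bak` suffix and the `StoreData` shape are invented, and the choice to back up only a readable file is one possible policy.

```typescript
// Hypothetical in-memory stand-in for the file system, so the sketch runs anywhere.
class MemoryFiles {
  private files = new Map<string, string>();
  read(path: string): string | undefined { return this.files.get(path); }
  write(path: string, contents: string): void { this.files.set(path, contents); }
}

// Assumed data shape, for illustration only.
interface StoreData {
  actions: { id: string; command: string }[];
}

// Returns parsed data only if the text is valid JSON with the expected shape.
function tryParse(raw: string | undefined): StoreData | undefined {
  if (raw === undefined) return undefined;
  try {
    const parsed: unknown = JSON.parse(raw);
    if (typeof parsed === "object" && parsed !== null &&
        Array.isArray((parsed as StoreData).actions)) {
      return parsed as StoreData;
    }
  } catch {
    // corrupt or truncated JSON: fall through
  }
  return undefined;
}

function save(files: MemoryFiles, path: string, data: StoreData): void {
  const current = files.read(path);
  // Assumption: only a readable file is copied, so a corrupt file
  // never replaces a good backup.
  if (current !== undefined && tryParse(current) !== undefined) {
    files.write(path + ".bak", current);
  }
  files.write(path, JSON.stringify(data, null, 2));
}

// Recovery order: primary file, then backup, then empty defaults.
function load(files: MemoryFiles, path: string): StoreData {
  return tryParse(files.read(path)) ?? tryParse(files.read(path + ".bak")) ?? { actions: [] };
}

// Debounce: collapse a burst of save requests into one write after a quiet period.
class DebouncedSaver {
  private timer: ReturnType<typeof setTimeout> | undefined;
  private readonly flush: () => void;
  private readonly delayMs: number;

  constructor(flush: () => void, delayMs = 300) {
    this.flush = flush;
    this.delayMs = delayMs;
  }

  request(): void {
    if (this.timer !== undefined) clearTimeout(this.timer);
    this.timer = setTimeout(() => this.flushNow(), this.delayMs);
  }

  // Writes immediately, for example on quit, so a pending change is not lost.
  flushNow(): void {
    if (this.timer !== undefined) clearTimeout(this.timer);
    this.timer = undefined;
    this.flush();
  }
}
```

If the primary file is truncated mid-write, `load` falls back to the last backup, which costs at most one save's worth of changes.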
Keychain-encrypted secrets.
API keys and tokens encrypted at rest via macOS Keychain through Electron's safeStorage API. Decrypted at runtime, injected into action execution, never in plaintext on disk. The documentation is honest: for highly sensitive credentials, use dedicated vaults. For the 90% case, Keychain integration is the right trade.
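As a rough sketch of "encrypted at rest, decrypted at runtime", the following keeps only ciphertext in the serialized form and decrypts just in time for a single run. The `SecretCipher` interface, the `SecretStore` class, and delivery through environment variables are assumptions. In the app the cipher role would be played by Electron's `safeStorage` (`encryptString` and `decryptString`); the string-reversing cipher here is a test stand-in and provides no security.

```typescript
// Hypothetical cipher abstraction; a real app would wrap an OS-backed API here.
interface SecretCipher {
  encrypt(plainText: string): string;
  decrypt(cipherText: string): string;
}

// Test-only stand-in that reverses the string. This is NOT encryption.
const fakeCipher: SecretCipher = {
  encrypt: (plainText) => plainText.split("").reverse().join(""),
  decrypt: (cipherText) => cipherText.split("").reverse().join(""),
};

class SecretStore {
  // Only ciphertext is held here, and only ciphertext is serialized.
  private readonly encrypted: Record<string, string> = {};
  private readonly cipher: SecretCipher;

  constructor(cipher: SecretCipher) {
    this.cipher = cipher;
  }

  set(name: string, value: string): void {
    this.encrypted[name] = this.cipher.encrypt(value);
  }

  // What would be written to disk.
  serialize(): string {
    return JSON.stringify(this.encrypted);
  }

  // Assumed injection mechanism: decrypt the requested secrets and merge
  // them into the environment for one action run.
  envFor(baseEnv: Record<string, string>, secretNames: string[]): Record<string, string> {
    const env: Record<string, string> = { ...baseEnv };
    for (const name of secretNames) {
      if (!Object.prototype.hasOwnProperty.call(this.encrypted, name)) {
        throw new Error(`Unknown secret: ${name}`);
      }
      env[name] = this.cipher.decrypt(this.encrypted[name]);
    }
    return env;
  }
}
```

The decrypted value exists only in the returned object, which would be handed to the child process and then discarded.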
Feature Design
Each feature represents a product design decision.
Leader key hotkeys
Global shortcuts pollute the system hotkey space and conflict at scale. Actions uses a Vim-inspired leader key — one activation shortcut enters a three-second listening mode with a HUD on all monitors. Press a single key to fire. Effectively unlimited hotkeys using exactly one global shortcut.
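The leader-key flow can be modeled as a small state machine. This is a sketch, not the app's implementation: the `LeaderKey` class and its method names are hypothetical, time is passed in explicitly so the logic is easy to test, and the rule that any key press ends listening mode is an assumption.

```typescript
// Three-second listening window, as described above.
const LISTEN_WINDOW_MS = 3000;

class LeaderKey {
  private listeningUntil = 0;
  private readonly bindings: Record<string, string>; // single key -> action id

  constructor(bindings: Record<string, string>) {
    this.bindings = bindings;
  }

  // Called when the single global activation shortcut fires.
  activate(nowMs: number): void {
    this.listeningUntil = nowMs + LISTEN_WINDOW_MS;
  }

  // The HUD would be shown while this returns true.
  isListening(nowMs: number): boolean {
    return nowMs < this.listeningUntil;
  }

  // Returns the action id to run, or null if nothing should fire.
  // Assumption: any key press, bound or not, ends listening mode.
  handleKey(key: string, nowMs: number): string | null {
    if (!this.isListening(nowMs)) return null;
    this.listeningUntil = 0;
    const isBound = Object.prototype.hasOwnProperty.call(this.bindings, key);
    return isBound ? this.bindings[key] : null;
  }
}
```

Only `activate` needs to be wired to a system-wide shortcut. Every other key is read while the app is already listening, which is why the number of bindings does not add to the global hotkey space.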
Variables and parameterization
Local variables per action with preset values. Global variables shared across actions. Runtime prompts with last-used value persistence. Auto-run values for non-interactive triggers. The variable system is what separates "I saved a command" from "I parameterized a workflow."
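One way the variable scopes could resolve into a runnable command is sketched below. The `{{name}}` placeholder syntax, the precedence order (runtime value, then local preset, then global), and all names are assumptions; the text does not specify them.

```typescript
interface VariableSources {
  globals: Record<string, string>; // shared across actions
  locals: Record<string, string>;  // per-action preset values
  runtime: Record<string, string>; // prompt answers or auto-run values
}

// Assumed precedence: runtime > local > global.
function resolveVariable(name: string, sources: VariableSources): string | undefined {
  for (const scope of [sources.runtime, sources.locals, sources.globals]) {
    if (Object.prototype.hasOwnProperty.call(scope, name)) return scope[name];
  }
  return undefined;
}

// Replaces every {{name}} placeholder and reports the names that could not be
// resolved, so a caller can prompt for them or refuse a non-interactive run.
function renderCommand(
  template: string,
  sources: VariableSources
): { command: string; missing: string[] } {
  const missing: string[] = [];
  const command = template.replace(
    /\{\{\s*([A-Za-z_][A-Za-z0-9_]*)\s*\}\}/g,
    (placeholder: string, name: string) => {
      const value = resolveVariable(name, sources);
      if (value === undefined) {
        if (missing.indexOf(name) === -1) missing.push(name);
        return placeholder; // leave unresolved placeholders visible
      }
      return value;
    }
  );
  return { command, missing };
}
```

For example, with `user` defined globally and `host` overridden locally, `ssh {{user}}@{{host}} -p {{port}}` renders with both filled in and reports `port` as missing. Values are substituted verbatim here; shell quoting is outside the scope of this sketch.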
Floating popout buttons
Any action pops out as a persistent always-on-top floating button — visible even when the app is hidden. Click to run. Position persists across restarts. This eliminates the fundamental UX limitation of menu bar apps: having to open the app to do anything.
Auto-close on success
Output windows auto-close after a configurable delay when the action succeeds. User interaction cancels the countdown. The design philosophy: get out of your way when things work, stay visible when they don't.
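The auto-close rule reduces to a decision plus a cancellable countdown. The sketch below assumes that success means exit code 0 and uses hypothetical names (`decideAutoClose`, `CloseCountdown`); it is not taken from the app.

```typescript
interface AutoCloseSettings {
  enabled: boolean;
  delayMs: number; // the configurable delay
}

type AutoCloseDecision = { close: false } | { close: true; afterMs: number };

// Assumption: an action succeeded if its process exited with code 0.
function decideAutoClose(exitCode: number, settings: AutoCloseSettings): AutoCloseDecision {
  if (!settings.enabled || exitCode !== 0) return { close: false };
  return { close: true, afterMs: settings.delayMs };
}

class CloseCountdown {
  private timer: ReturnType<typeof setTimeout> | undefined;

  start(afterMs: number, close: () => void): void {
    this.cancel();
    this.timer = setTimeout(() => {
      this.timer = undefined;
      close();
    }, afterMs);
  }

  // Call on any user interaction with the output window.
  cancel(): void {
    if (this.timer !== undefined) clearTimeout(this.timer);
    this.timer = undefined;
  }

  isPending(): boolean {
    return this.timer !== undefined;
  }
}
```

A failed run never starts the countdown, so the window stays open until the user dismisses it.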
The Process
AI governance at scale.
The CHANGELOG tells the real story. 24 beta releases shipped in under three weeks, most within days of each other and many on the same day. That velocity didn't come from cutting corners. It came from the documentation-first methodology, which kept each feature cycle fast and correct.
A typical feature cycle: update ROADMAP.md with the spec — behavior, data model changes, UI, edge cases. Update ARCHITECTURE.md if patterns change. Implement with Claude Code against the spec. Review. Test. Ship.
The governance documents aren't overhead. They're what makes the speed possible.
ROADMAP.md: the spec the AI agent builds against. Not a wish list, a contract.
ARCHITECTURE.md: the constraints the agent operates within. Patterns, structures, boundaries.
CLAUDE.md: working memory that survives context window compaction. Gotchas, conventions, lessons.
CHANGELOG: the audit trail. Every release documented, every decision traceable.
Migrations Under Governance
Sustained development means the architecture evolves. The governance system manages it.
Modal window migration
Vanilla JavaScript modals migrated to React-based modal-apps — each with its own Vite entry point, component tree, and shared infrastructure. A controlled migration that kept the app shippable throughout.
Output storage refactoring
Action outputs moved from the main data file to individual flat files after large outputs caused performance degradation. Automatic migration on first launch, orphan cleanup for deleted actions. The data model changed; the user experience didn't.
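A first-launch migration of this kind might look like the sketch below. The data shapes, the `outputs/<id>.log` naming scheme, and the use of a plain object as a stand-in for the output directory are assumptions for illustration.

```typescript
// Assumed legacy shape: output embedded in each action record.
interface StoredAction {
  id: string;
  command: string;
  output?: string;
}

interface StoreFile {
  actions: StoredAction[];
}

// Hypothetical naming scheme for per-action output files.
function outputPathFor(actionId: string): string {
  return "outputs/" + actionId + ".log";
}

// `outputFiles` maps path -> contents and stands in for the output directory.
// Returns the slimmed-down main data; `outputFiles` is updated in place.
function migrateOutputs(data: StoreFile, outputFiles: Record<string, string>): StoreFile {
  const livePaths = new Set<string>();

  const actions = data.actions.map((action) => {
    const path = outputPathFor(action.id);
    livePaths.add(path);
    const { output, ...rest } = action;
    // Move embedded output into its own file.
    if (output !== undefined) outputFiles[path] = output;
    return rest;
  });

  // Orphan cleanup: drop output files whose action no longer exists.
  for (const path of Object.keys(outputFiles)) {
    if (!livePaths.has(path)) delete outputFiles[path];
  }

  return { actions };
}
```

The function is idempotent: running it again on already-migrated data moves nothing and leaves the existing output files in place, which makes it safe to call on every launch.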
Dead code cleanup
After the modal migration, ~1,500 lines of dead code remained — orphaned components, unused state, dead CSS. A single focused cleanup pass removed it all. The ROADMAP tracked it. Exhaustive code review confirmed nothing was missed.
Security as Practice
Security wasn't a phase. It was continuous.
Each security change was a deliberate decision documented in the CHANGELOG, not a response to an incident. The governance system surfaces these concerns because the architecture documents force you to think about trust boundaries before you write the code.
The Business
Product management, not just engineering.
Free tier gets the core experience — saving and running commands, unlimited. Pro ($12 one-time) unlocks power features: floating popout buttons, leader key hotkeys, variables, encrypted secrets, scheduling, tray menu pinning.
The boundary is deliberate. Pro features indicate serious, sustained usage. One-time purchase, not subscription — positioned as "less than a month of Raycast Pro."
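The free/Pro boundary can be expressed as a single gate function. This is a sketch with hypothetical identifiers: the feature keys are invented labels for the features listed above, and `betaUnlocksPro` is an assumed flag for a build that ships with Pro features unlocked.

```typescript
// Invented keys for the Pro features named in the text.
const PRO_FEATURES = [
  "floating-popouts",
  "leader-key",
  "variables",
  "encrypted-secrets",
  "scheduling",
  "tray-pinning",
] as const;

type ProFeature = (typeof PRO_FEATURES)[number];
type Feature = "save-commands" | "run-commands" | ProFeature;

interface LicenseState {
  hasPro: boolean;         // one-time purchase recorded
  betaUnlocksPro: boolean; // assumed flag for builds with Pro unlocked
}

function isProFeature(feature: Feature): feature is ProFeature {
  return (PRO_FEATURES as readonly string[]).indexOf(feature) !== -1;
}

// Core features are always available; Pro features need a license
// or a build that unlocks them.
function canUse(feature: Feature, license: LicenseState): boolean {
  if (!isProFeature(feature)) return true;
  return license.hasPro || license.betaUnlocksPro;
}
```

Keeping the gate in one function means the boundary can move, for example when a beta flag is switched off, without touching the features themselves.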
Hindsight
What I'd do differently.
Separate the store architecture earlier.
The single-file JSON store works, but the output storage refactoring proved that different data types need different persistence strategies. If starting over, I'd design for multiple storage files from day one — settings, actions, outputs, history — each with its own write cadence and backup strategy.
Automated testing before the modal migration.
The migration from vanilla JS to React modals was managed through manual testing and code review. It worked, but automated tests would have caught edge cases faster. The governance documents reduced risk significantly, but tests would have reduced it further.
Build the licensing system before shipping betas.
The current beta has Pro features unlocked. Building licensing last means early adopters experience the full product and then see features get gated — a worse experience than starting gated and unlocking.
The Design Insight
The methodology is the product.
The conventional narrative about AI-assisted development is speed: "I built this in a weekend with AI." Speed is a byproduct, not the point. The point is governance.
Actions wasn't built fast because AI writes code fast. It was built fast because the governance system — ROADMAP.md defining what to build, ARCHITECTURE.md defining how, CLAUDE.md defining how to work — eliminated the waste that normally slows development down. No re-reading the codebase. No inconsistent choices across sessions. No "the AI did something weird and I'm not sure why."
24 releases. Architecture migrations. Security audits. Refactoring passes. All managed by one person with an AI agent, governed by documents that keep the agent productive across sessions. That's not "I used AI to code faster." That's a development methodology.