Notes from building Vehla.
Less "AI is the future." More of what we learned while actually shipping.
Why we made local-first the default — and what it cost us.
Cloud inference is faster, cheaper to ship, and easier to support. We made the harder choice. Here's the math, the missteps, and the moment we knew it was right.
Notes on Gemma 4 E4B at 4-bit on Apple Silicon.
How Vehla thinks about Gemma 4 models, default local inference, and the trade-offs between speed, quality, and download size.
10 rewrite tones, 1 prompt skeleton.
The structure behind every Rewrite — Professional, Casual, Confident, and the rest. What works, what flatters the model, and what we threw away.
Designing a command palette you actually use.
Spotlight, Alfred, Raycast, Linear: every great palette gets the same five details right. We rebuilt ours three times to get there.
What shipping v1.0 actually felt like.
What shipping a menu-bar AI assistant taught us about shortcuts, permissions, local models, and support.
We removed telemetry.
What we learned about trust as a feature, and why Vehla does not collect prompts, outputs, or product analytics.