The first version of Vehla had 10 distinct system prompts for the rewrite presets. Each one was a paragraph. Each one was slightly different. Each one rotted at its own pace as models shifted. We threw them all out and rebuilt around a single skeleton plus a tone block. Quality went up and maintenance went to zero.
The skeleton
You are Vehla, a precise on-device writing assistant.
Rewrite the user's text in the following voice: {TONE_BLOCK}.
Hard rules:
- Preserve the original meaning and concrete facts.
- Match the source language unless asked otherwise.
- Output only the rewritten text. No preamble. No "Here's your rewrite". No quotation marks.
- Same approximate length unless tone implies shorter.
- Keep code blocks, URLs, and proper nouns verbatim.
Text to rewrite:
{SELECTION}
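Filling the skeleton is plain string substitution. Here is a minimal Python sketch, assuming the two placeholders are replaced literally; `SKELETON` and `build_prompt` are illustrative names, not Vehla's actual code. It uses `str.replace` rather than `str.format` so that braces inside the selected text (in a code block, for example) are left alone.

```python
SKELETON = """You are Vehla, a precise on-device writing assistant.
Rewrite the user's text in the following voice: {TONE_BLOCK}.
Hard rules:
- Preserve the original meaning and concrete facts.
- Match the source language unless asked otherwise.
- Output only the rewritten text. No preamble. No "Here's your rewrite". No quotation marks.
- Same approximate length unless tone implies shorter.
- Keep code blocks, URLs, and proper nouns verbatim.
Text to rewrite:
{SELECTION}"""


def build_prompt(tone_block: str, selection: str) -> str:
    """Fill the skeleton with one tone block and the user's selected text."""
    # The skeleton already ends the voice line with a period, so drop a
    # trailing period from the tone block to avoid "..".
    tone = tone_block.strip().rstrip(".")
    # Substitute the tone first and the selection last, so placeholder-like
    # text inside the user's selection is never re-scanned.
    prompt = SKELETON.replace("{TONE_BLOCK}", tone)
    return prompt.replace("{SELECTION}", selection)
```

Because the selection is inserted last and verbatim, whatever the user highlighted reaches the model unchanged.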
The tone blocks
Each preset is a 20-40 word description of the voice plus two or three examples of phrasing it would or wouldn't use. Example for "Confident":
Direct, assured, no hedging. Drop "I think", "maybe", "I just wanted to". State the position. Use complete sentences and active voice. Examples of phrases to avoid: "in my opinion", "kind of", "perhaps we could".
The trick we learned from a hundred A/B comparisons: negative examples beat positive examples. Telling the model what not to write produces sharper output than telling it what to write. "Don't use 'leverage'" works better than "Use 'use'".
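One way to keep tone blocks uniform is to store each preset as a short description plus a list of phrases to avoid, and render the block from that. The sketch below is an assumption about structure, not Vehla's actual storage format; `TonePreset` and `render_tone_block` are hypothetical names, and the sample data is the "Confident" block quoted above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TonePreset:
    name: str
    description: str              # positive description of the voice
    avoid: tuple[str, ...] = ()   # negative examples: phrases the model should not use


def render_tone_block(preset: TonePreset) -> str:
    """Join the description and the negative examples into one tone block."""
    block = preset.description.strip()
    if preset.avoid:
        quoted = ", ".join(f'"{phrase}"' for phrase in preset.avoid)
        block += f" Examples of phrases to avoid: {quoted}."
    return block


CONFIDENT = TonePreset(
    name="Confident",
    description=(
        'Direct, assured, no hedging. Drop "I think", "maybe", '
        '"I just wanted to". State the position. '
        "Use complete sentences and active voice."
    ),
    avoid=("in my opinion", "kind of", "perhaps we could"),
)
```

Keeping the avoid list as its own field makes it easy to lengthen the negative examples for a preset without touching its description.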
Things we threw away
- Few-shot examples. They pulled the model too hard toward the topic of the example.
- "You are a world-class writer." No effect on Gemma 4. Mild improvement on weaker models. Removed for consistency.
- Markdown formatting requests in the system prompt. Formatting is a per-action concern, so we let actions request markdown explicitly.
- Per-preset temperature tuning. We use 0.7 for all rewrite presets. Lower made Confident too robotic. Higher made Casual too unhinged.
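The last two items come down to a small piece of shared configuration. Here is a sketch, assuming a hypothetical `Action` record and `generation_options` helper (neither comes from Vehla's code); the wording of the markdown instruction is also invented for illustration. Every rewrite preset gets the same temperature, and markdown is requested only by actions that opt in.

```python
from dataclasses import dataclass

REWRITE_TEMPERATURE = 0.7  # one value shared by every rewrite preset


@dataclass(frozen=True)
class Action:
    name: str
    wants_markdown: bool = False  # decided per action, not in the system prompt


def generation_options(action: Action) -> dict:
    """Return sampling options and extra instructions for one action."""
    extra = []
    if action.wants_markdown:
        # Illustrative wording; the real instruction is not given in the text.
        extra.append("Format the output as markdown.")
    return {"temperature": REWRITE_TEMPERATURE, "extra_instructions": extra}
```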
What still surprises us
Gen Z is the hardest preset to tune. Every model wants to write it like a marketing email pretending to be Gen Z. We ended up using a longer negative list ("don't say slay, don't say no cap, don't use 'literally' as an intensifier") and a shorter positive description, and it works.