← Writing

AI in practice9 min read

What I learnt about Code-First Design, from using Claude Design

I've been using Claude Design to understand where code-first design fits into modern software development.

Claude Design made me realise code-first design is not just about faster prototypes, but about where the hidden design judgment lives.

The prompts behind the prompts

After using it for a bit, I notice that Claude Design can consistently ship polished functional prototypes from the very first generation. This kind of consistent polish is very suspicious, and it’s not what you would expect from a single-shot user prompt.

Claude Design company intranet prototype
You can try the live prototype that was built with Claude Design here: Acme Intranet Prototype

The more I used it, the more I could see the prompt engineering underneath. For example, the tweaks panel (those sliders that let you adjust spacing, colour, and layout) was likely a preset component, steered by internal prompts that shape how the model structures its output.

Claude Design tweaks panel with red accent settings
For example, switching the color accent to red, and hiding certain widgets

When I moused over certain UI elements of Claude Design, I caught fragments of the prompting. Claude Design controls how the model outputs file content in a particular structure. Not just what to generate, but how to organise the generation itself.

Claude Design prompt fragment tooltip

Claude Design generated prototype and prompt panel

When experimenting with Claude Design, I mainly relied on Claude Opus 4.7, one of the strongest models for frontend visual generation right now. But Claude Design's quality isn't just Opus 4.7 being good at design. It's the harness around the model: the system prompts, the UI scaffolding, the preset components, the output structure, the rules about what to generate and how to format it. Everything the user doesn't see that sits between their request and the model's response.

In other words, the harness is doing a lot of the work as well.

But here’s when the dance stops

There's a practical problem, though. Claude Design is in research preview, and the usage limits are very harsh.

Claude Design usage limit message
I can feel your pain 😥

That was my experience too. Within 2 hours, I’ve used up all my usage for the entire week. Yes, I only had a $20 USD Claude plan. Sure, I was using Claude Opus 4.7, the most powerful model. But that doesn’t change how painful these limits feel.

If this were a client project with a deadline approaching, we can’t be having people being blocked by usage limits, right? Even if we use the $200 plan, what happens if they users get stuck after say, 2 days?

So, that pushed me to look at alternatives.

Open source to the rescue… or not?

Open Design is an open-source competitor to Claude Design. It's local-first, bring-your-own-AI alternative that runs on whatever coding agent you already use: Claude Code, Codex, Gemini CLI, and others. It ships with its own design systems, and prompt engineering (e.g. anti-AI-slop checklist).

Open Design interface
Open Design looks awfully similar to Claude Design 🤫

So, I tried it with the same starting prompt I'd used with Claude Design, but with other models (Chinese model GLM 4.6, and OpenAI’s GPT 5.5).

Open Design output with GLM 4.6
Open Design + GLM 4.6

Open Design output with GPT 5.5
Open Design + GPT 5.5

As you can see, the results were poor, compared to Claude Design.

Come on, that’s not a fair test

Yes I know, I changed two variables at once: the harness and the model. A proper experiment would only isolate one.

But yet, the designer or developer sitting with a deadline isn't going to run controlled experiments. Most won't want to swap models or compare outputs across model providers. They want to describe what they need and get something usable back.

We already know from early adopters that Claude Opus 4.7 and Gemini 3.1 Pro are ahead of the pack for frontend visuals generation. Leaning on that strength is a reasonable default. But even a strong model needs good direction, and direction is where the next piece comes in.

And hence, my search for “skills”

In the context of AI coding agents, a skill is a set of structured, pre-written prompts that load into the model's context before you say anything. Think of it as a briefing packet: instead of explaining what you want from scratch every time, the skill hands the model a playbook the AI model can reference. Skills can be scoped to a single task or cover an entire discipline.

If the AI model is the engine, and the harness is the car frame, then skills are the driver's manual taped to the dashboard. The engine has the power. The frame holds it together. But the manual is what tells the engine where to go and what to avoid.

That search led me to Impeccable.

Impeccable

Impeccable, created by Paul Bakaus, is a design skill for AI coding agents. It ships with a library of design reference files covering areas like typography, colour, motion, spatial design, interaction patterns, responsive behaviour, and UX writing. The agent picks which references to load based on the task at hand. Before you type a single instruction, the model has already been briefed on how professional design actually works.

Impeccable skill writing example
A small snippet into their very extensive skill writing

I looked into the prompt writing behind Impeccable, and it was impeccably thorough. Besides being really opinionated about many aspects of design, its additional value is in what it forbids. It catalogues 27 anti-patterns, the lazy defaults many model falls into: purple gradients, nested cards, low-contrast labels, etc. It names them and steers the AI to avoid them.

Impeccable website design example
An example from the Impeccable website

I would imagine that most layman will prompt with simple language like "make it look good and modern", because they don't have terms like tinted neutrals, vertical rhythm, or fluid type scale with optical sizing. Impeccable closes that vocabulary gap. The design capability was already in the model. The prompts were just too vague to reach it.

I tested this by pointing Claude Code and Codex at my personal site with Impeccable loaded as a skill. After several rounds of direction and iteration, the results were strong. Definitely not a single-prompt kind of magic, but still amazing after several back-and-forth.

Personal site before redesign
Before

Personal site after redesign
After

This unlocks a new path forward, by pairing AI coding tools (Claude Code or Codex) with agents skills like Impeccable for design work. That's code-based design without the lock-in to Claude Design. Of course, you won’t get some of niceties of Claude Design (e.g. being able to draw & annotate on the website), but at least there’s a way forward.

Claude Design's prompts are still gold though

I still think Claude Design produces the best output I've seen. So, I tried to find out why from the inside.

When I attempted to "distill" their prompts (to get the model to reveal the system instructions steering its output), I was blocked. The system refused to show the raw prompts, but it offered to paraphrase them, which I gladly accepted.

Claude Design prompt files list
Oh my gosh the number of files

Look at the amount of prompt / skill files there are! Different prompts for typography, component hierarchies, colour palettes, file formatting. And this is just a small portion, because it stopped halfway due to, you guessed it, usage limits. 🤦

That's Anthropic's real advantage with Claude Design. The accumulated prompt engineering sitting between user and model, invisible to most users, doing most of the work. While the Opus 4.7 model matters (a lot), the very capable output I saw was also due to all these harness-related prompts and pre-made components. The harness is what separates Claude Design from say, Claude Code.

So yes, they both matter

The prompts (Claude Design's internal system, Impeccable's vocabulary layer) and the model (Opus 4.7's raw capability) each contribute to great design. You can't get Claude Design-level polish from a weaker model, even with great prompts. And you can't get it from Opus 4.7 alone, without the harness directing it.

The harness shapes what the model attempts. The model determines what it can execute. And the human in the loop, with their taste and vision, helps to shape the output.

Is code-first design really it though?

Code-first design is one piece of the emerging AI Development Life Cycle (AI-DLC). Because LLMs make code cheap, you can design by generating real, functional interfaces instead of static mockups. Claude Design embodies this. Open Design tries to democratise it. Impeccable provides the vocabulary that makes output professional instead of generic.

But is code-first design really that much better than Figma? In raw speed, probably not. A skilled designer can mock up screens and wire a clickable prototype in an afternoon too.

The difference is in what the prototype can do. A Figma prototype is a series of connected screenshots. It can show layout and flow, but it can't easily show what happens when a form submits, a spinner appears for 1.2 seconds, and real data that populates a table. It can't simulate a dashboard handling 3 items versus 3,000. It can't save your input and load it back. A code-first prototype can.

So, the real value of code-first design isn't speed. It's that the prototype is more real.

Speed and stability

But "more real" brings its own problems. Tuhin Nair describes two loops in any business: one for speed and uncertainty reduction (ship, learn, iterate) and one for stability and complexity management (keep things working, keep them understandable, so payment keeps coming in). Code-first design tools accelerate the first loop dramatically. They do nothing for the second. A working prototype generated in an afternoon still needs to be understood, maintained, and evolved by real people for service stability.

Two feedback loops diagram by Tuhin Nair
The two feedback loops as illustrated by Tuhin Nair

The goal isn't to pick one loop. We need to amplify both. Experience-focused people (designers, product managers, founders) should move fast and test ideas with real, working prototypes. Foundation-focused people (senior engineers, architects) should slow down and make deliberate decisions about what to keep, refactor, or discard.

And there's a parallel worth noticing. In Claude Design, carefully written prompts and pre-made components provide the foundation for great output. The user moves fast on top of a foundation that was built with care.

Perhaps a vibe-coded prototype can serve the same role for a stable system: not the final product, but a real, interactive spec that engineers (and their AI agents) can inspect, pull apart, and rebuild properly. The prototype captures intent in a way a static mockup never could. The stability work that follows is better for having started from something real.

The tools will keep improving. The speed ceiling will keep rising. But the need for stability hasn't changed, and that's the gap where the real work of AI-DLC lives.