Blog

Articles about AI engineering, software architecture, and building AI-powered products.

A molecular harness for Claude Code: how I run large AI projects

Apr 19, 2026

A personal set of rules and artifacts that turns Claude Code from a "fast kid who breaks things" into a partner you can run multi-month projects with, without losing control. Seven non-negotiable rules, three decision paths, continuity between sessions, all open source on GitHub.

harness · claude-code · working-mode
A spec-driven guide to configuring Open-Claw with Claude Code

Apr 1, 2026

# I went grocery shopping. I came back to a running AI server.

*Michał Madejski, 2026-04-14*

---

## TL;DR

I gave Claude Code one sentence, left the house for 50 minutes, and came back to a fully installed AI agent, a hardened security configuration, server documentation, an SSH tunnel to the dashboard, and a Raspberry Pi deployment plan.

This is not an article about AI being magic. It's an article about **how to think about AI as an engineering partner**, and what actually happens under the hood when you give it free rein.

---

## The adventure log

I had an evening, an idle server, and a long-postponed idea to run OpenClaw on it: a self-hosted AI agent. I had a rough idea of what I wanted, but no patience for hours of configuration.

It started trivially. The very first thing I asked Claude was (in Polish, because that's what came naturally):

> *"what do i type on linux to get the address for ssh connection"*

`hostname -I`. That's it.

But instead of taking the answer and continuing on my own, I left Claude at the keyboard. I told it to connect, document what works and what doesn't, and build me a repository that would be the *single source of truth* for the whole project.

And then, once I had a sense it knew what it was doing:

> *"ok, going grocery shopping for 50 min, when i come back i want openclaw
> running as much as possible and telegram configured + raspberry pi planned +
> dashboard on my mac. push through, don't ask me anything, don't blow up
> my house"*

And I left.

Fifty minutes later I came back. Waiting on the screen: a HANDOFF.md document with a checklist of what to verify on return, a working OpenClaw dashboard in the browser through an SSH tunnel, and zero critical findings in the security audit.

I'm sitting here now trying to describe what actually happened, because that's more interesting than the outcome.


---

## The prompt that made it work, and why it worked

First, an important point: that one-sentence grocery-run prompt **was not the first prompt**. It was preceded by 20–30 minutes of conversation in which:

1. I described the project goal and the server context
2. I laid out the philosophy of how I wanted the project organized ("if it isn't a script, it doesn't exist")
3. I gave Claude SSH access and let it independently run a recon of the server

That matters. **An autonomous prompt only works well if the model has built up context beforehand.** "Go grocery shopping" worked because Claude already knew:

- what the server looked like (hardware, OS, what was installed)
- what my security priorities were (I have a CLAUDE.md with a threat model)
- how I wanted the code organized (IaC-lite, idempotent bash scripts)
- what was in scope and what wasn't (no external skills without code review, don't touch LocalAI)

If I had said "install an AI agent on my server, don't ask questions" without that context, I would have gotten something, just not what I wanted.

**The idealized version of that prompt**, if I were to do it from scratch, would look like this:

```
You have 45 minutes of autonomous work.

End goal:
- OpenClaw installed on bluedemon and running as a systemd service
- Hardening: exec.ask=always, deny-by-default, messaging profile
- Dashboard accessible via SSH tunnel on the Mac (don't expose the port publicly)
- Telegram bot and Raspberry Pi plan in docs/ and scripts/, even if you can't run them without me
- HANDOFF.md I can read when I get back

Constraints (do not cross):
- Don't touch LocalAI (4 crashed Docker containers, separate topic)
- Don't install any external skills
- If something is unclear, pick the better of two options and document why. Don't ask me.

When done, stop and wait.
```

The difference: **explicit constraints** and **an instruction for handling ambiguity** (decide yourself, don't escalate).
That's probably the most important element of a good autonomous prompt: tell the model how to handle edge cases, not just what to do on the happy path.

---

## Before anything was installed: the snapshot and folder philosophy

Before Claude touched the installation, I asked for something that doesn't normally happen before "running the tutorial": **a full reconnaissance of the existing state**.

The result: a script `01-recon.sh` that collected a server snapshot over SSH (hardware, processes, open ports, installed packages, Docker state, filesystem layout) and saved everything as plain text files in the `snapshot/` directory. Read-only. Zero changes on the server. Pure observation.

```
open-claw/
├── CLAUDE.md                  ← my original briefing (untouched)
├── docs/                      ← prose: what, why, troubleshooting
│   ├── 00-ssh-setup.md        ← how Claude connected
│   ├── 01-server-baseline.md  ← snapshot analysis
│   ├── 02-openclaw-install.md
│   └── HANDOFF.md             ← what to check on return
├── snapshot/                  ← captured server state (in git)
│   ├── hw-os.txt
│   ├── gpu.txt
│   ├── network.txt
│   ├── services.txt
│   └── INDEX.md
└── scripts/bootstrap/         ← idempotent bash scripts
    ├── 01-recon.sh
    ├── 02-install-openclaw.sh
    ├── 03-harden.sh
    └── 04-tunnel-dashboard.sh
```

The philosophy I imposed at the start: **if it isn't a script, it doesn't exist**. Every repeatable operation must be a script. Every decision that could be forgotten must be in docs. Nothing we "did by hand and remember."

Claude took this literally. It wrote scripts that are *idempotent*: you can run them multiple times and they always converge to the same state. That's not obvious when you're writing bash for the first time. Idempotency requires thinking "what if this is already installed?" at every step.

Example from `02-install-openclaw.sh`:

```bash
# Instead of simply:
npm install -g openclaw

# The script checks first:
if openclaw --version 2>/dev/null | grep -q "2026"; then
    ok "openclaw already installed, skipping"
else
    npm install -g openclaw
fi
```

The second key principle: **the snapshot goes into git, but the scripts scrub secrets from it before writing**. API tokens, passwords, keys: never in `snapshot/`. Values are replaced with `[REDACTED]`. This means the server state is documented historically (git blame shows how it evolved), while the repository stays safe to share.

The effect: by the time I left for the grocery store, Claude had a **map of the terrain** before changing anything. It knew port 18789 was free, that LocalAI wasn't running (even though the briefing said otherwise), that nvm wasn't installed, that linger was disabled. It wasn't guessing; it was reading data.

---

## Three technically interesting discoveries

### 1. The documentation was lying, and the model caught it

I have a `CLAUDE.md` file in the project, a "briefing" for the model that describes the architecture, security posture, and technology stack. One line said: *"LocalAI is already running on this server and using the GPU."*

Not true. The server recon via SSH showed 4 Docker containers in `Exited` or `Created` state, never started. GPU idle. No model was loading.

Claude didn't challenge CLAUDE.md verbally. Instead, it wrote in the docs:

> *"CLAUDE.md states that LocalAI is running, but the snapshot says otherwise.
> My recommendation: skip the provider during onboarding; LocalAI debugging
> is a separate topic."*

And moved on. **A good model doesn't update its world-view just because something is written in the context; it verifies against observation.** That's exactly the property you want in an autonomous agent.

---

### 2. Schema drift: how to discover an API without documentation

OpenClaw has a configuration file with thousands of possible keys.
My CLAUDE.md contained configuration paths from a previous version of the software:

```
# old (CLAUDE.md):
agents.defaults.exec.ask = always

# current (v2026.4.14):
tools.exec.ask = always
```

The problem: the installed version was 2026.4.14, while the briefing referenced an older schema. Classic *schema drift*: documentation lagging behind the software.

How did Claude catch it? It used `--dry-run`:

```bash
openclaw config set --batch-json --dry-run '{"agents.defaults.exec.ask": "always"}'
# → Error: Unrecognized key: exec (in agents.defaults)
```

Then it iteratively probed the schema:

```bash
openclaw config schema | grep -i exec
# → tools.exec.ask, tools.exec.security, ...
```

The methodology: **dry-run before writing. Instead of reading 48,000 lines of schema, grep and probe.** This is what any engineer does instinctively with an unfamiliar API, but it's non-obvious that a model will do the same without being told.

---

### 3. How to pass a password safely through SSH to a script

The problem: I want a bash script to run `sudo` commands on a remote server over SSH. `sudo` requires a password. I don't want the password to appear:

- in process arguments (visible in `ps aux`)
- in a file (persists on disk)
- in logs

The solution Claude proposed was to keep the password in a shell variable and hand it to `sudo -S` on stdin. In a form that keeps the password out of every command line, that looks like:

```bash
# Plain shell variable (not exported, so child processes don't inherit it)
REMOTE_SUDO_PW='...'

sudo_remote() {
    # printf is a shell builtin: the password goes down the pipe to ssh's
    # stdin and never becomes part of any command line
    printf '%s\n' "$REMOTE_SUDO_PW" | ssh bluedemon "sudo -S $*"
}
```

Why this meets the three constraints:

- `sudo -S` reads the password from stdin
- `printf` is a builtin and `ssh` forwards its stdin to the remote command, so the password doesn't appear in `/proc/<pid>/cmdline` on either machine. Interpolating it into the remote command string (`ssh bluedemon "echo '...' | sudo -S ..."`) would put it in the argument list of the local `ssh` process and of the remote shell
- The variable isn't exported, so no subprocess inherits it

It's not perfect (the password lives in process memory, `set -x` would expose it), but it's the right trade-off for a script that runs locally over SSH and never lands in git.

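
One way to check where the password actually travels is to swap `ssh` for a stub that records its arguments and its stdin. The sketch below is hypothetical throughout: `fake_ssh`, `SSH_CMD`, `sudo_remote_stdin` and the sample password are illustrative names, not part of the session's scripts. It feeds the password on stdin through the `printf` builtin and confirms that it never shows up in the argument list.

```bash
# Hypothetical self-check: does the password travel in argv or on stdin?
REMOTE_SUDO_PW='s3cret-example'   # sample value; deliberately not exported

tmp="$(mktemp -d)"
trap 'rm -rf "$tmp"' EXIT

# Stub standing in for the real ssh client: records its arguments and stdin.
fake_ssh() {
    printf '%s\n' "$*" > "$tmp/args"
    cat > "$tmp/stdin"
}

SSH_CMD=fake_ssh   # set to "ssh" for real use

sudo_remote_stdin() {
    # The password is written to the pipe by a builtin, so it is never an
    # argument of an external process; only the sudo command line is.
    printf '%s\n' "$REMOTE_SUDO_PW" | "$SSH_CMD" bluedemon "sudo -S $*"
}

sudo_remote_stdin whoami

args="$(cat "$tmp/args")"
stdin="$(cat "$tmp/stdin")"

case "$args" in
    *"$REMOTE_SUDO_PW"*) echo "LEAK: password found in ssh arguments" ;;
    *)                   echo "ok: arguments were: $args" ;;
esac
[ "$stdin" = "$REMOTE_SUDO_PW" ] && echo "ok: password arrived on stdin"
```

Running the same stub against a helper that interpolates the password into the remote command string would trip the `LEAK` branch, which makes this a cheap regression check to keep next to the bootstrap scripts.
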

---

## What this says about working with autonomous models

Three observations I'm taking away from this session:

**Context matters more than the prompt.** The quality of autonomous work is proportional to the quality of context you invested before. CLAUDE.md, the server recon, clear rules: all of that was an upfront investment that paid back in those 50 minutes.

**The model should handle ambiguity, not escalate it.** The best instruction in autonomous mode is "if something is unclear, pick the better option and justify it." Escalating to the user is expensive and breaks autonomy. A good model knows when to decide on its own.

**Documentation as a first-class deliverable.** The most valuable output of this session isn't the running OpenClaw; it's HANDOFF.md and the docs/ directory. The deployment itself can be repeated from a script. The knowledge of *why* something was done a particular way is what normally disappears.

---

## What's next

The system is running. I have a dashboard, a hardened configuration, and zero critical findings in the security audit. Next steps are the Telegram bot (waiting on a token from BotFather) and a Raspberry Pi with a "Jarvis" wake-word in the living room. The full architecture is already in the repository, just waiting on hardware.

I'm maintaining the infrastructure repository with repeatability in mind: every script is idempotent, every decision is documented. If the server dies, reinstalling from scratch should take as long as it takes to read `docs/` and run the scripts.

If you have questions or want to dig into any of the technical threads, reach out. I'm particularly happy to talk about security models for AI agents and how to design context for autonomous sessions.

---

*Michał Madejski*

*AI Possibilities Lab*

AI · Open-Claw · Automations