A lot of teams have a good instinct for how to approach a web or network test:
- Start with scope and rules of engagement,
- Do some light recon,
- Map the attack surface,
- Then carefully validate anything that looks interesting.
But that knowledge often lives in people’s heads, scattered notes, and half‑remembered scripts. That works fine until you want to:
- Onboard new testers quickly,
- Bring in more automation, or
- Let “agents” (LLM‑driven or otherwise) handle the boring parts safely.
I’ve been working on turning that instinct into something much more explicit, but still private: a small, structured “manual” that humans and tools can both understand.
This post talks about the shape of that manual, not the internals.
The problem: humans improvise, agents need a map
Humans are good at reading between the lines:
- “This is production, so I probably shouldn’t hammer it.”
- “This endpoint smells like business logic; be gentle and think before fuzzing.”
- “We’re clearly off‑scope if we go here.”
Agents don’t have that gut feel.
If you want to safely delegate parts of an engagement, you need a way to encode intent:
- What’s allowed here?
- How “loud” am I allowed to be?
- Which flow should I follow for this kind of target?
- When must I stop and ask a person?
That’s what this manual is trying to solve.
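As a rough illustration, here's a minimal Python sketch of what encoding that intent as data could look like. Every name here (`Mode`, `EngagementIntent`, the example values) is invented for this post; the real manual's format stays private.

```python
from dataclasses import dataclass, field
from enum import Enum


class Mode(Enum):
    """How 'loud' a tester or agent is allowed to be."""
    PASSIVE = "passive"        # observe only; nothing beyond normal traffic
    SAFE = "safe"              # low-rate, read-only probing
    STANDARD = "standard"      # normal testing, still rate-limited
    AGGRESSIVE = "aggressive"  # fuzzing, brute force: lab or sign-off only


@dataclass
class EngagementIntent:
    """Answers the four questions above, per target."""
    in_scope: list[str]                  # hosts / URL patterns that are fair game
    out_of_scope: list[str]              # explicit no-go areas
    max_mode: Mode                       # the loudest allowed behaviour
    workflow: str                        # which flow to follow for this target type
    ask_first: list[str] = field(default_factory=list)  # actions needing a human


intent = EngagementIntent(
    in_scope=["*.staging.example.com"],
    out_of_scope=["billing.example.com"],
    max_mode=Mode.SAFE,
    workflow="external-web",
    ask_first=["credential brute force", "bulk data export"],
)
```

The point isn't the specific fields; it's that the answers live in data a tool can read, instead of in someone's head.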
Environments matter as much as targets
The manual also distinguishes between where you are:
- Production vs staging vs development vs lab.
- External perimeter vs internal network.
- Business‑critical vs low‑risk systems.
The same technique can be completely fine in a lab and totally unacceptable in production. Rather than relying on everyone remembering that ad hoc, each environment profile spells out:
- Default mode,
- Things that are never acceptable,
- Things that require explicit sign‑off.
This is the bridge between “we have a policy doc somewhere” and “the tooling actually behaves the way the policy expects”.
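A hypothetical environment profile can be as small as the sketch below. The structure and the example entries are illustrative only, not the manual's actual schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EnvironmentProfile:
    """What a given environment tolerates by default."""
    name: str
    default_mode: str              # e.g. one of the mode names from the earlier sketch
    never: frozenset[str]          # off the table, full stop
    needs_signoff: frozenset[str]  # allowed, but only with explicit approval


PRODUCTION = EnvironmentProfile(
    name="production",
    default_mode="passive",
    never=frozenset({"destructive fuzzing", "configuration changes"}),
    needs_signoff=frozenset({"authenticated scanning", "load-generating scans"}),
)

LAB = EnvironmentProfile(
    name="lab",
    default_mode="aggressive",
    never=frozenset(),
    needs_signoff=frozenset(),
)
```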
Guardrails and approval points
The riskiest parts of an engagement are usually obvious to a senior tester:
- Brute‑forcing large password or MFA code spaces,
- High‑volume fuzzing,
- Anything involving big data pulls,
- Anything that writes or changes configuration.
The manual bakes those into a tiny list of “must ask first” moments.
When a human is driving, it’s just a reminder. When an agent is driving, it’s a hard stop:
“I’ve reached a potentially dangerous decision. Here’s what I’m thinking and why. Do you approve?”
That one pattern alone goes a long way towards making automated help feel like a safety net instead of a liability.
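In code, that hard stop can be nothing more than a gate that refuses to continue without an explicit "yes". This is a sketch under the same assumptions as the earlier snippets; the function name and prompt are invented for the post.

```python
def request_approval(action: str, reasoning: str, interactive: bool = True) -> bool:
    """The 'must ask first' hard stop.

    A human tester treats the printed text as a reminder; an agent calls
    this and blocks until someone explicitly approves or rejects.
    """
    print(f"Potentially dangerous action: {action}")
    print(f"Reasoning: {reasoning}")
    if not interactive:
        # No human in the loop: default to refusing, never to proceeding.
        return False
    return input("Approve? [y/N] ").strip().lower() == "y"


approved = request_approval(
    action="brute force MFA codes on the staging login",
    reasoning="rate limiting looks absent; want to confirm before reporting",
)
print("proceeding" if approved else "stopping here and logging the refusal")
```

Note the default when nobody is around to answer: refuse, don't proceed.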
Why bother?
None of this replaces experience or judgement. You still need people who understand the business, the stack, and the threat model.
What it does give you is:
- A way to scale that experience across people and tools,
- A language you can share with agents without exposing your internal playbook,
- And a clearer separation between “what we do” and “how a given tool happens to implement it today”.
The manual I’ve been building is private by design, and it will stay that way. But the pattern is general:
- Define modes,
- Define workflows,
- Define environments,
- Define guardrails.
If you get those four right, the rest of your automation – including agent‑style workflows – has something solid to stand on.
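To make the pattern concrete one last time, here's a deliberately tiny, hypothetical composition of those four pieces into a single structure the tooling can load. As before, the field names and values are examples, not the private manual's format.

```python
from dataclasses import dataclass


@dataclass
class Manual:
    """The four definitions, composed into something tooling can load."""
    modes: dict[str, str]            # name -> what that noise level means
    workflows: dict[str, list[str]]  # target type -> ordered steps to follow
    environments: dict[str, str]     # environment name -> default mode
    guardrails: list[str]            # the "must ask first" moments


manual = Manual(
    modes={"safe": "read-only, low-rate probing"},
    workflows={"external-web": ["scope check", "recon", "surface mapping", "validation"]},
    environments={"production": "passive", "lab": "aggressive"},
    guardrails=["credential brute force", "bulk data export", "configuration changes"],
)
```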