LLM-Augmented DFIR-IRIS Case Templates: Embedding AI Prompts Directly in Your IR Reports
In a previous post I released a library of DFIR-IRIS case templates covering common incident types. Those templates give you a pre-built task list, structured note directories, and a report scaffold, but the actual narrative content still needs to be written by a human analyst at the end of a long and usually exhausting investigation.
I've been experimenting with a different approach: embedding structured LLM prompts directly inside the case template's summary field, so that when the investigation is complete, an AI can draft the report narrative from the case data automatically. This post describes the concept, shows how the prompts are structured, and discusses where it works well and where it still needs a human.
The Problem with IR Report Writing
DFIR-IRIS does a good job of structuring case data: tasks, notes, IOCs, assets, and timelines all live in one place by the time an investigation closes. The problem is that translating all of that structured data into a coherent written report (an executive summary, a MITRE ATT&CK analysis section, a CTI findings narrative, a conclusion) is time-consuming and cognitively expensive at exactly the moment when the team is most fatigued.
The standard case summary in IRIS is a free-text markdown field. The non-LLM version of my Ransomware template uses that field as a report scaffold with placeholder comments and empty tables for analysts to populate. That works, but it still requires an analyst to synthesise and write every narrative section from scratch.
The LLM-augmented version replaces those placeholder comments with explicit, tightly constrained prompts that tell an LLM exactly what to write, what data to use, and what not to fabricate.
How It Works
The template uses a simple {{AI_PROMPT: ...}} marker syntax embedded in the case summary markdown. Each marker contains a detailed instruction to an LLM, specifying what section to generate, what source data to use, what constraints apply, and how to handle uncertainty.
For example, the executive summary section looks like this:
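The exact wording varies by template version, but a representative marker looks something like this (an illustrative paraphrase of the template's prompt, not the verbatim text):

```markdown
{{AI_PROMPT: Write a 2–4 paragraph executive summary of this incident for a
non-technical leadership audience. Summarise what happened, the business
impact, and the current containment and recovery status. Do not use
unexplained jargon. Base all statements strictly on case data; do not infer
or fabricate details not present in the source material.}}
```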
And a tactic-level ATT&CK section looks like this:
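Again illustrative and condensed rather than verbatim, a tactic subsection pairs a heading with its own constrained prompt:

```markdown
### TA0001 — Initial Access

{{AI_PROMPT: Describe the initial access activity observed in this case.
Reference MITRE ATT&CK technique IDs for each finding, distinguish confirmed
findings from current hypotheses, and state a confidence level where scope
is uncertain. If no initial access activity was identified, state: "No
activity was observed for this tactic during the investigation period."
Base all statements strictly on case data; do not fabricate details not
present in the source material.}}
```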
The workflow is: complete the investigation in IRIS as normal, populating tasks, notes, IOCs, and assets throughout. When the case is ready to close, export or pass the full case data to an LLM alongside the template, and the prompts drive generation of each narrative section.
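The marker-extraction step can be sketched in a few lines. This is my own illustration of how the `{{AI_PROMPT: ...}}` markers can be pulled out of the summary and paired with exported case data, not code shipped with the templates; the function name and request framing are assumptions.

```python
import re

# Matches the {{AI_PROMPT: ...}} markers embedded in the case summary.
# Non-greedy match with DOTALL so multi-line prompts are captured whole.
AI_PROMPT_RE = re.compile(r"\{\{AI_PROMPT:\s*(.*?)\}\}", re.DOTALL)

def build_requests(summary_md: str, case_context: str) -> list[str]:
    """Turn each embedded prompt into a full LLM request string,
    prepending the exported case data as grounding context."""
    requests = []
    for prompt in AI_PROMPT_RE.findall(summary_md):
        requests.append(
            "You are drafting one section of an incident report.\n"
            f"Case data:\n{case_context}\n\n"
            f"Instruction:\n{prompt.strip()}"
        )
    return requests
```

Each request then goes to the model independently, which keeps every section grounded in the same case export while letting the per-section constraints differ.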
Prompt Design Principles
The prompts are written with a few consistent constraints that I found necessary to get useful output rather than hallucinated nonsense:
1. Ground every prompt in case data
Every prompt ends with some variation of: "Base all statements strictly on case data; do not infer or fabricate details not present in the source material." This is the most important constraint. Without it, LLMs will fill gaps with plausible-sounding but fabricated findings, which is worse than a blank field in an IR report.
2. Handle absence explicitly
Rather than leaving a section blank or populated with placeholders when a tactic had no observed activity, the prompts instruct the model to state the absence explicitly. For example, a tactic with no evidence should produce: "No activity was observed for this tactic during the investigation period." This is meaningful in a forensic report: it signals that the tactic was considered, not overlooked.
3. Distinguish confirmed from suspected
Prompts for investigation sections explicitly ask the model to distinguish confirmed findings from current hypotheses, and to state confidence levels where attribution or scope is uncertain. IR reports that conflate confirmed evidence with working theories are a liability.
4. Audience-appropriate tone per section
The executive summary prompt specifically asks for non-technical language and prohibits unexplained jargon. The ATT&CK analysis prompts ask for technique IDs, evidence tables, and precise language. The conclusion prompt specifies a "clear, authoritative, and forward-looking" tone suitable for leadership. Each section has a different reader in mind.
5. Structural completeness
One prompt at the top of the ATT&CK section instructs the model to review all tactic subsections and replace any that remain blank or placeholder-filled with an explicit "no activity" statement. This prevents a half-populated report from going out.
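The same completeness check is easy to enforce in post-processing if the model misses a subsection. The sketch below is my own illustration, not template code, and it assumes tactic subsections render as `### TAxxxx — Name` headings:

```python
import re

NO_ACTIVITY = ("No activity was observed for this tactic "
               "during the investigation period.")

def fill_empty_tactics(report_md: str) -> str:
    """Replace tactic subsections whose body is blank with an explicit
    'no activity' statement, so nothing ships half-populated."""
    # Split the report at tactic-level headings, keeping the headings.
    parts = re.split(r"(?m)^(### TA\d{4}.*)$", report_md)
    out = [parts[0]]
    for i in range(1, len(parts), 2):
        heading, body = parts[i], parts[i + 1]
        if not body.strip():
            body = "\n" + NO_ACTIVITY + "\n"
        out.extend([heading, body])
    return "".join(out)
```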
What the Template Covers
The Ransomware LLM template generates AI-assisted drafts for the following report sections:
| Section | What the LLM drafts |
|---|---|
| Executive Summary | Non-technical 2–4 paragraph overview for leadership |
| Scope of Investigation | In-scope/out-of-scope definition, evidence limitations |
| ATT&CK Framework Intro | Overall attack chain summary, tactic coverage overview |
| TA0001 — Initial Access | Technique IDs, narrative, evidence |
| TA0002 — Execution | Technique IDs, tools/commands, narrative |
| TA0003 — Persistence | Technique IDs, artifact paths, remediation status narrative |
| TA0004 — Privilege Escalation | Technique IDs, accounts, privilege level narrative |
| TA0005 — Defense Evasion | Technique IDs, evasion actions, log coverage impact |
| TA0006 — Credential Access | Technique IDs, credential types, account scope |
| TA0007 — Discovery | Technique IDs, enumeration scope narrative |
| TA0008 — Lateral Movement | Technique IDs, movement sequence narrative |
| TA0011 — Command and Control | Technique IDs, C2 infrastructure and beaconing narrative |
| TA0010 — Exfiltration | Technique IDs, staging/transfer narrative, data types at risk |
| TA0040 — Impact | Technique IDs, encryption scope, recovery inhibition narrative |
| CTI Findings | Threat actor attribution narrative, confidence assessment |
| Threat Actor Profile | Group profile, TTP alignment, campaign links |
| Remediation Intro | Containment/eradication posture summary, gap framing |
| Conclusion | Attack chain summary, dwell time, attribution confidence, recovery path |
The evidence tables (per-tactic and per-section) remain as structured markdown for analysts to populate — the LLM handles prose, humans handle tabular evidence documentation.
What This Is Not
It is worth being direct about the limitations, because IR reporting is a context where the cost of error is high.
It does not investigate for you. The quality of the generated report is entirely dependent on the quality of the case data in IRIS. Poorly documented investigations produce poorly generated reports. The template does not compensate for gaps in the underlying investigation.
It does not replace analyst review. Every section marked {{AI_PROMPT: ...}} produces a first draft, not a final product. ATT&CK technique mappings, attribution statements, and exfiltration assessments in particular need human verification before going to stakeholders, legal, or regulators.
It does not handle data sensitivity automatically. If your IRIS case contains information that should not leave a particular environment (PII, attorney-client privileged communications, classified data), you need to think carefully about what you pass to any external LLM API. Run these against a locally hosted model or an enterprise API with appropriate data handling controls if that is a concern in your environment.
Where It Goes Next
The current implementation requires manually passing case data to an LLM alongside the template prompts. The logical next step is to wire this into an n8n workflow that pulls the completed case from IRIS via API, constructs the prompt payload automatically, and writes the generated report back to the case, or delivers it directly to a reporting pipeline. That integration is something I'm actively working on.
I'm also exploring the same prompt-embedding approach for other incident types in the library, particularly the Data Breach and Supply Chain templates, where the notification obligation sections and regulatory framework analysis are the most time-consuming parts of report writing.
Get the Templates
The LLM-augmented templates are available alongside the standard templates on GitHub at https://github.com/zach115th/DFIR-IRIS-Templates/tree/main/Templates/Case/LLM. Look for files with the LLM suffix in the display name.
The goal isn't to have an AI write your IR reports. The goal is to get a defensible first draft in front of a tired analyst in five minutes instead of two hours.