The Problem Space
Traditional forensic triage cannot cope with the sheer volume of data generated during security breaches.
Live Global Security Portal
Interactive cyber threat monitoring illustrating real-time malware waves, intrusions, and security telemetry feeds across global networks.
System Architecture
End-to-end framework layers, stacked from raw evidence collection up to incident reporting.
Flask Dashboard · Kibana Timelines · PDF Exporter
Interactive web user interface displaying parsed cases, dynamic event search parameters, and MITRE ATT&CK mapping reports.
Gaussian Mixture Model · Bayesian Probability · Shannon Entropy
Runs GMM anomaly scans, Bayesian posterior updates, Shannon entropy byte checks, and time-series decompositions.
Bayesian Threat Confidence Score (TCS) Module
Weights diverse artifacts across hosts and calculates an automated threat score that ranks the most compromised targets.
Volatility 3 · Autopsy CLI · Scapy PCAP parsing · YARA
Runs Volatility memory analysis plugins, parses on-disk registry hives, and parses network stream histories.
Logs · Memory dumps · Reg hives · Sysmon · Browser SQLite
Performs parallel automated extraction of all volatile and non-volatile evidence segments across target machines.
Automated triage, memory carving, and timeline reconstruction.
Active C2 beacon identification and lateral movement tracking.
Bayesian-driven telemetry correlation to reduce false positives.
Cryptographically verified reports that support legal admissibility.
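One common way to make a report cryptographically checkable is to record a SHA-256 digest when the report is produced and recompute it before the report is relied on. The sketch below assumes that approach; the function names are hypothetical and the source does not say which hashing or signing scheme the framework actually uses.

```python
import hashlib
import hmac


def sha256_digest(data: bytes) -> str:
    """Return the hex SHA-256 digest of a report or evidence blob."""
    return hashlib.sha256(data).hexdigest()


def verify_report(data: bytes, expected_digest: str) -> bool:
    """Recompute the digest and compare it to the recorded one.

    hmac.compare_digest performs a constant-time comparison.
    """
    return hmac.compare_digest(sha256_digest(data), expected_digest)
```

A digest only shows that the bytes have not changed since it was recorded; proving who recorded it would additionally need a signature or a trusted log.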
Forensic Evidence Gathering
The framework automates collection and parsing across nine distinct digital forensic artifact segments.
Parses Windows Event Logs (.evtx), Linux syslog and auth logs using Python log normalizers and Elasticsearch ingestion pipelines.
Brute-force login signatures, privilege escalations, scheduled task creation, process spawning patterns.
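As an illustration of a brute-force login signature, the sketch below counts failed SSH password attempts per source address in Linux auth log lines. It assumes the usual OpenSSH "Failed password for ... from ..." message format; the function name and the threshold of 5 are hypothetical, and a real detector would also bound the count to a time window.

```python
import re
from collections import Counter

# Matches OpenSSH failure lines such as:
#   "... sshd[123]: Failed password for invalid user bob from 10.0.0.9 port 22 ssh2"
FAILED_LOGIN = re.compile(r"Failed password for (?:invalid user )?(\S+) from (\S+)")


def brute_force_sources(lines, threshold=5):
    """Return {source_ip: failure_count} for sources at or above the threshold."""
    counts = Counter()
    for line in lines:
        match = FAILED_LOGIN.search(line)
        if match:
            counts[match.group(2)] += 1
    return {ip: n for ip, n in counts.items() if n >= threshold}
```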
Extracts local browser profiles (Chrome, Firefox, Edge, Safari) using direct SQLite database decoders to reconstruct timeline traces.
Attacker reconnaissance history, phishing access vectors, cache structures, cached-credential SQLite tables.
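A minimal sketch of direct SQLite decoding for a Chromium-style history database follows. It assumes the `urls` table with `url`, `title`, and `last_visit_time` columns, where timestamps are microseconds since 1601-01-01 UTC; the function names are hypothetical. It should be run against a copy of the database, never the live profile file.

```python
import sqlite3
from datetime import datetime, timedelta, timezone

# Chromium stores timestamps as microseconds since 1601-01-01 UTC.
WEBKIT_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


def webkit_to_datetime(microseconds: int) -> datetime:
    """Convert a Chromium/WebKit timestamp to an aware UTC datetime."""
    return WEBKIT_EPOCH + timedelta(microseconds=microseconds)


def read_chrome_history(db_path: str):
    """Return (visited_at, url, title) tuples ordered by last visit time."""
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute(
            "SELECT url, title, last_visit_time FROM urls ORDER BY last_visit_time"
        ).fetchall()
    finally:
        conn.close()
    return [(webkit_to_datetime(ts), url, title) for url, title, ts in rows]
```

Firefox and Safari use different schemas and epochs, so each browser would need its own decoder.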
Parses NTUSER.DAT, SYSTEM, SOFTWARE, and SAM hives using the python-registry module to find persistence mechanisms.
Persistence registry keys (Run/RunOnce), USB connection traces, shellbags, UserAssist execution timestamps.
Automates Volatility 3 command plugin analysis to parse raw physical RAM dumps, recovering transient and fileless malware traces.
Active process listings (pslist/pstree), network sockets (netscan), injected DLL modules (malfind).
Inspects live packet structures or raw PCAPs utilizing Wireshark/tshark pipelines and Python Scapy decoders.
Command & Control beacon timing anomalies, DNS tunneling channels, large outbound exfiltrations.
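Beacon timing anomalies are often approximated by how regular the gaps between connections are. The sketch below scores a series of connection timestamps (in seconds) by the coefficient of variation of their inter-arrival intervals; the function names and the 0.1 cutoff are assumptions chosen for illustration, not values taken from the framework.

```python
from statistics import mean, pstdev


def interval_cv(timestamps):
    """Coefficient of variation of inter-arrival intervals (lower = more regular).

    Returns None when there are too few events or the mean interval is zero.
    """
    ordered = sorted(timestamps)
    intervals = [b - a for a, b in zip(ordered, ordered[1:])]
    if len(intervals) < 2:
        return None
    mu = mean(intervals)
    if mu == 0:
        return None
    return pstdev(intervals) / mu


def looks_like_beacon(timestamps, max_cv=0.1, min_events=5):
    """Flag a connection series whose timing is suspiciously regular."""
    if len(timestamps) < min_events:
        return False
    cv = interval_cv(timestamps)
    return cv is not None and cv <= max_cv
```

Real implants often add jitter, so a fixed cutoff like this would miss some beacons and should be treated as one signal among several.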
Performs file system integrity and MACB metadata scans using Autopsy pipelines and disk write blockers.
Timestomping identification, files generated in %TEMP%/AppData, high-entropy packed directory segments.
Decodes Windows Prefetch (.pf) files, AppCompatCache (Shimcache), Amcache registries, and Linux audit logs.
Historical process executions, execution path mismatches, evidence of programs that ran before being automatically deleted.
Parses historical PowerShell scripts, transcript logs, and Linux bash/zsh shell histories.
Base64 encoded arguments, download cradles (IEX/Invoke-WebRequest), LOLBAS executions, mimikatz commands.
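Base64-encoded PowerShell arguments are normally UTF-16LE text passed via `-EncodedCommand` or one of its abbreviations. The sketch below recovers the plaintext from a logged command line under that assumption; the function name is hypothetical, and the flag matching (any prefix of `encodedcommand`, plus the `ec` alias) is a simplification of PowerShell's parameter parsing.

```python
import base64
import re

# A flag followed by a value made only of Base64 characters.
FLAG_AND_VALUE = re.compile(r"(?:^|\s)-([A-Za-z]+)\s+([A-Za-z0-9+/=]+)")


def decode_powershell_command(command_line: str):
    """Return the decoded -EncodedCommand payload, or None if absent or invalid."""
    for flag, value in FLAG_AND_VALUE.findall(command_line):
        flag = flag.lower()
        if flag != "ec" and not "encodedcommand".startswith(flag):
            continue
        try:
            raw = base64.b64decode(value, validate=True)
            return raw.decode("utf-16-le")
        except (ValueError, UnicodeDecodeError):
            return None
    return None
```

The decoded text can then be searched for download cradles or known tool names such as those listed above.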
Queries system setupapi logs, udev properties, and Windows USBSTOR registry structures.
Removable drives, device serial numbers, mount timestamps, and file modifications correlated with the connection window.
AI Processing Core
The framework uses six custom analytical methods, each presented with its underlying mathematics and a Python implementation.
Constructs probabilistic graphical models linking digital evidence (system logs, file modifications, network traffic) to investigation hypotheses. Calculates Likelihood Ratios (LR) quantifying the strength of evidence under prosecution vs. defense hypotheses. Integrates multi-source evidence and updates posterior threat scores in real time.
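The likelihood-ratio update can be sketched in odds form: posterior odds equal prior odds multiplied by each LR. The example below is a minimal sketch that assumes the pieces of evidence are conditionally independent given the hypothesis, which a full graphical model would not require; the function name is hypothetical.

```python
def update_posterior(prior: float, likelihood_ratios) -> float:
    """Update P(hypothesis) with likelihood ratios using Bayes' rule in odds form.

    Each LR is P(evidence | hypothesis) / P(evidence | alternative).
    Assumes conditionally independent evidence (an illustrative simplification).
    """
    if not 0.0 < prior < 1.0:
        raise ValueError("prior must be strictly between 0 and 1")
    odds = prior / (1.0 - prior)
    for lr in likelihood_ratios:
        if lr < 0:
            raise ValueError("likelihood ratios must be non-negative")
        odds *= lr
    return odds / (1.0 + odds)
```

For example, a prior of 0.5 combined with a single LR of 3 gives posterior odds of 3 to 1, which is a posterior probability of 0.75.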
Models normal network behavior as a mixture of K Gaussian components, each representing a legitimate traffic cluster (DNS queries, HTTP sessions, SSH tunnels). Data points falling into low-probability density regions — unusual packet sizes, abnormal connection intervals, or rogue port usage — are flagged as anomalies.
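To make the low-density rule concrete, the sketch below evaluates a one-dimensional Gaussian mixture whose parameters are assumed to be already fitted, and flags values whose density falls below a cutoff. The `Component` class, the function names, and the cutoff are hypothetical; fitting the mixture (for example with scikit-learn's `GaussianMixture`) is outside the scope of this sketch.

```python
import math
from dataclasses import dataclass


@dataclass
class Component:
    weight: float  # mixing proportion; weights should sum to 1
    mean: float
    std: float


def mixture_density(x: float, components) -> float:
    """Probability density of x under a 1-D Gaussian mixture."""
    return sum(
        c.weight
        * math.exp(-0.5 * ((x - c.mean) / c.std) ** 2)
        / (c.std * math.sqrt(2.0 * math.pi))
        for c in components
    )


def flag_anomalies(values, components, min_density=1e-4):
    """Return the values that fall in low-density regions of the mixture."""
    return [v for v in values if mixture_density(v, components) < min_density]
```

With made-up clusters of packet sizes around 80 and 1400 bytes, a 700-byte packet sits far from both and would be flagged, while packets near either cluster would not.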
Converts forensic activity records into multidimensional vectors (representing parameters like process count, network connections, file access rate). Matches observed behavior vectors against known attack campaign vectors using Euclidean Distance metrics to identify campaign matches.
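The matching step can be sketched as a nearest-neighbor search under Euclidean distance. The function name, the example campaign labels, and the optional distance cutoff are hypothetical, and the sketch assumes features have already been scaled to comparable ranges (otherwise the largest-valued feature dominates the distance).

```python
import math


def closest_campaign(observed, campaigns, max_distance=None):
    """Return (name, distance) of the nearest known campaign vector.

    The name is None when no campaign lies within max_distance.
    """
    best_name, best_dist = None, math.inf
    for name, vector in campaigns.items():
        dist = math.dist(observed, vector)  # Euclidean distance (Python 3.8+)
        if dist < best_dist:
            best_name, best_dist = name, dist
    if max_distance is not None and best_dist > max_distance:
        return None, best_dist
    return best_name, best_dist
```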
Binary classification model extracting features from PE headers (entropy, section count, import table size), API call sequences, and behavioral traces. Outputs calibrated probability scores for malicious classification. Explainable AI weights provide feature-level interpretability.
Analyzes the statistical randomness of files and memory segments by mapping byte distributions. Encrypted, compressed, or packed malware payloads exhibit high Shannon entropy, allowing detection of ransomware file activity and packed payloads in memory.
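Byte-level Shannon entropy is H = -Σ p_i log2(p_i) over the 256 possible byte values, ranging from 0 (a single repeated byte) to 8 bits per byte (uniformly random). A minimal sketch follows; the function names are hypothetical and the 7.2 cutoff is an illustrative assumption, not a value from the framework.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy of a byte string, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    return -sum(
        (count / total) * math.log2(count / total)
        for count in Counter(data).values()
    )


def is_high_entropy(data: bytes, threshold: float = 7.2) -> bool:
    """Flag data that looks encrypted, compressed, or packed."""
    return shannon_entropy(data) >= threshold
```

High entropy alone does not indicate malice, since legitimate archives and media files are also compressed, so this check is best combined with other indicators.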
Aggregates all timestamps from normalized forensic data logs (logs, system modifications, network PCAPs) and applies additive time-series decomposition to isolate trend, seasonal, and residual components.
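An additive decomposition models a series as y_t = trend_t + seasonal_t + residual_t. The sketch below is a simplified classical decomposition applied to event counts per time bucket: a centered moving average estimates the trend, per-phase means of the detrended series estimate the seasonal component, and what remains is the residual. The function name is hypothetical, and the restriction to an odd period is a simplification made to keep the moving average symmetric.

```python
from statistics import mean


def decompose_additive(series, period):
    """Split a series into (trend, seasonal, residual) lists.

    Trend and residual are None at the edges where the centered
    moving average is undefined. `period` must be odd and >= 3.
    """
    if period < 3 or period % 2 == 0:
        raise ValueError("period must be an odd integer >= 3")
    n = len(series)
    if n < 2 * period:
        raise ValueError("need at least two full periods of data")
    half = period // 2

    # Trend: centered moving average over one full period.
    trend = [None] * n
    for i in range(half, n - half):
        trend[i] = mean(series[i - half:i + half + 1])

    # Seasonal: average detrended value at each phase, centered on zero.
    by_phase = [[] for _ in range(period)]
    for i in range(n):
        if trend[i] is not None:
            by_phase[i % period].append(series[i] - trend[i])
    phase_means = [mean(values) for values in by_phase]
    offset = mean(phase_means)
    seasonal = [phase_means[i % period] - offset for i in range(n)]

    # Residual: what trend and seasonality do not explain.
    residual = [
        None if trend[i] is None else series[i] - trend[i] - seasonal[i]
        for i in range(n)
    ]
    return trend, seasonal, residual
```

Buckets with unusually large residuals are candidates for closer inspection, since they deviate from both the long-run trend and the recurring daily or weekly pattern.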
Framework Capabilities
Deep dive into the operational algorithms, scoring criteria, and threat taxonomies.
A unified anomaly calculation summarizing observed anomalies across hosts using weighted threat probabilities.
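The source does not give the scoring formula, so the sketch below assumes the simplest reading of "weighted threat probabilities": a weighted average of per-artifact probabilities for each host, followed by a ranking. The function names, artifact labels, and weights are all hypothetical.

```python
def threat_confidence_score(probabilities, weights):
    """Weighted average of per-artifact threat probabilities for one host.

    `probabilities` maps artifact type -> probability in [0, 1];
    `weights` maps artifact type -> non-negative weight.
    """
    total_weight = sum(weights[name] for name in probabilities)
    if total_weight == 0:
        raise ValueError("total weight must be positive")
    return sum(weights[name] * p for name, p in probabilities.items()) / total_weight


def rank_hosts(host_probabilities, weights):
    """Return (host, score) pairs sorted from most to least suspicious."""
    scores = {
        host: threat_confidence_score(probs, weights)
        for host, probs in host_probabilities.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

A weighted average keeps the score in [0, 1] and easy to explain, at the cost of letting several weak signals dilute one strong one; other combination rules would trade these properties differently.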
How detected system modifications map directly to standard MITRE Enterprise threat techniques.
| ATT&CK Tactic | Forensic Detection | Framework Action |
|---|---|---|
| Initial Access (TA0001) | Phishing URL found in browser SQLite history | Flag domain + query mail IP |
| Execution (TA0002) | PowerShell Base64 commands + YARA match | Kill PID + RAM dump Volatility |
| Persistence (TA0003) | New registry Run/RunOnce keys generated | Registry snapshot restore |
| Credential Access (TA0006) | LSASS memory dump process patterns | Isolate process + trigger RAM lock |
| Lateral Movement (TA0008) | Atypical internal SMB/RDP socket sequences | Quarantine local gateway endpoint |
Framework Ecosystem
Industry-standard forensic suites integrated seamlessly with modern data engines and AI libraries.
Disk forensics & deleted file recovery
Memory forensics & RAM extraction
PCAP deep network protocol analyzer
Python automated packet parsing
Malware pattern matching rule engine
Real-time network intrusion detection (IDS)
Active host discovery & service mapping

LSTM deep learning anomaly detection
XGBoost, KNN, Isolation Forest tools
Log dataframe normalization & analytics
Mathematical calculations & entropy scales

Multi-source log indexing & fast search
Forensic dashboard timeline graphs
Case files & browser DB parsing

Isolated forensic sandbox pipelines
Forensic Portal REST APIs

Core pipeline execution script engine
Primary virtual forensic OS suite
7-Phase Stepper Workflow
The lifecycle of a digital forensic analysis mapped out phase-by-phase through our automated pipeline.
The framework is initialized using case parameters. The incident alert is evaluated (via SIEM logs, firewall events, or manual administrator trigger) to assess the scope of compromised systems, timestamp windows, and initial indicators.
Academic Context
The AI-DFIR framework stands on published research, integrating AI tools with strict forensic standards.
Rashmi Mandayam
Demonstrated that machine learning models and NLP workflows allow security analysts to parse enormous data volumes and compile threat timeline insights rapidly.
ICDF2C Best Paper Award
Proposes structural integration of large language models across 4 strategic stages: evidence discovery, pattern recognition, case evaluation, and court presentation.
DFIR Automation Review
Concludes that automated threat orchestration methods accelerate breach incident handling and reduce mean time to respond (MTTR) by up to 90%.
Platform Vision
Expansion milestones planned to scale the AI-DFIR framework across automated operations.
Outcomes & Benchmarks
Quantified expected improvements comparing standard manual forensic methods against automated AI-DFIR pipelines.
Let's discuss the forensic framework, the ML models behind it, or how AI-driven investigation can be applied to your DFIR workflow. Open to research collaborations, speaking engagements, and consulting.