GenAI discovery at Techno West 2025: DFIR collection, artifacts, and authenticity workflows
Techno Security & Digital Forensics Conference West 2025 runs October 27–29 at the Town & Country Resort in San Diego, with a strong emphasis on generative/agentic AI discovery and its legal impacts (event announcement, program highlights). Legal-oriented sessions explicitly tackle discovery for GenAI and agentic AI, including JAMS' panel "Artificial Intelligence and Generative AI: Causes of Action and Defenses and Discovery," scheduled for Monday, October 27 at 3:15 p.m. (JAMS session page). Regional partners also highlight the AI-heavy tracks (Cybersecurity, eDiscovery, Forensics, Investigations) running October 27–29 (CCOE event listing).
Courts and policy are still catching up to synthetic media and AI-assisted workflows, raising stakes for DFIR teams to capture the right evidence at the outset. Recent coverage and proceedings highlight gaps in courtroom readiness and rulemaking for deepfakes and AI-derived evidence (Axios, Ars Technica).
What DFIR teams should watch at Techno West
- eDiscovery/AI crossovers: discovery of model/agent artifacts, legal defenses, and proportional preservation for GenAI outputs (JAMS session).
- Operational AI topics embedded in investigations (e.g., AI use in law enforcement, AI transparency case law), reflected in Monday's program (program listing).
Evidence categories you will encounter in GenAI/Agentic investigations
- Local LLM/agent runtimes and caches (workstations, lab hosts)
  - Ollama models and runtime (see the enumeration sketch after this list):
    - Default model paths: macOS ~/.ollama/models, Linux /usr/share/ollama/.ollama/models, Windows C:\Users\%username%\.ollama\models (Ollama FAQ).
    - Default API bind: 127.0.0.1:11434 (changeable via OLLAMA_HOST) (Ollama FAQ).
  - LM Studio chats/models:
    - Conversation JSONs: macOS/Linux ~/.lmstudio/conversations/, Windows %USERPROFILE%\.lmstudio\conversations (LM Studio docs).
    - LM Studio exposes OpenAI-compatible endpoints locally (e.g., http://localhost:1234/v1) for integrations (LM Studio OpenAI-compat API).
  - Hugging Face caches, commonly present when models/datasets are pulled:
    - Hub cache default: ~/.cache/huggingface/hub; datasets: ~/.cache/huggingface/datasets (configurable via HF_HOME, HF_HUB_CACHE, HF_DATASETS_CACHE) (HF Hub cache guide, HF Datasets cache).
  - Common agent/dev frameworks leave local state:
    - LangChain cache (SQLite): .langchain.db in the working directory unless configured otherwise (LangChain SQLiteCache).
    - LlamaIndex persistence: storage_context.persist() writes under a specified persist_dir (default ./storage) (LlamaIndex save/load, API refs).
- Cloud AI usage and enterprise logs
  - Network indicators and API hosts seen in client logs and proxies:
    - OpenAI API traffic commonly targets https://api.openai.com/v1, the default base URL in typical SDKs and clients (Open WebUI guide, Kani engine docs).
    - Anthropic API endpoint https://api.anthropic.com with x-api-key authentication (Anthropic getting started).
    - Google Gemini API base https://generativelanguage.googleapis.com with x-goog-api-key (Gemini API ref, REST endpoint list).
  - Enterprise discovery interfaces:
    - OpenAI Compliance API for ChatGPT Enterprise workspaces (eDiscovery, DLP, SIEM export) (OpenAI Compliance API).
    - OpenAI Admin and Audit Logs APIs for org-level auditability and retention controls (OpenAI Admin/Audit Logs help).
    - The ChatGPT macOS app stores chats and files in the cloud under the same retention policies as the web app (OpenAI macOS app retention).
- AI‑generated media and provenance metadata
  - C2PA/Content Credentials: an open standard for embedding signed provenance metadata, with conformance and trust-list governance (C2PA explainer, C2PA conformance/trust list timeline).
  - Verification options:
    - The Verify portal and the trust lists used by the open-source tooling (Content Credentials site, ITL/verify trust docs, c2patool usage).
    - Browser validators/extensions (e.g., the Digimarc extension built on C2PA-JS) (Digimarc post, extension repo).
  - Watermark detection at scale is imperfect; provenance markers are helpful but defeatable in the wild. Google's SynthID Detector portal is rolling out to early testers and claims billions of watermarked items already, but coverage is limited to content generated or edited with Google tools (Google SynthID Detector blog, TechCrunch coverage). Broader platform support remains uneven (The Verge on C2PA adoption limits).
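When a live, in-scope host is accessible, the local runtimes above can also be enumerated over their loopback APIs. A minimal sketch, assuming default ports and authorization to query the machine; /api/tags and /v1/models are the documented listing endpoints for Ollama and LM Studio respectively:
# Enumerate locally installed models via default loopback APIs
curl -s http://127.0.0.1:11434/api/tags    # Ollama: model names, sizes, digests
curl -s http://localhost:1234/v1/models    # LM Studio: OpenAI-compatible model list
Capture the raw JSON responses with the triage notes; digests from /api/tags can later be matched against blobs collected from ~/.ollama/models.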
Practical collection checklists and artifact paths
Use these as ready-to-run steps during triage or search execution.
A. Identify and collect local LLM and agent artifacts
- Ollama (models, manifests)
  - Paths: ~/.ollama/models (macOS), /usr/share/ollama/.ollama/models (Linux), C:\Users\%username%\.ollama\models (Windows) (Ollama FAQ).
  - Runtime detection: process binding to 127.0.0.1:11434 by default (Ollama FAQ).
- LM Studio (chats)
  - Conversations: macOS/Linux ~/.lmstudio/conversations/; Windows %USERPROFILE%\.lmstudio\conversations (LM Studio docs).
- Hugging Face caches (models/datasets pulled by many apps)
  - Hub: ~/.cache/huggingface/hub (or HF_HUB_CACHE); datasets: ~/.cache/huggingface/datasets (or HF_DATASETS_CACHE) (HF Hub cache, HF Datasets cache).
- LangChain agent caches
  - .langchain.db SQLite cache in the working directory unless overridden (LangChain SQLiteCache).
- LlamaIndex indices/stores
  - If storage_context.persist() was used, the default location is ./storage unless persist_dir was provided (LlamaIndex save/load).
Suggested triage commands:
# macOS/Linux: enumerate common GenAI artifacts for current user
ls -la ~/.ollama/models ~/.lmstudio/conversations ~/.cache/huggingface/hub ~/.cache/huggingface/datasets 2>/dev/null
# Windows PowerShell (run as user):
Get-ChildItem "$env:USERPROFILE\.ollama\models" -Recurse -ErrorAction SilentlyContinue
Get-ChildItem "$env:USERPROFILE\.lmstudio\conversations" -Recurse -ErrorAction SilentlyContinue
Get-ChildItem "$env:USERPROFILE\.cache\huggingface" -Recurse -ErrorAction SilentlyContinue
# Find likely local AI servers (LM Studio ~1234, Ollama 11434)
# macOS/Linux:
sudo lsof -iTCP -sTCP:LISTEN | grep -E ':11434|:1234'
# Windows (PowerShell):
Get-NetTCPConnection -LocalPort 11434,1234 -State Listen | Format-Table -AutoSize
YARA indicator for GGUF model files (commonly used by llama.cpp/Ollama):
rule GGUF_Model_File {
    meta:
        description = "Detects GGUF model files by magic bytes"
        reference = "GGUF spec header magic per llama.cpp"
    strings:
        $gguf = { 47 47 55 46 } // ASCII 'GGUF'
    condition:
        uint32(0) == 0x46554747 or $gguf at 0
}
This relies on the documented GGUF magic header “GGUF” at file start (llama.cpp gguf.h).
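To sweep a mounted image or export directory with the rule above, save it as gguf.yar (filename hypothetical) and run the YARA CLI recursively:
# Recursively scan an evidence mount for GGUF model files
yara -r gguf.yar /mnt/evidence
Hits on unexpected hosts or in unusual paths (user temp directories, cloud-sync folders) are worth triaging against the checklist paths above.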
B. Preserve cloud usage and org logs
- Request exports via enterprise interfaces where available:
  - ChatGPT Enterprise: Compliance API (conversation/file logs to eDiscovery/DLP/SIEM) (OpenAI Compliance API).
  - Org-level auditability via the Admin and Audit Logs APIs (OpenAI Admin/Audit help); a pull sketch follows this list.
- Network telemetry: capture DNS/HTTPS metadata for api.openai.com, api.anthropic.com, and generativelanguage.googleapis.com to corroborate usage (Anthropic docs, Gemini API); a log-sweep sketch also follows this list. Typical OpenAI client defaults target https://api.openai.com/v1 (Open WebUI guide, Kani client default).
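Where ChatGPT Enterprise or API-platform admin access exists, audit events can be pulled programmatically. A hedged sketch against OpenAI's org-level Audit Logs endpoint; verify the endpoint, key type, and query parameters against the current Admin/Audit Logs docs before relying on it:
# Pull recent org audit events (requires an admin API key)
curl -s "https://api.openai.com/v1/organization/audit_logs?limit=20" \
  -H "Authorization: Bearer $OPENAI_ADMIN_KEY"
For the telemetry side, a quick sweep of sensor logs for the three API hosts; the Zeek log path below is an assumption, so point it at your own DNS/proxy logs:
# Count lookups of AI API hosts in (assumed) Zeek DNS logs
zgrep -hoE 'api\.openai\.com|api\.anthropic\.com|generativelanguage\.googleapis\.com' \
  /opt/zeek/logs/*/dns.*.log.gz | sort | uniq -c | sort -rn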
C. Verify AI-generated media and provenance
- Always acquire originals (no re-encodes). Compute cryptographic hashes on intake.
- Check for Content Credentials/C2PA:
  - Web: upload to the official Verify site (linked from the Content Credentials program) (Content Credentials).
  - CLI: c2patool <file> trust with the Verify trust anchors/allowed lists (c2patool usage, ITL details); an intake sketch follows this list.
- Test for watermarks when applicable:
  - Google's SynthID Detector portal for media created with Google models (early access) (Google blog). Adoption and platform enforcement vary; treat watermarking as one signal, not dispositive (The Verge overview).
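A minimal intake sketch tying the two steps together, assuming c2patool is on PATH; the ./intake directory name is a placeholder:
# Hash originals first, then dump any C2PA manifest next to each file
for f in ./intake/*; do
  shasum -a 256 "$f"
  c2patool "$f" > "$f.c2pa.json" 2>/dev/null || echo "no Content Credentials: $f"
done
Running c2patool against a file prints its manifest report; keeping the JSON output with the evidence makes the verification state reproducible in the final report.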
Detection ideas (SOC/IR rules of thumb)
- Host detections
  - A process listening on 127.0.0.1:11434 with concurrent I/O to ~/.ollama/models → flag a local LLM runtime (Ollama) (Ollama FAQ).
  - Multi-gigabyte file creations (roughly 1–20 GB, typical for quantized local models) in ~/.ollama/models/blobs or %USERPROFILE%\.ollama\models\blobs over short intervals → large model pulls (Ollama FAQ); a growth-check sketch follows this list.
  - Frequent writes to .langchain.db or ./storage/ in project folders → active agent pipelines (LangChain/LlamaIndex) (LangChain SQLiteCache, LlamaIndex persistence).
  - A steadily growing set of .lmstudio/conversations/*.json files → local chat usage (LM Studio docs).
- Network detections
  - Egress to api.openai.com / api.anthropic.com / generativelanguage.googleapis.com with JSON POST payloads → API usage corroboration (Anthropic, Gemini).
  - Local web traffic to http://localhost:1234/v1 or http://127.0.0.1:11434/api from desktop apps → local model proxies (LM Studio, Ollama) (LM Studio OpenAI-compat API, Ollama FAQ).
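A crude host-side canary for the blob-growth heuristic above; the 60-second interval is arbitrary, and the path should be swapped for the %USERPROFILE%\.ollama\models\blobs equivalent on Windows endpoints:
# Flag growth of the Ollama blob store between two samples (model pull in progress)
DIR="$HOME/.ollama/models/blobs"
BEFORE=$(du -sk "$DIR" 2>/dev/null | cut -f1)
sleep 60
AFTER=$(du -sk "$DIR" 2>/dev/null | cut -f1)
[ "${AFTER:-0}" -gt "${BEFORE:-0}" ] && echo "blob store grew: ${BEFORE:-0}K -> ${AFTER}K"
In production this signal belongs in EDR or osquery policy rather than a shell loop, but the sketch shows the logic.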
eDiscovery and legal handling notes
- Expect courts to scrutinize the authenticity of AI-generated media; guidance is still evolving. Judicial panels have debated, but not finalized, new evidence rules, and judges remain divided on whether existing authentication rules suffice, at least for now (Ars Technica). Practical readiness concerns persist (Axios).
- For chat/agent evidence, capture (a capture-template sketch follows this list):
  - Prompt history (system/user), tool calls/actions, files referenced, model ID/version/quantization, parameters (temperature/top-p), and plugins/extensions used.
  - Workspace and org logs where available (e.g., OpenAI Compliance API; Admin/Audit Logs) (OpenAI Compliance API, Admin/Audit help).
- For AI-generated images/audio/video, include C2PA verification output in reports and note the verifier trust lists and statuses (C2PA explainer, Verify/ITL).
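To keep those capture fields consistent across matters, a simple template helps; the JSON field names below are hypothetical, not an established schema:
# Write a blank evidence-capture template (field names are illustrative)
cat > capture_template.json <<'EOF'
{
  "model_id": "", "model_version": "", "quantization": "",
  "parameters": { "temperature": null, "top_p": null },
  "system_prompt": "", "user_prompts": [], "tool_calls": [],
  "files_referenced": [], "plugins_extensions": [],
  "org_log_exports": [], "c2pa_verifier_output": ""
}
EOF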
Sample response playbook (first 24–48 hours)
- Scoping and containment
  - Identify hosts with local LLMs/agents (ports 11434/1234; artifact paths) (Ollama FAQ, LM Studio API).
  - Lock down workspaces and export enterprise AI logs (OpenAI Compliance/Admin/Audit where applicable) (OpenAI Compliance API, Admin/Audit help).
- Forensic acquisition
  - Image systems or run targeted collections of ~/.ollama/models (and manifest files), .lmstudio/conversations, HF caches, .langchain.db, and LlamaIndex ./storage (Ollama FAQ, LM Studio, HF caches, LangChain, LlamaIndex); a collection sketch follows this list.
  - Preserve AI media in original form. Compute hashes, then verify provenance using C2PA tools/Verify; run watermark checks when applicable (c2patool usage, Content Credentials, Google SynthID Detector).
- Analysis and reporting
  - Correlate local artifacts with org logs and network telemetry (hosts api.openai.com, api.anthropic.com, generativelanguage.googleapis.com) (Anthropic, Gemini).
  - Document the chain of prompts, tools, and model settings; include verifier outputs and trust-list references for any Content Credentials (C2PA explainer, ITL).
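A hedged sketch of the targeted-collection step on macOS/Linux; the cutoff date is a placeholder, and the path list should be expanded per the checklist in section A:
# Archive GenAI artifacts modified since a case-relevant date (placeholder)
CUTOFF="2025-10-01"
OUT="genai_collection_$(hostname)_$(date +%Y%m%d).tar"
find ~/.ollama/models ~/.lmstudio/conversations ~/.cache/huggingface \
     -type f -newermt "$CUTOFF" -print0 2>/dev/null \
  | tar --null -cvf "$OUT" -T -
shasum -a 256 "$OUT"   # hash the container for chain of custody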
Quick reference snippets
Enumerate and hash LM Studio chats (macOS/Linux):
find ~/.lmstudio/conversations -type f -name '*.json' -print0 | xargs -0 shasum -a 256
# (on Linux hosts without shasum, substitute sha256sum)
Verify C2PA manifest with trusted anchors:
export C2PATOOL_TRUST_ANCHORS='https://contentcredentials.org/trust/anchors.pem'
export C2PATOOL_ALLOWED_LIST='https://contentcredentials.org/trust/allowed.sha256.txt'
export C2PATOOL_TRUST_CONFIG='https://contentcredentials.org/trust/store.cfg'
c2patool suspect.jpg trust
List local Ollama models and inspect on-disk size:
ollama list
du -sh ~/.ollama/models 2>/dev/null || sudo du -sh /usr/share/ollama/.ollama/models
(Ollama FAQ).
Research signal: detection is an arms race
Academic and government R&D emphasize evolving, adaptive detection and provenance rather than static fingerprints alone (Azizpour et al., 2025, DARPA SemaFor transition notes). Reviews also warn about adversarial fragility of many detectors (Khan et al., 2025, AFSL robustness paper). Treat any single method (including watermarks) as probabilistic signal, not proof (The Verge on adoption gaps, Google SynthID Detector blog).
Takeaways
- Add local LLM/agent paths to standard triage: Ollama models, LM Studio chats, HF caches, .langchain.db, and LlamaIndex ./storage.
- Monitor for local AI servers on 11434 (Ollama) and 1234 (LM Studio), and for the cloud hosts api.openai.com, api.anthropic.com, and generativelanguage.googleapis.com.
- Use enterprise APIs to export AI usage logs (OpenAI Compliance/Admin/Audit) early in an investigation.
- Validate media provenance with C2PA tools/Verify, and treat watermarks as one signal among many.
- For reports, capture prompts, tools, model versions/quantization, and verifier trust context to support courtroom scrutiny.
Sources / References
- Techno Security West 2025 landing: https://www.technosecurity.us/west/
- Techno Security West 2025 program: https://www.technosecurity.us/west/conference-program/2025-conference-program
- JAMS Techno Security session (AI defenses & discovery): https://www.jamsadr.com/events/2025/techno-security-digital-forensics-conference
- Cyber Center of Excellence listing (Techno West 2025): https://sdccoe.org/event/techno-west-2025-conference-program/
- Ollama FAQ (paths, binding, port 11434): https://docs.ollama.com/faq
- LM Studio docs – Manage chats (paths): https://lmstudio.ai/docs/app/basics/chat
- LM Studio OpenAI-compatible API: https://lmstudio.ai/docs/app/api/endpoints/openai/
- Hugging Face – Hub cache management: https://huggingface.co/docs/huggingface_hub/en/guides/manage-cache
- Hugging Face – Datasets cache: https://huggingface.co/docs/datasets/main/cache
- LangChain SQLiteCache: https://api.python.langchain.com/en/latest/community/cache/langchain_community.cache.SQLiteCache.html
- LlamaIndex – Persisting & loading data: https://docs.llamaindex.ai/en/stable/module_guides/storing/save_load/
- LlamaIndex – StorageContext API: https://docs.llamaindex.ai/en/stable/api_reference/storage/storage_context/
- Anthropic API – Getting started: https://docs.anthropic.com/en/api/getting-started
- Google Gemini API reference: https://ai.google.dev/api
- Google Gemini REST endpoint list: https://ai.google.dev/api/rest/generativelanguage
- Open WebUI – default OpenAI base URL note: https://docs.openwebui.com/getting-started/quick-start/starting-with-openai/
- Kani (OpenAI engine) – default api_base reference: https://kani.readthedocs.io/en/latest/engines/openai.html
- OpenAI – Compliance API (Enterprise): https://help.openai.com/en/articles/9261474-compliance-api-for-enterprise-customers
- OpenAI – Admin and Audit Logs API (help): https://help.openai.com/en/articles/9687866-admin-and-audit-logs-api-for-the-api-platform
- OpenAI – macOS app data retention: https://help.openai.com/en/articles/9268871-how-is-data-retained-in-the-macos-app
- C2PA explainer (specification): https://c2pa.org/specifications/specifications/2.2/explainer/Explainer.html
- C2PA Conformance and Trust List timeline: https://c2pa.org/conformance/
- Content Credentials (program site): https://contentcredentials.org/
- Verify site trust list docs (ITL): https://opensource.contentauthenticity.org/docs/verify-known-cert-list/
- c2patool usage (with trust anchors): https://opensource.contentauthenticity.org/docs/c2patool/docs/usage/
- Digimarc Chrome extension (blog): https://www.digimarc.com/blog/validate-content-credentials-your-browser-digimarc-c2pa-content-credentials-extension
- Digimarc C2PA Chrome extension (GitHub): https://github.com/digimarc-corp/c2pa-content-credentials-extension
- Google SynthID Detector – official blog: https://blog.google/technology/ai/google-synthid-ai-content-detector/
- TechCrunch – SynthID usage stat coverage: https://techcrunch.com/snippet/3009804/google-says-synth-id-has-been-used-to-watermark-over-10-billion-pieces-of-content/
- The Verge – C2PA adoption challenges: https://www.theverge.com/2024/8/21/24223932/c2pa-standard-verify-ai-generated-images-content-credentials
- Axios – Courts aren’t ready for AI-generated evidence: https://www.axios.com/2025/07/25/courts-deepfakes-ai-trial-evidence
- Ars Technica – Judicial panel debates AI evidence rules: https://arstechnica.com/information-technology/2024/04/deepfakes-in-the-courtroom-us-judicial-panel-debates-new-ai-evidence-rules/
- Azizpour et al. 2025 – Self‑adapting synthetic media detection: https://arxiv.org/abs/2504.03615
- DARPA – Furthering deepfake defenses (SemaFor transition): https://www.darpa.mil/news/2025/furthering-deepfake-defenses
- Khan et al. 2025 – Review of adversarially robust deepfake detection: https://arxiv.org/abs/2507.21157
- AFSL adversarial robustness paper (2024): https://arxiv.org/abs/2403.08806
- GGUF magic header reference (llama.cpp gguf.h): https://fossies.org/linux/llama.cpp/ggml/include/gguf.h