PlanOpticon
Single Video Analysis
Basic usage
planopticon analyze -i recording.mp4 -o ./output
What happens
The pipeline runs these steps in order:
- Frame extraction -- Samples frames using change detection for transitions plus periodic capture (every 30s) for slow-evolving content like document scrolling
- People frame filtering -- OpenCV face detection automatically removes webcam/video conference frames, keeping only shared content (slides, documents, screen shares)
- Audio extraction -- Extracts audio track to WAV
- Transcription -- Sends audio to speech-to-text (Whisper or Gemini). If --speakers is provided, speaker diarization hints are passed to the provider.
- Diagram detection -- Vision model classifies each frame as diagram/chart/whiteboard/screenshot/none
- Diagram analysis -- High-confidence diagrams get full extraction (description, text, mermaid, chart data)
- Screengrab fallback -- Medium-confidence frames are saved as captioned screenshots
- Knowledge graph -- Extracts entities and relationships from transcript + diagrams, stored in both knowledge_graph.db (SQLite, primary) and knowledge_graph.json (export)
- Key points -- LLM extracts main points and topics
- Action items -- LLM finds tasks, commitments, and follow-ups
- Reports -- Generates markdown, HTML, and PDF
- Export -- Renders mermaid diagrams to SVG/PNG, reproduces charts
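The frame extraction step above combines two triggers: a change-detection threshold for transitions and a periodic timer for slow-evolving content. The sketch below illustrates that selection logic in pure Python; it is not PlanOpticon's actual implementation, and `select_frames` and the precomputed `diffs` scores are illustrative stand-ins for a real per-frame pixel-difference metric.

```python
def select_frames(diffs, fps=1.0, change_threshold=0.15, periodic_capture=30.0):
    """Return timestamps (seconds) of frames to keep.

    diffs: per-frame difference scores in [0, 1], one per sampled frame.
    """
    selected = []
    last_capture = -periodic_capture  # force a capture at t=0
    for i, diff in enumerate(diffs):
        t = i / fps
        if diff >= change_threshold:
            # Scene change (e.g. slide transition): always keep
            selected.append(t)
            last_capture = t
        elif t - last_capture >= periodic_capture:
            # Slow-evolving content (e.g. document scrolling): keep periodically
            selected.append(t)
            last_capture = t
    return selected

# A quiet stretch with one big change at index 5
diffs = [0.01] * 5 + [0.9] + [0.01] * 40
print(select_frames(diffs))  # → [0.0, 5.0, 35.0]
```

Lowering `change_threshold` keeps more frames on small visual changes; lowering `periodic_capture` keeps more frames during static stretches. Both correspond to the CLI flags described under "Frame extraction tuning" below.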
After analysis, you can optionally run planning taxonomy classification on the knowledge graph to categorize entities for use with the planning agent:
planopticon kg classify results/knowledge_graph.db
Processing depth
basic
- Transcription only
- Key points and action items
- No diagram extraction
standard (default)
- Everything in basic
- Diagram extraction (up to 10 frames, evenly sampled)
- Knowledge graph
- Full report generation
comprehensive
- Everything in standard
- More frames analyzed (up to 20)
- Deeper analysis
Command-line options
Provider and model selection
# Use a specific provider
planopticon analyze -i video.mp4 -o ./output --provider anthropic
# Override vision and chat models separately
planopticon analyze -i video.mp4 -o ./output --vision-model gpt-4o --chat-model claude-sonnet-4-20250514
Speaker diarization hints
Use --speakers to provide speaker names as comma-separated hints. These are passed to the transcription provider to improve speaker identification in the transcript segments.
planopticon analyze -i video.mp4 -o ./output --speakers "Alice,Bob,Carol"
Custom prompt templates
Use --templates-dir to point to a directory of custom .txt prompt template files. These override the built-in prompts used for diagram analysis, key point extraction, action item extraction, and other LLM-driven steps.
planopticon analyze -i video.mp4 -o ./output --templates-dir ./my-prompts
Template files should be named to match the built-in template names (e.g., key_points.txt, action_items.txt). See the video_processor/utils/prompt_templates.py module for the full list of template names.
Output format
Use --output-format json to emit the complete VideoManifest as structured JSON to stdout, in addition to writing all output files to disk. This is useful for scripting, CI/CD integration, or piping results into other tools.
# Standard output (files + console summary)
planopticon analyze -i video.mp4 -o ./output
# JSON manifest to stdout
planopticon analyze -i video.mp4 -o ./output --output-format json
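In a script, the JSON manifest emitted on stdout can be parsed with any JSON library. A minimal Python sketch, using a hard-coded string in place of the CLI's actual stdout (in practice you would pipe the command's output into the script and read `sys.stdin`):

```python
import json

# Stand-in for stdout from: planopticon analyze ... --output-format json
stdout = '{"version": "1.0", "stats": {"frames_extracted": 42, "diagrams_detected": 3}}'

manifest = json.loads(stdout)
print(manifest["stats"]["diagrams_detected"])  # → 3
```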
Frame extraction tuning
# Adjust sampling rate (frames per second to consider)
planopticon analyze -i video.mp4 -o ./output --sampling-rate 1.0
# Adjust change detection threshold (lower = more sensitive)
planopticon analyze -i video.mp4 -o ./output --change-threshold 0.10
# Adjust periodic capture interval
planopticon analyze -i video.mp4 -o ./output --periodic-capture 60
# Enable GPU acceleration for frame extraction
planopticon analyze -i video.mp4 -o ./output --use-gpu
Output structure
Every run produces a standardized directory structure:
output/
├── manifest.json # Run manifest (source of truth)
├── transcript/
│ ├── transcript.json # Full transcript with segments + speakers
│ ├── transcript.txt # Plain text
│ └── transcript.srt # Subtitles
├── frames/
│ ├── frame_0000.jpg
│ └── ...
├── diagrams/
│ ├── diagram_0.jpg # Original frame
│ ├── diagram_0.mermaid # Mermaid source
│ ├── diagram_0.svg # Vector rendering
│ ├── diagram_0.png # Raster rendering
│ ├── diagram_0.json # Analysis data
│ └── ...
├── captures/
│ ├── capture_0.jpg # Medium-confidence screenshots
│ ├── capture_0.json
│ └── ...
└── results/
├── analysis.md # Markdown report
├── analysis.html # HTML report
├── analysis.pdf # PDF (if planopticon[pdf] installed)
├── knowledge_graph.db # Knowledge graph (SQLite, primary)
├── knowledge_graph.json # Knowledge graph (JSON export)
├── key_points.json # Extracted key points
└── action_items.json # Action items
Output manifest
Every run produces a manifest.json that is the single source of truth:
{
"version": "1.0",
"video": {
"title": "Analysis of recording",
"source_path": "/path/to/recording.mp4",
"duration_seconds": 3600.0
},
"stats": {
"duration_seconds": 45.2,
"frames_extracted": 42,
"people_frames_filtered": 11,
"diagrams_detected": 3,
"screen_captures": 5,
"models_used": {
"vision": "gpt-4o",
"chat": "gpt-4o"
}
},
"transcript_json": "transcript/transcript.json",
"transcript_txt": "transcript/transcript.txt",
"transcript_srt": "transcript/transcript.srt",
"analysis_md": "results/analysis.md",
"knowledge_graph_json": "results/knowledge_graph.json",
"knowledge_graph_db": "results/knowledge_graph.db",
"key_points_json": "results/key_points.json",
"action_items_json": "results/action_items.json",
"key_points": [...],
"action_items": [...],
"diagrams": [...],
"screen_captures": [...]
}
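Since the artifact paths in the manifest are relative to the output directory, downstream tooling typically resolves them before use. A sketch of that resolution step; `resolve_artifacts` is an illustrative helper, not part of PlanOpticon's API:

```python
import json
from pathlib import Path

def resolve_artifacts(output_dir):
    """Load manifest.json and resolve its relative artifact paths.

    Returns a dict mapping manifest keys (transcript_json, analysis_md,
    knowledge_graph_db, ...) to absolute paths under output_dir.
    """
    out = Path(output_dir)
    manifest = json.loads((out / "manifest.json").read_text())
    suffixes = ("_json", "_txt", "_srt", "_db", "_md")
    return {key: out / value
            for key, value in manifest.items()
            if isinstance(value, str) and key.endswith(suffixes)}
```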
Checkpoint and resume
The pipeline supports checkpoint/resume. If a step's output files already exist on disk, that step is skipped on re-run. This means you can safely re-run an interrupted analysis and it will pick up where it left off:
# First run (interrupted at step 6)
planopticon analyze -i video.mp4 -o ./output
# Second run (resumes from step 6)
planopticon analyze -i video.mp4 -o ./output
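The skip-if-outputs-exist pattern described above can be sketched in a few lines. This is a simplified illustration, not PlanOpticon's actual checkpoint code; `run_step` is a hypothetical helper:

```python
from pathlib import Path

def run_step(name, outputs, fn):
    """Run fn() only if at least one declared output file is missing."""
    if outputs and all(Path(p).exists() for p in outputs):
        return "skipped"
    fn()
    return "ran"
```

Because each step's outputs double as its checkpoint, a re-run after an interruption skips completed steps and resumes at the first step whose files are missing.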
Using results after analysis
Query the knowledge graph
After analysis completes, you can query the knowledge graph directly:
# Show graph stats
planopticon query --db results/knowledge_graph.db
# List entities by type
planopticon query --db results/knowledge_graph.db "entities --type technology"
# Find neighbors of an entity
planopticon query --db results/knowledge_graph.db "neighbors Kubernetes"
# Ask natural language questions (requires API key)
planopticon query --db results/knowledge_graph.db "What technologies were discussed?"
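Because knowledge_graph.db is a plain SQLite file, it can also be opened directly with the stdlib `sqlite3` module. The sketch below makes no assumptions about the database's schema; it just lists whatever tables the file contains, which is a safe first step before writing queries against it:

```python
import sqlite3

def list_tables(db_path):
    """Return the names of all tables in a SQLite database."""
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows]
```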
Classify entities for planning
Run taxonomy classification to categorize entities into planning types (goal, milestone, risk, dependency, etc.):
planopticon kg classify results/knowledge_graph.db
planopticon kg classify results/knowledge_graph.db --format json
Export to other formats
# Generate markdown documents
planopticon export markdown results/knowledge_graph.db -o ./docs
# Export as Obsidian vault
planopticon export obsidian results/knowledge_graph.db -o ./vault
# Export as PlanOpticonExchange
planopticon export exchange results/knowledge_graph.db -o exchange.json
# Generate GitHub wiki
planopticon wiki generate results/knowledge_graph.db -o ./wiki
Use with the planning agent
The planning agent can consume the knowledge graph to generate project plans, PRDs, roadmaps, and other planning artifacts:
planopticon agent --db results/knowledge_graph.db