# Single Video Analysis

## Basic usage

```bash
planopticon analyze -i recording.mp4 -o ./output
```

## What happens

The pipeline runs these steps in order:

1. **Frame extraction** -- Samples frames using change detection for transitions plus periodic capture (every 30s) for slow-evolving content like document scrolling
2. **People frame filtering** -- OpenCV face detection automatically removes webcam/video conference frames, keeping only shared content (slides, documents, screen shares)
3. **Audio extraction** -- Extracts the audio track to WAV
4. **Transcription** -- Sends audio to speech-to-text (Whisper or Gemini). If `--speakers` is provided, speaker diarization hints are passed to the provider.
5. **Diagram detection** -- A vision model classifies each frame as diagram/chart/whiteboard/screenshot/none
6. **Diagram analysis** -- High-confidence diagrams get full extraction (description, text, mermaid, chart data)
7. **Screengrab fallback** -- Medium-confidence frames are saved as captioned screenshots
8. **Knowledge graph** -- Extracts entities and relationships from transcript + diagrams, stored in both `knowledge_graph.db` (SQLite, primary) and `knowledge_graph.json` (export)
9. **Key points** -- LLM extracts main points and topics
10. **Action items** -- LLM finds tasks, commitments, and follow-ups
11. **Reports** -- Generates markdown, HTML, and PDF
12. **Export** -- Renders mermaid diagrams to SVG/PNG, reproduces charts
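
The sampling strategy in step 1 can be pictured as follows. This is an illustrative sketch, not planopticon's actual implementation -- it just shows how change detection and periodic capture combine: keep a frame when it differs enough from its predecessor, or when too much time has passed since the last kept frame.

```python
def select_frames(diffs, timestamps, change_threshold=0.15, periodic_interval=30.0):
    """Pick frame indices to keep: large inter-frame changes (transitions)
    plus a periodic capture when nothing has been kept for a while.

    diffs[i] is a normalized difference between frame i and frame i-1;
    timestamps[i] is frame i's time in seconds. Both values here are
    hypothetical inputs for illustration.
    """
    kept = []
    last_kept_time = float("-inf")  # forces the first frame to be kept
    for i, (diff, t) in enumerate(zip(diffs, timestamps)):
        if diff >= change_threshold or t - last_kept_time >= periodic_interval:
            kept.append(i)
            last_kept_time = t
    return kept


# A slide change at t=20 is caught by the diff; the static stretch after it
# triggers a periodic capture 30s later, at t=50.
print(select_frames([1.0, 0.0, 0.5, 0.0, 0.0, 0.0, 0.0],
                    [0, 10, 20, 30, 40, 50, 60]))  # [0, 2, 5]
```

The `--change-threshold` and `--periodic-capture` flags described below map onto the two parameters of this sketch.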

After analysis, you can optionally run planning taxonomy classification on the knowledge graph to categorize entities for use with the planning agent:

```bash
planopticon kg classify results/knowledge_graph.db
```

## Processing depth

### `basic`
- Transcription only
- Key points and action items
- No diagram extraction

### `standard` (default)
- Everything in basic
- Diagram extraction (up to 10 frames, evenly sampled)
- Knowledge graph
- Full report generation

### `comprehensive`
- Everything in standard
- More frames analyzed (up to 20)
- Deeper analysis
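
The "up to N frames, evenly sampled" budget in `standard` and `comprehensive` amounts to picking indices spread uniformly across the extracted frames. A minimal sketch of that idea (not the pipeline's actual code):

```python
def evenly_sample(n_frames, budget):
    """Return up to `budget` frame indices spread evenly across n_frames."""
    if n_frames <= budget:
        return list(range(n_frames))  # fewer frames than budget: take them all
    step = n_frames / budget
    return [int(i * step) for i in range(budget)]


print(evenly_sample(100, 10))  # [0, 10, 20, 30, 40, 50, 60, 70, 80, 90]
```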

## Command-line options

### Provider and model selection

```bash
# Use a specific provider
planopticon analyze -i video.mp4 -o ./output --provider anthropic

# Override vision and chat models separately
planopticon analyze -i video.mp4 -o ./output --vision-model gpt-4o --chat-model claude-sonnet-4-20250514
```

### Speaker diarization hints

Use `--speakers` to provide speaker names as comma-separated hints. These are passed to the transcription provider to improve speaker identification in the transcript segments.

```bash
planopticon analyze -i video.mp4 -o ./output --speakers "Alice,Bob,Carol"
```

### Custom prompt templates

Use `--templates-dir` to point to a directory of custom `.txt` prompt template files. These override the built-in prompts used for diagram analysis, key point extraction, action item extraction, and other LLM-driven steps.

```bash
planopticon analyze -i video.mp4 -o ./output --templates-dir ./my-prompts
```

Template files should be named to match the built-in template names (e.g., `key_points.txt`, `action_items.txt`). See the `video_processor/utils/prompt_templates.py` module for the full list of template names.
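
The override behavior boils down to a simple lookup: use `<templates-dir>/<name>.txt` when it exists, otherwise fall back to the built-in prompt. A sketch of the idea (the `BUILT_IN` dictionary and its contents here are hypothetical placeholders, not planopticon's real prompts):

```python
from pathlib import Path

# Hypothetical built-in prompts, keyed by template name.
BUILT_IN = {
    "key_points": "Extract the main points from this transcript.",
    "action_items": "List all tasks, commitments, and follow-ups.",
}


def load_template(name, templates_dir=None):
    """Prefer <templates_dir>/<name>.txt; fall back to the built-in prompt."""
    if templates_dir is not None:
        override = Path(templates_dir) / f"{name}.txt"
        if override.exists():
            return override.read_text()
    return BUILT_IN[name]
```

So dropping a `key_points.txt` into `./my-prompts` would replace only that prompt while the rest keep their defaults.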

### Output format

Use `--output-format json` to emit the complete `VideoManifest` as structured JSON to stdout, in addition to writing all output files to disk. This is useful for scripting, CI/CD integration, or piping results into other tools.

```bash
# Standard output (files + console summary)
planopticon analyze -i video.mp4 -o ./output

# JSON manifest to stdout
planopticon analyze -i video.mp4 -o ./output --output-format json
```

### Frame extraction tuning

```bash
# Adjust sampling rate (frames per second to consider)
planopticon analyze -i video.mp4 -o ./output --sampling-rate 1.0

# Adjust change detection threshold (lower = more sensitive)
planopticon analyze -i video.mp4 -o ./output --change-threshold 0.10

# Adjust periodic capture interval
planopticon analyze -i video.mp4 -o ./output --periodic-capture 60

# Enable GPU acceleration for frame extraction
planopticon analyze -i video.mp4 -o ./output --use-gpu
```

## Output structure

Every run produces a standardized directory structure:

```
output/
├── manifest.json              # Run manifest (source of truth)
├── transcript/
│   ├── transcript.json        # Full transcript with segments + speakers
│   ├── transcript.txt         # Plain text
│   └── transcript.srt         # Subtitles
├── frames/
│   ├── frame_0000.jpg
│   └── ...
├── diagrams/
│   ├── diagram_0.jpg          # Original frame
│   ├── diagram_0.mermaid      # Mermaid source
│   ├── diagram_0.svg          # Vector rendering
│   ├── diagram_0.png          # Raster rendering
│   ├── diagram_0.json         # Analysis data
│   └── ...
├── captures/
│   ├── capture_0.jpg          # Medium-confidence screenshots
│   ├── capture_0.json
│   └── ...
└── results/
    ├── analysis.md            # Markdown report
    ├── analysis.html          # HTML report
    ├── analysis.pdf           # PDF (if planopticon[pdf] installed)
    ├── knowledge_graph.db     # Knowledge graph (SQLite, primary)
    ├── knowledge_graph.json   # Knowledge graph (JSON export)
    ├── key_points.json        # Extracted key points
    └── action_items.json      # Action items
```
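
Because the layout is standardized, downstream tooling can walk it directly. For instance, collecting each diagram's analysis data -- a sketch that assumes each `diagram_*.json` contains a `description` field, which is an assumption about the file's schema:

```python
import json
from pathlib import Path


def collect_diagram_descriptions(output_dir):
    """Read every diagrams/diagram_*.json and map the file stem to its
    description (assumed field; check an actual file for the real schema)."""
    results = {}
    for path in sorted(Path(output_dir, "diagrams").glob("diagram_*.json")):
        data = json.loads(path.read_text())
        results[path.stem] = data.get("description", "")
    return results
```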

## Output manifest

Every run produces a `manifest.json` that is the single source of truth:

```json
{
  "version": "1.0",
  "video": {
    "title": "Analysis of recording",
    "source_path": "/path/to/recording.mp4",
    "duration_seconds": 3600.0
  },
  "stats": {
    "duration_seconds": 45.2,
    "frames_extracted": 42,
    "people_frames_filtered": 11,
    "diagrams_detected": 3,
    "screen_captures": 5,
    "models_used": {
      "vision": "gpt-4o",
      "chat": "gpt-4o"
    }
  },
  "transcript_json": "transcript/transcript.json",
  "transcript_txt": "transcript/transcript.txt",
  "transcript_srt": "transcript/transcript.srt",
  "analysis_md": "results/analysis.md",
  "knowledge_graph_json": "results/knowledge_graph.json",
  "knowledge_graph_db": "results/knowledge_graph.db",
  "key_points_json": "results/key_points.json",
  "action_items_json": "results/action_items.json",
  "key_points": [...],
  "action_items": [...],
  "diagrams": [...],
  "screen_captures": [...]
}
```
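
The artifact paths in the manifest are relative to the output directory, so a consumer typically resolves them against the manifest's own location. An illustrative sketch (field names follow the example above; the suffix convention is inferred from it):

```python
import json
from pathlib import Path


def resolve_manifest_paths(manifest_path):
    """Load manifest.json and resolve its path-valued fields (keys ending in
    _json/_txt/_srt/_md/_db) relative to the manifest's directory."""
    manifest_path = Path(manifest_path)
    manifest = json.loads(manifest_path.read_text())
    base = manifest_path.parent
    resolved = {
        key: base / value
        for key, value in manifest.items()
        if isinstance(value, str) and key.endswith(("_json", "_txt", "_srt", "_md", "_db"))
    }
    return manifest, resolved
```

For example, with the manifest above in `./output/`, `resolved["analysis_md"]` would point at `./output/results/analysis.md`.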

## Checkpoint and resume

The pipeline supports checkpoint/resume. If a step's output files already exist on disk, that step is skipped on re-run. This means you can safely re-run an interrupted analysis and it will pick up where it left off:

```bash
# First run (interrupted at step 6)
planopticon analyze -i video.mp4 -o ./output

# Second run (resumes from step 6)
planopticon analyze -i video.mp4 -o ./output
```
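
The skip check amounts to "are all of this step's expected outputs already on disk?" -- a minimal sketch of that idea, not the pipeline's actual code:

```python
from pathlib import Path


def step_is_complete(expected_outputs):
    """A step is skippable when every one of its output files already exists."""
    return all(Path(p).exists() for p in expected_outputs)


# E.g. transcription would be skipped only if transcript.json, transcript.txt,
# and transcript.srt (hypothetical expected outputs) are all present.
```

A consequence of this design: to force a step to re-run, delete its output files before re-running the analysis.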

## Using results after analysis

### Query the knowledge graph

After analysis completes, you can query the knowledge graph directly:

```bash
# Show graph stats
planopticon query --db results/knowledge_graph.db

# List entities by type
planopticon query --db results/knowledge_graph.db "entities --type technology"

# Find neighbors of an entity
planopticon query --db results/knowledge_graph.db "neighbors Kubernetes"

# Ask natural language questions (requires API key)
planopticon query --db results/knowledge_graph.db "What technologies were discussed?"
```
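
Since `knowledge_graph.db` is plain SQLite, any SQLite client can open it as well. The sketch below builds a toy in-memory database with an assumed `entities(name, type)` table -- the real schema is not documented here, so inspect it with `sqlite3 results/knowledge_graph.db .schema` before writing real queries:

```python
import sqlite3

# Toy stand-in for results/knowledge_graph.db; the entities(name, type)
# schema is an assumption for illustration, not planopticon's documented schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entities (name TEXT, type TEXT)")
conn.executemany(
    "INSERT INTO entities VALUES (?, ?)",
    [("Kubernetes", "technology"), ("Alice", "person"), ("Postgres", "technology")],
)

# Equivalent in spirit to: planopticon query ... "entities --type technology"
rows = conn.execute(
    "SELECT name FROM entities WHERE type = ? ORDER BY name", ("technology",)
).fetchall()
print([name for (name,) in rows])  # ['Kubernetes', 'Postgres']
```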

### Classify entities for planning

Run taxonomy classification to categorize entities into planning types (goal, milestone, risk, dependency, etc.):

```bash
planopticon kg classify results/knowledge_graph.db
planopticon kg classify results/knowledge_graph.db --format json
```

### Export to other formats

```bash
# Generate markdown documents
planopticon export markdown results/knowledge_graph.db -o ./docs

# Export as Obsidian vault
planopticon export obsidian results/knowledge_graph.db -o ./vault

# Export as PlanOpticonExchange
planopticon export exchange results/knowledge_graph.db -o exchange.json

# Generate GitHub wiki
planopticon wiki generate results/knowledge_graph.db -o ./wiki
```

### Use with the planning agent

The planning agent can consume the knowledge graph to generate project plans, PRDs, roadmaps, and other planning artifacts:

```bash
planopticon agent --db results/knowledge_graph.db
```