# Batch Processing

## Basic usage

```bash
planopticon batch -i ./recordings -o ./output --title "Sprint Reviews"
```

## How it works

Batch mode:

1. Scans the input directory for video files matching the pattern
2. Processes each video through the full single-video pipeline
3. Merges knowledge graphs across all videos with fuzzy matching and conflict resolution
4. Generates a batch summary with aggregated stats and action items
5. Writes a batch manifest linking to per-video results

## File patterns

```bash
# Default: common video formats
planopticon batch -i ./recordings -o ./output

# Custom patterns
planopticon batch -i ./recordings -o ./output --pattern "*.mp4,*.mov"
```
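
To illustrate how a comma-separated `--pattern` value could select files, the sketch below splits the value on commas and matches file names with Python's `fnmatch`. The function name `find_videos` and the case-insensitive comparison are assumptions made for this example; PlanOpticon's own matching may behave differently.

```python
from fnmatch import fnmatch
from pathlib import Path


def find_videos(input_dir: str, pattern: str) -> list[Path]:
    """Return files in input_dir whose names match any comma-separated glob.

    Hypothetical helper: matching is case-insensitive by assumption.
    """
    globs = [part.strip().lower() for part in pattern.split(",") if part.strip()]
    return sorted(
        path
        for path in Path(input_dir).iterdir()
        if path.is_file() and any(fnmatch(path.name.lower(), g) for g in globs)
    )
```

Calling `find_videos("./recordings", "*.mp4,*.mov")` on a local folder is a quick way to preview which files a given pattern would pick up before starting a long batch run.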

## Output structure

```
output/
├── batch_manifest.json      # Batch-level manifest
├── batch_summary.md         # Aggregated summary
├── knowledge_graph.db       # Merged KG across all videos (SQLite, primary)
├── knowledge_graph.json     # Merged KG across all videos (JSON export)
└── videos/
    ├── meeting-01/
    │   ├── manifest.json
    │   ├── transcript/
    │   ├── diagrams/
    │   ├── captures/
    │   └── results/
    │       ├── analysis.md
    │       ├── analysis.html
    │       ├── knowledge_graph.db
    │       ├── knowledge_graph.json
    │       ├── key_points.json
    │       └── action_items.json
    └── meeting-02/
        ├── manifest.json
        └── ...
```

## Knowledge graph merging

When the same entity appears across multiple videos, PlanOpticon merges the duplicate entries using a multi-strategy approach:

### Entity deduplication

- **Case-insensitive exact matching** -- `"kubernetes"` and `"Kubernetes"` are recognized as the same entity
- **Fuzzy name matching** -- Uses `SequenceMatcher` with a threshold of 0.85 to unify near-duplicate entities (e.g., `"K8s"` and `"k8s cluster"` may be matched depending on context)
- **Descriptions are unioned** -- All unique descriptions from each video are combined
- **Occurrences are concatenated with source tracking** -- Each occurrence retains its source video reference
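
The four rules above can be sketched with `difflib.SequenceMatcher` from Python's standard library. This is a minimal illustration and not PlanOpticon's actual code: the entity dictionary layout (`name`, `descriptions`, `occurrences`), the helper names, and the choice to lowercase names before the fuzzy comparison are all assumptions.

```python
from difflib import SequenceMatcher

FUZZY_THRESHOLD = 0.85  # threshold quoted in the text above


def same_entity(a: str, b: str, threshold: float = FUZZY_THRESHOLD) -> bool:
    """Case-insensitive exact match first, then a fuzzy ratio check."""
    a_norm, b_norm = a.strip().lower(), b.strip().lower()
    if a_norm == b_norm:
        return True
    return SequenceMatcher(None, a_norm, b_norm).ratio() >= threshold


def merge_entities(entities: list[dict]) -> list[dict]:
    """Merge entities (hypothetical dict layout) collected from several videos."""
    merged: list[dict] = []
    for ent in entities:
        match = next((m for m in merged if same_entity(m["name"], ent["name"])), None)
        if match is None:
            merged.append({
                "name": ent["name"],
                # dict.fromkeys drops duplicates while keeping first-seen order
                "descriptions": list(dict.fromkeys(ent.get("descriptions", []))),
                "occurrences": list(ent.get("occurrences", [])),
            })
            continue
        # Union of descriptions: append only those not already present
        for desc in ent.get("descriptions", []):
            if desc not in match["descriptions"]:
                match["descriptions"].append(desc)
        # Occurrences are concatenated; each one keeps its own source reference
        match["occurrences"].extend(ent.get("occurrences", []))
    return merged
```

In this sketch the first name seen for an entity is kept as the canonical one, which is a simplification chosen for brevity.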

### Relationship deduplication

- Relationships are deduplicated by (source, target, type) tuple
- Descriptions from duplicate relationships are merged
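
A dictionary keyed on the `(source, target, type)` tuple is one straightforward way to express this rule. The sketch below assumes each relationship is a dictionary with `source`, `target`, `type`, and an optional `descriptions` list, and that "merged" means an order-preserving union; both are assumptions for illustration.

```python
def merge_relationships(relationships: list[dict]) -> list[dict]:
    """Collapse relationships that share the same (source, target, type)."""
    merged: dict[tuple[str, str, str], dict] = {}
    for rel in relationships:
        key = (rel["source"], rel["target"], rel["type"])
        if key not in merged:
            merged[key] = {
                "source": rel["source"],
                "target": rel["target"],
                "type": rel["type"],
                "descriptions": [],
            }
        # Keep each distinct description once, in first-seen order
        for desc in rel.get("descriptions", []):
            if desc not in merged[key]["descriptions"]:
                merged[key]["descriptions"].append(desc)
    return list(merged.values())
```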

### Type conflict resolution

When the same entity appears with different types across videos, PlanOpticon uses a specificity ranking to resolve the conflict. More specific types are preferred over general ones:

- `technology` > `concept`
- `person` > `concept`
- `organization` > `concept`
- And so on through the full type hierarchy

This ensures that an entity initially classified as a generic `concept` in one video gets upgraded to `technology` if it is identified more specifically in another.
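
A specificity ranking of this kind can be modelled as a lookup table of scores. The sketch below is illustrative only: the numeric scores, the tie-breaking rule (keep the existing type), and the treatment of unknown types as least specific are assumptions; only the ordering of `technology`, `person`, and `organization` above `concept` comes from the list above.

```python
# Hypothetical scores; a higher number means a more specific type.
TYPE_SPECIFICITY = {
    "concept": 0,
    "technology": 1,
    "person": 1,
    "organization": 1,
}


def resolve_type(current: str, incoming: str) -> str:
    """Return the more specific of two types; keep `current` on a tie."""
    current_score = TYPE_SPECIFICITY.get(current, 0)    # unknown types rank lowest
    incoming_score = TYPE_SPECIFICITY.get(incoming, 0)
    return incoming if incoming_score > current_score else current
```

With these scores, `resolve_type("concept", "technology")` upgrades the entity to `technology`, while `resolve_type("technology", "concept")` leaves it unchanged.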

The merged knowledge graph is saved at the batch root in both SQLite (`knowledge_graph.db`) and JSON (`knowledge_graph.json`) formats, and is included in the batch summary as a Mermaid diagram.

## Error handling

If a video fails to process, the batch continues. Failed videos are recorded in the batch manifest with error details:

```json
{
  "video_name": "corrupted-file",
  "status": "failed",
  "error": "Audio extraction failed: no audio track found"
}
```

The batch manifest tracks completion status:

```json
{
  "title": "Sprint Reviews",
  "total_videos": 5,
  "completed_videos": 4,
  "failed_videos": 1,
  "total_diagrams": 12,
  "total_action_items": 23,
  "total_key_points": 45,
  "videos": [...],
  "batch_summary_md": "batch_summary.md",
  "merged_knowledge_graph_json": "knowledge_graph.json",
  "merged_knowledge_graph_db": "knowledge_graph.db"
}
```
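
After a run, the manifest can be checked from a script to see which videos failed. The sketch below assumes that each entry in `videos` carries the `video_name`, `status`, and `error` fields shown in the failure example; the function name `summarize_batch` is hypothetical.

```python
import json
from pathlib import Path


def summarize_batch(manifest_path: str) -> dict:
    """Read a batch manifest and collect the failed videos with their errors."""
    manifest = json.loads(Path(manifest_path).read_text(encoding="utf-8"))
    failed = [
        (video.get("video_name"), video.get("error"))
        for video in manifest.get("videos", [])  # assumed per-video entry layout
        if video.get("status") == "failed"
    ]
    return {
        "title": manifest.get("title"),
        "completed": manifest.get("completed_videos"),
        "total": manifest.get("total_videos"),
        "failed": failed,
    }
```

For example, `summarize_batch("output/batch_manifest.json")["failed"]` would give a list of `(name, error)` pairs that can drive a retry or a report.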

## Using batch results

### Query the merged knowledge graph

After batch processing completes, the merged knowledge graph at the batch root contains entities and relationships from all successfully processed videos. You can query it just like a single-video knowledge graph:

```bash
# Show stats for the merged graph
planopticon query --db output/knowledge_graph.db

# List all people mentioned across all videos
planopticon query --db output/knowledge_graph.db "entities --type person"

# See what connects to an entity across all videos
planopticon query --db output/knowledge_graph.db "neighbors Alice"

# Ask natural language questions about the combined content
planopticon query --db output/knowledge_graph.db "What technologies were discussed across all meetings?"

# Interactive REPL for exploration
planopticon query --db output/knowledge_graph.db -I
```
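
Outside the CLI, the merged database can also be inspected with Python's built-in `sqlite3` module. The table layout is not described here, so the sketch below makes no assumption about it: it only lists whatever tables the file contains, with their row counts. It does assume that `knowledge_graph.db` is a standard SQLite file, and the function name `table_row_counts` is hypothetical.

```python
import sqlite3
from pathlib import Path


def table_row_counts(db_path: str) -> dict[str, int]:
    """Map each table in a SQLite file to its row count (read-only access)."""
    # mode=ro avoids creating an empty database if the path is wrong
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        names = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
            )
        ]
        counts = {}
        for name in names:
            quoted = name.replace('"', '""')  # escape quotes in identifiers
            counts[name] = conn.execute(f'SELECT COUNT(*) FROM "{quoted}"').fetchone()[0]
        return counts
    finally:
        conn.close()
```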

### Export merged results

All export commands work with the merged knowledge graph:

```bash
# Generate documents from merged KG
planopticon export markdown output/knowledge_graph.db -o ./docs

# Export as Obsidian vault
planopticon export obsidian output/knowledge_graph.db -o ./vault

# Generate a project-wide exchange file
planopticon export exchange output/knowledge_graph.db --name "Sprint Reviews Q4"

# Generate a GitHub wiki
planopticon wiki generate output/knowledge_graph.db -o ./wiki
```

### Classify for planning

Run taxonomy classification on the merged graph to categorize entities across all videos:

```bash
planopticon kg classify output/knowledge_graph.db
```

### Use with the planning agent

The planning agent can consume the merged knowledge graph for cross-video analysis and planning:

```bash
planopticon agent --db output/knowledge_graph.db
```

### Incremental batch processing

If you add new videos to the recordings directory, you can re-run the batch command. Videos that have already been processed (their output directories are present) are detected by the checkpoint/resume logic within each video's pipeline, so completed work is not repeated.

```bash
# Add new recordings to the folder, then re-run
planopticon batch -i ./recordings -o ./output --title "Sprint Reviews"
```
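
Before re-running, it can be useful to preview which recordings do not yet have results. The sketch below is a user-side check, not PlanOpticon's checkpoint logic: it assumes that each video's output lives under `videos/<file stem>/` with a `manifest.json`, as the output tree suggests, and the function name and default suffix list are hypothetical.

```python
from pathlib import Path


def pending_videos(
    input_dir: str,
    output_dir: str,
    suffixes: tuple[str, ...] = (".mp4", ".mov"),
) -> list[str]:
    """List recordings that have no per-video manifest in the output tree yet."""
    videos_root = Path(output_dir) / "videos"
    return sorted(
        path.name
        for path in Path(input_dir).iterdir()
        if path.is_file()
        and path.suffix.lower() in suffixes
        # assumed layout: output/videos/<stem>/manifest.json
        and not (videos_root / path.stem / "manifest.json").exists()
    )
```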