PlanOpticon

# Batch Processing

## Basic usage

```bash
planopticon batch -i ./recordings -o ./output --title "Sprint Reviews"
```

## How it works

Batch mode:

1. Scans the input directory for video files matching the pattern
2. Processes each video through the full single-video pipeline
3. Merges knowledge graphs across all videos with fuzzy matching and conflict resolution
4. Generates a batch summary with aggregated stats and action items
5. Writes a batch manifest linking to per-video results

## File patterns

```bash
# Default: common video formats
planopticon batch -i ./recordings -o ./output

# Custom patterns
planopticon batch -i ./recordings -o ./output --pattern ".mp4,.mov"
```

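To make the pattern semantics concrete, here is a minimal sketch of extension-based file discovery. It is not PlanOpticon's implementation: `find_videos` is a hypothetical helper, and the non-recursive scan and case-insensitive comparison are assumptions that this guide does not confirm.

```python
from pathlib import Path


def find_videos(input_dir, pattern):
    """Return files in input_dir whose extension is listed in the pattern.

    Hypothetical helper: `pattern` is a comma-separated extension list such
    as ".mp4,.mov". Top-level scan only, case-insensitive (both assumed).
    """
    # Normalize ".mp4, .MOV" into {".mp4", ".mov"}
    extensions = {ext.strip().lower() for ext in pattern.split(",") if ext.strip()}
    return sorted(
        path
        for path in Path(input_dir).iterdir()
        if path.is_file() and path.suffix.lower() in extensions
    )
```

Calling `find_videos("./recordings", ".mp4,.mov")` would return the matching paths in sorted order, which is a convenient way to preview what a custom pattern selects before starting a long batch run.
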
## Output structure

```
output/
├── batch_manifest.json          # Batch-level manifest
├── batch_summary.md             # Aggregated summary
├── knowledge_graph.db           # Merged KG across all videos (SQLite, primary)
├── knowledge_graph.json         # Merged KG across all videos (JSON export)
└── videos/
    ├── meeting-01/
    │   ├── manifest.json
    │   ├── transcript/
    │   ├── diagrams/
    │   ├── captures/
    │   └── results/
    │       ├── analysis.md
    │       ├── analysis.html
    │       ├── knowledge_graph.db
    │       ├── knowledge_graph.json
    │       ├── key_points.json
    │       └── action_items.json
    └── meeting-02/
        ├── manifest.json
        └── ...
```

## Knowledge graph merging

When the same entity appears across multiple videos, PlanOpticon merges the duplicates using a multi-strategy approach:

### Entity deduplication

- Case-insensitive exact matching -- `"kubernetes"` and `"Kubernetes"` are recognized as the same entity
- Fuzzy name matching -- Uses `SequenceMatcher` with a threshold of 0.85 to unify near-duplicate entities (e.g., `"K8s"` and `"k8s cluster"` may be matched depending on context)
- Descriptions are unioned -- All unique descriptions from each video are combined
- Occurrences are concatenated with source tracking -- Each occurrence retains its source video reference

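The deduplication rules above can be sketched with Python's standard `difflib.SequenceMatcher`. This is an illustration under stated assumptions, not PlanOpticon's code: `same_entity` and `merge_entities` are hypothetical names, and the entity shape (a dict with `name`, `descriptions`, and `occurrences`) is assumed for the example.

```python
from difflib import SequenceMatcher

FUZZY_THRESHOLD = 0.85  # threshold stated in this guide


def same_entity(name_a, name_b, threshold=FUZZY_THRESHOLD):
    """Hypothetical check: case-insensitive exact match, then fuzzy ratio."""
    a, b = name_a.strip().lower(), name_b.strip().lower()
    if a == b:
        return True
    return SequenceMatcher(None, a, b).ratio() >= threshold


def merge_entities(entities):
    """Merge entity dicts (assumed keys: name, descriptions, occurrences)."""
    merged = []
    for entity in entities:
        match = next(
            (m for m in merged if same_entity(m["name"], entity["name"])), None
        )
        if match is None:
            merged.append({
                "name": entity["name"],
                "descriptions": list(dict.fromkeys(entity["descriptions"])),
                "occurrences": list(entity["occurrences"]),
            })
            continue
        # Union descriptions, keeping first-seen order
        for description in entity["descriptions"]:
            if description not in match["descriptions"]:
                match["descriptions"].append(description)
        # Concatenate occurrences; each one keeps its own source reference
        match["occurrences"].extend(entity["occurrences"])
    return merged
```

With this plain ratio, `same_entity("PostgreSQL", "Postgres")` passes the 0.85 threshold (ratio of about 0.89), while `"K8s"` against `"k8s cluster"` scores about 0.43 after lowercasing. A pair like that would therefore need something beyond the raw ratio to be unified.
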
### Relationship deduplication

- Relationships are deduplicated by (source, target, type) tuple
- Descriptions from duplicate relationships are merged

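Keying on the (source, target, type) tuple maps naturally onto a dictionary. The sketch below is a hedged illustration: `dedupe_relationships` is a hypothetical function, the relationship shape (a dict with `source`, `target`, `type`, and `descriptions`) is assumed, and keys are compared exactly as given.

```python
def dedupe_relationships(relationships):
    """Collapse relationships sharing (source, target, type).

    Hypothetical sketch; descriptions from duplicates are merged in
    first-seen order without repeats.
    """
    merged = {}
    for rel in relationships:
        key = (rel["source"], rel["target"], rel["type"])
        if key not in merged:
            # Copy the first occurrence, starting with an empty description list
            merged[key] = {**rel, "descriptions": []}
        for description in rel.get("descriptions", []):
            if description not in merged[key]["descriptions"]:
                merged[key]["descriptions"].append(description)
    return list(merged.values())
```

Two relationships with the same endpoints but different types remain separate entries, because the type is part of the key.
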
### Type conflict resolution

When the same entity appears with different types across videos, PlanOpticon uses a specificity ranking to resolve the conflict. More specific types are preferred over general ones:

- `technology` > `concept`
- `person` > `concept`
- `organization` > `concept`
- And so on through the full type hierarchy

This ensures that an entity initially classified as a generic `concept` in one video gets upgraded to `technology` if it is identified more specifically in another.

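A specificity ranking like the one above can be sketched as a lookup table. The numeric ranks here are hypothetical: the guide only states that `technology`, `person`, and `organization` each outrank `concept`. The full hierarchy and the handling of ties between equally specific types are not documented, so this sketch keeps the current type on a tie and treats unknown types as least specific.

```python
# Hypothetical ranks; only "more specific beats concept" comes from the guide.
TYPE_SPECIFICITY = {
    "concept": 0,
    "technology": 1,
    "person": 1,
    "organization": 1,
}


def resolve_type(current, candidate):
    """Return the more specific of two types; ties keep the current type."""
    current_rank = TYPE_SPECIFICITY.get(current, 0)
    candidate_rank = TYPE_SPECIFICITY.get(candidate, 0)
    return candidate if candidate_rank > current_rank else current
```

Under these assumptions, `resolve_type("concept", "technology")` upgrades to `technology`, and a later `concept` classification does not downgrade it again.
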
The merged knowledge graph is saved at the batch root in both SQLite (`knowledge_graph.db`) and JSON (`knowledge_graph.json`) formats, and is included in the batch summary as a Mermaid diagram.

## Error handling

If a video fails to process, the batch continues. Failed videos are recorded in the batch manifest with error details:

```json
{
  "video_name": "corrupted-file",
  "status": "failed",
  "error": "Audio extraction failed: no audio track found"
}
```

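The continue-on-failure behavior described above can be pictured as a loop that isolates each video. This is a minimal sketch, not PlanOpticon's actual code: `run_batch` and `process_video` are hypothetical names, and the `"completed"` status value is an assumption (only `"failed"` appears in this guide).

```python
from pathlib import Path


def run_batch(video_paths, process_video):
    """Process every video; record a failure and move on to the next one."""
    entries = []
    for path in video_paths:
        name = Path(path).stem
        try:
            process_video(path)
            # "completed" is an assumed status value
            entries.append({"video_name": name, "status": "completed"})
        except Exception as exc:
            # Mirror the failed-entry shape shown above
            entries.append(
                {"video_name": name, "status": "failed", "error": str(exc)}
            )
    return entries
```

Catching the exception per video is what keeps one corrupted recording from aborting the remaining ones.
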
The batch manifest tracks completion status:

```json
{
  "title": "Sprint Reviews",
  "total_videos": 5,
  "completed_videos": 4,
  "failed_videos": 1,
  "total_diagrams": 12,
  "total_action_items": 23,
  "total_key_points": 45,
  "videos": [...],
  "batch_summary_md": "batch_summary.md",
  "merged_knowledge_graph_json": "knowledge_graph.json",
  "merged_knowledge_graph_db": "knowledge_graph.db"
}
```

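Because the manifest is plain JSON, failures can be checked from a script after a run. The following sketch assumes each entry in `videos` carries the `video_name`, `status`, and `error` fields shown in the failed-video example above; `failed_videos` is a hypothetical helper, not part of PlanOpticon.

```python
import json
from pathlib import Path


def failed_videos(manifest_path):
    """Return (video_name, error) pairs for entries marked as failed."""
    manifest = json.loads(Path(manifest_path).read_text(encoding="utf-8"))
    return [
        (video.get("video_name"), video.get("error"))
        for video in manifest.get("videos", [])
        if video.get("status") == "failed"
    ]


if __name__ == "__main__":
    for name, error in failed_videos("output/batch_manifest.json"):
        print(f"{name}: {error}")
```
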
## Using batch results

### Query the merged knowledge graph

After batch processing completes, the merged knowledge graph at the batch root contains entities and relationships from all successfully processed videos. You can query it just like a single-video knowledge graph:

```bash
# Show stats for the merged graph
planopticon query --db output/knowledge_graph.db

# List all people mentioned across all videos
planopticon query --db output/knowledge_graph.db "entities --type person"

# See what connects to an entity across all videos
planopticon query --db output/knowledge_graph.db "neighbors Alice"

# Ask natural language questions about the combined content
planopticon query --db output/knowledge_graph.db "What technologies were discussed across all meetings?"

# Interactive REPL for exploration
planopticon query --db output/knowledge_graph.db -I
```

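Since `knowledge_graph.db` is a SQLite file, it can also be opened with Python's standard `sqlite3` module. The table layout is not documented in this guide, so the sketch below makes no schema assumptions and only lists whatever tables exist; `list_tables` is a hypothetical helper.

```python
import sqlite3
from contextlib import closing


def list_tables(db_path):
    """Return the table names in a SQLite file without assuming a schema."""
    with closing(sqlite3.connect(db_path)) as conn:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    return [name for (name,) in rows]


if __name__ == "__main__":
    print(list_tables("output/knowledge_graph.db"))
```

Note that `sqlite3.connect` creates an empty database when the path does not exist, so check the path first if an empty list comes back.
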
### Export merged results

All export commands work with the merged knowledge graph:

```bash
# Generate documents from merged KG
planopticon export markdown output/knowledge_graph.db -o ./docs

# Export as Obsidian vault
planopticon export obsidian output/knowledge_graph.db -o ./vault

# Generate a project-wide exchange file
planopticon export exchange output/knowledge_graph.db --name "Sprint Reviews Q4"

# Generate a GitHub wiki
planopticon wiki generate output/knowledge_graph.db -o ./wiki
```

### Classify for planning

Run taxonomy classification on the merged graph to categorize entities across all videos:

```bash
planopticon kg classify output/knowledge_graph.db
```

### Use with the planning agent

The planning agent can consume the merged knowledge graph for cross-video analysis and planning:

```bash
planopticon agent --db output/knowledge_graph.db
```

### Incremental batch processing

If you add new videos to the recordings directory, re-run the same batch command. Videos that already have output directories are detected through the checkpoint/resume logic in each video's pipeline, which keeps incremental runs efficient.

```bash
# Add new recordings to the folder, then re-run
planopticon batch -i ./recordings -o ./output --title "Sprint Reviews"
```
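
Before re-running, it can be useful to preview which recordings have no output yet. The sketch below is a user-side check, not PlanOpticon's checkpoint logic: `pending_videos` is a hypothetical helper, and it assumes that each video's output directory is named after the file stem and that a `manifest.json` inside it marks a processed video.

```python
from pathlib import Path


def pending_videos(video_paths, output_dir):
    """Return videos with no videos/<stem>/manifest.json under output_dir.

    Assumed layout: output_dir/videos/<video stem>/manifest.json.
    """
    videos_root = Path(output_dir) / "videos"
    return [
        path
        for path in map(Path, video_paths)
        if not (videos_root / path.stem / "manifest.json").exists()
    ]
```
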
