# Processing Pipeline

PlanOpticon has four main pipelines: video analysis, document ingestion, source connector, and export. A fifth, the planning agent pipeline, consumes the knowledge graph that the others produce. Each pipeline can operate independently, and they connect through the shared knowledge graph.

---

## Single video pipeline

The core video analysis pipeline processes a single video file through eight sequential steps with checkpoint/resume support.

```mermaid
sequenceDiagram
    participant CLI
    participant Pipeline
    participant FrameExtractor
    participant AudioExtractor
    participant Provider
    participant DiagramAnalyzer
    participant KnowledgeGraph
    participant Exporter

    CLI->>Pipeline: process_single_video()

    Note over Pipeline: Step 1: Extract frames
    Pipeline->>FrameExtractor: extract_frames()
    Note over FrameExtractor: Change detection + periodic capture (every 30s)
    FrameExtractor-->>Pipeline: frame_paths[]

    Note over Pipeline: Step 2: Filter people frames
    Pipeline->>Pipeline: filter_people_frames()
    Note over Pipeline: OpenCV face detection removes webcam/people frames

    Note over Pipeline: Step 3: Extract + transcribe audio
    Pipeline->>AudioExtractor: extract_audio()
    Pipeline->>Provider: transcribe_audio()
    Note over Provider: Supports speaker hints via --speakers flag

    Note over Pipeline: Step 4: Analyze visuals
    Pipeline->>DiagramAnalyzer: process_frames()
    loop Each frame (up to 10 standard / 20 comprehensive)
        DiagramAnalyzer->>Provider: classify (vision)
        alt High confidence diagram
            DiagramAnalyzer->>Provider: full analysis
            Note over Provider: Extract description, text, mermaid, chart data
        else Medium confidence
            DiagramAnalyzer-->>Pipeline: screengrab fallback
        end
    end

    Note over Pipeline: Step 5: Build knowledge graph
    Pipeline->>KnowledgeGraph: register_source()
    Pipeline->>KnowledgeGraph: process_transcript()
    Pipeline->>KnowledgeGraph: process_diagrams()
    Note over KnowledgeGraph: Writes knowledge_graph.db (SQLite) + .json

    Note over Pipeline: Step 6: Extract key points + action items
    Pipeline->>Provider: extract key points
    Pipeline->>Provider: extract action items

    Note over Pipeline: Step 7: Generate report
    Pipeline->>Pipeline: generate markdown report
    Note over Pipeline: Includes mermaid diagrams, tables, cross-references

    Note over Pipeline: Step 8: Export formats
    Pipeline->>Exporter: export_all_formats()
    Note over Exporter: HTML report, PDF, SVG/PNG renderings, chart reproductions

    Pipeline-->>CLI: VideoManifest
```

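Step 4 caps the number of frames sent for analysis (10 in standard mode, 20 in comprehensive mode). One way to stay under such a cap while still covering the whole recording is to pick frames at evenly spaced positions. The sketch below is illustrative only: `sample_evenly` and the zero-padded frame names are assumptions, not PlanOpticon's actual code.

```python
def sample_evenly(items: list, limit: int) -> list:
    """Return at most `limit` items spread evenly across `items`, in order."""
    if limit <= 0:
        return []
    if len(items) <= limit:
        return list(items)
    # A fractional step keeps the picks spread over the full range
    # instead of clustering at the start of the list.
    step = len(items) / limit
    return [items[int(i * step)] for i in range(limit)]


# Hypothetical frame paths; only the frames/frame_*.jpg pattern is from the docs.
frames = [f"frames/frame_{i:04d}.jpg" for i in range(50)]
picked = sample_evenly(frames, 10)
print(picked[0], picked[-1])
```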
### Pipeline steps in detail

| Step | Name | Checkpointable | Description |
|------|------|----------------|-------------|
| 1 | Extract frames | Yes | Change detection + periodic capture. Skipped if `frames/frame_*.jpg` exist on disk. |
| 2 | Filter people frames | No | Inline with step 1. OpenCV face detection removes webcam frames. |
| 3 | Extract + transcribe audio | Yes | Skipped if `transcript/transcript.json` exists. Speaker hints passed if `--speakers` provided. |
| 4 | Analyze visuals | Yes | Skipped if `diagrams/` is populated. Evenly samples frames (not just first N). |
| 5 | Build knowledge graph | Yes | Skipped if `results/knowledge_graph.db` exists. Registers source, processes transcript and diagrams. |
| 6 | Extract key points + actions | Yes | Skipped if `results/key_points.json` and `results/action_items.json` exist. |
| 7 | Generate report | Yes | Skipped if `results/analysis.md` exists. |
| 8 | Export formats | No | Always runs. Renders mermaid to SVG/PNG, reproduces charts, generates HTML/PDF. |

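The checkpoint rules in the table amount to "skip a step when its output files already exist". A minimal sketch of that check follows, assuming the marker paths from the table; the `CHECKPOINT_MARKERS` dict, `step_is_done`, and `run_step` are hypothetical names, and the real pipeline also loads the saved results from disk, which this sketch omits.

```python
import tempfile
from pathlib import Path
from typing import Callable

# Marker patterns copied from the table; the dict itself is illustrative.
CHECKPOINT_MARKERS = {
    "Extract frames": ["frames/frame_*.jpg"],
    "Extract + transcribe audio": ["transcript/transcript.json"],
    "Analyze visuals": ["diagrams/*"],
    "Build knowledge graph": ["results/knowledge_graph.db"],
    "Extract key points + actions": [
        "results/key_points.json",
        "results/action_items.json",
    ],
    "Generate report": ["results/analysis.md"],
}


def step_is_done(output_dir: Path, markers: list[str]) -> bool:
    """True when every marker pattern matches at least one file."""
    return bool(markers) and all(any(output_dir.glob(p)) for p in markers)


def run_step(name: str, output_dir: Path, action: Callable[[], None]) -> str:
    """Run `action` unless the step's checkpoint files already exist."""
    markers = CHECKPOINT_MARKERS.get(name, [])  # no markers: always runs
    if step_is_done(output_dir, markers):
        return "skipped"
    action()
    return "ran"


def write_report(output_dir: Path) -> None:
    """Stand-in for report generation: just creates the marker file."""
    results = output_dir / "results"
    results.mkdir(parents=True, exist_ok=True)
    (results / "analysis.md").write_text("# Analysis\n")


with tempfile.TemporaryDirectory() as tmp:
    out = Path(tmp)
    first = run_step("Generate report", out, lambda: write_report(out))
    second = run_step("Generate report", out, lambda: write_report(out))
    export = run_step("Export formats", out, lambda: None)

print(first, second, export)
```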
---

## Batch pipeline

The batch pipeline wraps the single-video pipeline and adds cross-video knowledge graph merging.

```mermaid
flowchart TD
    A[Scan input directory] --> B[Match video files by pattern]
    B --> C{For each video}
    C --> D[process_single_video]
    D --> E{Success?}
    E -->|Yes| F[Collect manifest + KG]
    E -->|No| G[Log error, continue]
    F --> H[Next video]
    G --> H
    H --> C
    C -->|All done| I[Merge knowledge graphs]
    I --> J[Fuzzy matching + conflict resolution]
    J --> K[Generate batch summary]
    K --> L[Write batch manifest]
    L --> M[batch_manifest.json + batch_summary.md + merged KG]
```

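The "log error, continue" branch of the flowchart can be sketched as a loop that records failures instead of aborting. This is a simplified illustration: `run_batch`, the manifest layout, and the `fake_process` stand-in are assumptions, and the real batch pipeline goes on to merge the per-video knowledge graphs.

```python
from pathlib import Path
from typing import Callable


def run_batch(videos: list[Path], process: Callable[[Path], dict]) -> dict:
    """Process each video; a failure is recorded and the loop continues."""
    manifest = {"completed": [], "failed": []}
    for video in videos:
        try:
            result = process(video)
        except Exception as exc:  # one bad video must not stop the batch
            manifest["failed"].append({"video": str(video), "error": str(exc)})
            continue
        manifest["completed"].append({"video": str(video), "result": result})
    return manifest


def fake_process(video: Path) -> dict:
    """Stand-in for process_single_video(); fails on one input on purpose."""
    if video.stem == "corrupt":
        raise ValueError("could not decode video")
    return {"frames": 12}


batch = run_batch(
    [Path("kickoff.mp4"), Path("corrupt.mp4"), Path("retro.mp4")],
    fake_process,
)
print(len(batch["completed"]), len(batch["failed"]))
```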
### Knowledge graph merge strategy

During batch merging, `KnowledgeGraph.merge()` applies:

1. Case-insensitive exact matching for entity names
2. Fuzzy matching via `SequenceMatcher` (threshold >= 0.85) for near-duplicates
3. Type conflict resolution using a specificity ranking (e.g., `technology` > `concept`)
4. Description union across all sources
5. Relationship deduplication by (source, target, type) tuple

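The five rules above can be sketched with the standard library's `difflib.SequenceMatcher`. This is not the body of `KnowledgeGraph.merge()`: the `Entity` dataclass, the two-entry `TYPE_RANK` table, and the choice to compare lowercased names during fuzzy matching are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from difflib import SequenceMatcher

# Assumed ranking; the docs only state that technology outranks concept.
TYPE_RANK = {"concept": 0, "technology": 1}


@dataclass
class Entity:
    name: str
    type: str
    descriptions: set[str] = field(default_factory=set)


def find_match(name: str, entities: dict[str, Entity], threshold: float = 0.85):
    """Return the key of an existing entity matching `name`, or None."""
    key = name.lower()
    if key in entities:  # rule 1: case-insensitive exact match
        return key
    for existing in entities:  # rule 2: fuzzy match for near-duplicates
        if SequenceMatcher(None, key, existing).ratio() >= threshold:
            return existing
    return None


def merge_entities(target: dict[str, Entity], incoming: list[Entity]) -> None:
    """Merge `incoming` into `target`, which is keyed by lowercased name."""
    for ent in incoming:
        key = find_match(ent.name, target)
        if key is None:
            target[ent.name.lower()] = Entity(ent.name, ent.type, set(ent.descriptions))
            continue
        current = target[key]
        # Rule 3: keep the more specific type.
        if TYPE_RANK.get(ent.type, -1) > TYPE_RANK.get(current.type, -1):
            current.type = ent.type
        # Rule 4: union of descriptions across sources.
        current.descriptions |= ent.descriptions


def merge_relationships(*groups: list[tuple[str, str, str]]) -> list[tuple[str, str, str]]:
    """Rule 5: deduplicate (source, target, type) tuples, keeping first-seen order."""
    seen, merged = set(), []
    for group in groups:
        for rel in group:
            if rel not in seen:
                seen.add(rel)
                merged.append(rel)
    return merged


graph: dict[str, Entity] = {}
merge_entities(graph, [
    Entity("PostgreSQL", "concept", {"Primary database"}),
    Entity("Kubernetes", "technology", {"Orchestrator"}),
])
merge_entities(graph, [
    Entity("postgresql", "technology", {"Stores the graph"}),
    Entity("Kubernets", "concept", {"Runs workers"}),  # typo caught by fuzzy match
])
rels = merge_relationships(
    [("api", "postgresql", "uses")],
    [("api", "postgresql", "uses"), ("api", "kubernetes", "runs_on")],
)
print(sorted(graph), len(rels))
```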
---

## Document ingestion pipeline

The document ingestion pipeline processes files (Markdown, plaintext, PDF) into knowledge graphs without video analysis.

```mermaid
flowchart TD
    A[Input: file or directory] --> B{File or directory?}
    B -->|File| C[get_processor by extension]
    B -->|Directory| D[Glob for supported extensions]
    D --> E{Recursive?}
    E -->|Yes| F[rglob all files]
    E -->|No| G[glob top-level only]
    F --> H[For each file]
    G --> H
    H --> C
    C --> I[DocumentProcessor.process]
    I --> J[DocumentChunk list]
    J --> K[Register source in KG]
    K --> L[Add chunks as content]
    L --> M[KG extracts entities + relationships]
    M --> N[knowledge_graph.db]
```

### Supported document types

| Extension | Processor | Notes |
|-----------|-----------|-------|
| `.md` | `MarkdownProcessor` | Splits by headings into sections |
| `.txt` | `PlaintextProcessor` | Splits into fixed-size chunks |
| `.pdf` | `PdfProcessor` | Requires `pymupdf` or `pdfplumber`. Falls back gracefully between libraries. |

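Dispatching on file extension, as the flowchart's `get_processor` step does, can be sketched as a lookup table. The functions below are simplified stand-ins for the processor classes: the signature of `get_processor`, the `PROCESSORS` dict, and the 1000-character chunk size are assumptions, and PDF handling is left out because it needs a third-party library.

```python
from pathlib import Path
from typing import Callable


def split_markdown(text: str) -> list[str]:
    """Split Markdown into sections, starting a new one at each heading line.

    Naive on purpose: a '#' line inside a code fence also starts a section.
    """
    sections, current = [], []
    for line in text.splitlines():
        if line.startswith("#") and current:
            sections.append("\n".join(current))
            current = []
        current.append(line)
    if current:
        sections.append("\n".join(current))
    return sections


def split_plaintext(text: str, size: int = 1000) -> list[str]:
    """Split plain text into fixed-size chunks (size is an assumed default)."""
    return [text[i:i + size] for i in range(0, len(text), size)]


PROCESSORS: dict[str, Callable[[str], list[str]]] = {
    ".md": split_markdown,
    ".txt": split_plaintext,
}


def get_processor(path: Path) -> Callable[[str], list[str]]:
    """Pick a splitter by extension; unknown extensions raise ValueError."""
    ext = path.suffix.lower()
    if ext not in PROCESSORS:
        supported = ", ".join(sorted(PROCESSORS))
        raise ValueError(f"No processor for '{ext}'. Supported extensions: {supported}")
    return PROCESSORS[ext]


chunks = get_processor(Path("spec.md"))("# Goals\nShip v1\n## Risks\nScope creep")
print(chunks)
```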
### Adding documents to an existing graph

The `--db-path` flag lets you ingest documents into an existing knowledge graph:

```bash
# Add a single file to an existing knowledge graph
planopticon ingest spec.md --db-path existing.db

# Ingest a directory recursively, writing results to ./output
planopticon ingest ./docs/ -o ./output --recursive
```

---

## Source connector pipeline

Source connectors fetch content from cloud services, note-taking apps, and web sources. Each source implements the `BaseSource` ABC with three methods: `authenticate()`, `list_videos()`, and `download()`.

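The three-method contract can be pictured with a small abstract base class. Only the method names come from this page: the argument lists and return types, the `BaseSourceSketch` and `InMemorySource` classes, and the `EXAMPLE_SOURCE_TOKEN` variable are all hypothetical, so check the real `BaseSource` before relying on any of them.

```python
import os
import tempfile
from abc import ABC, abstractmethod
from pathlib import Path


class BaseSourceSketch(ABC):
    """Illustrative stand-in for BaseSource; signatures are assumptions."""

    @abstractmethod
    def authenticate(self) -> bool: ...

    @abstractmethod
    def list_videos(self) -> list[str]: ...

    @abstractmethod
    def download(self, name: str, destination: Path) -> Path: ...


class InMemorySource(BaseSourceSketch):
    """Hypothetical connector that serves files from a dict."""

    TOKEN_VAR = "EXAMPLE_SOURCE_TOKEN"  # hypothetical environment variable

    def __init__(self, files: dict[str, bytes]) -> None:
        self._files = files
        self._authenticated = False

    def authenticate(self) -> bool:
        # Returns False instead of raising when the credential is missing.
        self._authenticated = bool(os.environ.get(self.TOKEN_VAR))
        return self._authenticated

    def list_videos(self) -> list[str]:
        if not self._authenticated:
            raise RuntimeError("call authenticate() first")
        return sorted(self._files)

    def download(self, name: str, destination: Path) -> Path:
        destination.mkdir(parents=True, exist_ok=True)
        target = destination / name
        target.write_bytes(self._files[name])
        return target


source = InMemorySource({"standup.mp4": b"fake video bytes"})
os.environ.pop(InMemorySource.TOKEN_VAR, None)
auth_without_token = source.authenticate()
os.environ[InMemorySource.TOKEN_VAR] = "dummy-token"
auth_with_token = source.authenticate()

with tempfile.TemporaryDirectory() as tmp:
    saved = source.download(source.list_videos()[0], Path(tmp))
    saved_name, saved_size = saved.name, saved.stat().st_size

print(auth_without_token, auth_with_token, saved_name, saved_size)
```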
```mermaid
flowchart TD
    A[Source command] --> B[Authenticate with provider]
    B --> C{Auth success?}
    C -->|No| D[Error: check credentials]
    C -->|Yes| E[List files in folder]
    E --> F[Filter by pattern / type]
    F --> G[Download to local path]
    G --> H{Analyze or ingest?}
    H -->|Video| I[process_single_video / batch]
    H -->|Document| J[ingest_file / ingest_directory]
    I --> K[Knowledge graph]
    J --> K
```

### Available sources

PlanOpticon includes connectors for:

| Category | Sources |
|----------|---------|
| Cloud storage | Google Drive, S3, Dropbox |
| Meeting recordings | Zoom, Google Meet, Microsoft Teams |
| Productivity suites | Google Workspace (Docs/Sheets/Slides), Microsoft 365 (SharePoint/OneDrive/OneNote) |
| Note-taking apps | Obsidian, Logseq, Apple Notes, Google Keep, Notion |
| Web sources | YouTube, Web (URL), RSS, Podcasts |
| Developer platforms | GitHub, arXiv |
| Social media | Reddit, Twitter/X, Hacker News |

Each source authenticates via environment variables (API keys, OAuth tokens) specific to the provider.

---

## Planning agent pipeline

The planning agent consumes a knowledge graph and uses registered skills to generate planning artifacts.

```mermaid
flowchart TD
    A[Knowledge graph] --> B[Load into AgentContext]
    B --> C[GraphQueryEngine]
    C --> D[Taxonomy classification]
    D --> E[Agent orchestrator]
    E --> F{Select skill}
    F --> G[ProjectPlan skill]
    F --> H[PRD skill]
    F --> I[Roadmap skill]
    F --> J[TaskBreakdown skill]
    F --> K[DocGenerator skill]
    F --> L[WikiGenerator skill]
    F --> M[NotesExport skill]
    F --> N[ArtifactExport skill]
    F --> O[GitHubIntegration skill]
    F --> P[RequirementsChat skill]
    G --> Q[Artifact output]
    H --> Q
    I --> Q
    J --> Q
    K --> Q
    L --> Q
    M --> Q
    N --> Q
    O --> Q
    P --> Q
    Q --> R[Write to disk / push to service]
```

### Skill execution flow

1. The `AgentContext` is populated with the knowledge graph, query engine, provider manager, and any planning entities from taxonomy classification
2. Each `Skill` checks `can_execute()` against the context (requires at minimum a knowledge graph and provider manager)
3. The skill's `execute()` method generates an `Artifact` with a name, content, type, and format
4. Artifacts are collected and can be exported to disk or pushed to external services (GitHub issues, wiki pages, etc.)

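The flow above can be sketched with plain dataclasses. The class and method names mirror the ones mentioned in the steps, but every field, signature, and the `EntityListSkill` example are assumptions. A real skill would call the provider to generate content; this sketch only formats entity names so that it runs without an API key.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Artifact:
    name: str
    content: str
    type: str
    format: str


@dataclass
class AgentContext:
    """Simplified context; the real one also holds a query engine and more."""
    knowledge_graph: Optional[dict] = None
    provider_manager: Optional[object] = None


class EntityListSkill:
    """Hypothetical skill that lists entity names as a Markdown document."""

    name = "entity_list"

    def can_execute(self, ctx: AgentContext) -> bool:
        # Minimum requirements named in step 2 of the flow.
        return ctx.knowledge_graph is not None and ctx.provider_manager is not None

    def execute(self, ctx: AgentContext) -> Artifact:
        bullets = [f"- {name}" for name in sorted(ctx.knowledge_graph["entities"])]
        return Artifact(
            name="entity-list",
            content="\n".join(["# Entities", *bullets]),
            type="document",
            format="markdown",
        )


def run_skills(skills: list, ctx: AgentContext) -> list[Artifact]:
    """Execute every skill whose preconditions hold and collect the artifacts."""
    return [skill.execute(ctx) for skill in skills if skill.can_execute(ctx)]


kg = {"entities": {"Roadmap": {}, "API gateway": {}}}
no_provider = run_skills([EntityListSkill()], AgentContext(knowledge_graph=kg))
artifacts = run_skills(
    [EntityListSkill()],
    AgentContext(knowledge_graph=kg, provider_manager=object()),
)
print(len(no_provider), artifacts[0].content)
```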
---

## Export pipeline

The export pipeline converts knowledge graphs and analysis artifacts into various output formats.

```mermaid
flowchart TD
    A[knowledge_graph.db] --> B{Export command}
    B --> C[export markdown]
    B --> D[export obsidian]
    B --> E[export notion]
    B --> F[export exchange]
    B --> G[wiki generate]
    B --> H[kg convert]
    C --> I[7 document types + entity briefs + CSV]
    D --> J[Obsidian vault with frontmatter + wiki-links]
    E --> K[Notion-compatible markdown + CSV database]
    F --> L[PlanOpticonExchange JSON payload]
    G --> M[GitHub wiki pages + sidebar + home]
    H --> N[Convert between .db / .json / .graphml / .csv]
```

All export commands accept a `knowledge_graph.db` (or `.json`) path as input. No API key is required for template-based exports (markdown, obsidian, notion, wiki, exchange, convert). Only the planning agent skills that generate new content require a provider.

---

## How pipelines connect

```mermaid
flowchart LR
    V[Video files] --> VP[Video Pipeline]
    D[Documents] --> DI[Document Ingestion]
    S[Cloud Sources] --> SC[Source Connectors]
    SC --> V
    SC --> D
    VP --> KG[(knowledge_graph.db)]
    DI --> KG
    KG --> QE[Query Engine]
    KG --> EP[Export Pipeline]
    KG --> PA[Planning Agent]
    PA --> AR[Artifacts]
    AR --> EP
```

All pipelines converge on the knowledge graph as the central data store. The knowledge graph is the shared interface between ingestion (video or document), querying, exporting, and planning.

---

## Error handling

Error handling follows consistent patterns across all pipelines:

| Scenario | Behavior |
|----------|----------|
| Video fails in batch | Batch continues. Failed video recorded in manifest with error details. |
| Diagram analysis fails | Falls back to screengrab (captioned screenshot). |
| LLM extraction fails | Returns empty results gracefully. Key points and action items will be empty arrays. |
| Document processor not found | Raises `ValueError` with list of supported extensions. |
| Source authentication fails | Returns `False` from `authenticate()`. CLI prints error message. |
| Checkpoint file found | Step is skipped entirely and results are loaded from disk. |
| Progress callback fails | Warning logged. Pipeline continues without progress updates. |

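The "returns empty results gracefully" row can be pictured as a wrapper that turns any provider or parsing failure into an empty list. This is a sketch under assumptions: `extract_items`, the `call_llm` callable, and the JSON-array response format are hypothetical and may not match how PlanOpticon talks to its providers.

```python
import json
import logging
from typing import Callable

logger = logging.getLogger("extraction_sketch")


def extract_items(call_llm: Callable[[str], str], prompt: str) -> list:
    """Return the parsed JSON array from the model, or [] on any failure."""
    try:
        parsed = json.loads(call_llm(prompt))
    except Exception as exc:  # provider error, timeout, or invalid JSON
        logger.warning("LLM extraction failed: %s", exc)
        return []
    # Anything other than a JSON array is treated as "no results".
    return parsed if isinstance(parsed, list) else []


def failing_provider(prompt: str) -> str:
    raise TimeoutError("provider timed out")


key_points = extract_items(lambda p: '["Ship v1 in June"]', "List key points")
action_items = extract_items(failing_provider, "List action items")
malformed = extract_items(lambda p: "not json", "List key points")
print(key_points, action_items, malformed)
```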
---

## Progress callback system

The pipeline supports a `ProgressCallback` protocol for real-time progress tracking. This is used by the CLI's progress bars and can be implemented by external integrations (web UIs, CI systems, etc.).

```python
from video_processor.models import ProgressCallback


class MyCallback:
    """Prints progress to stdout; implements the ProgressCallback protocol."""

    def on_step_start(self, step: str, index: int, total: int) -> None:
        print(f"Starting step {index}/{total}: {step}")

    def on_step_complete(self, step: str, index: int, total: int) -> None:
        print(f"Completed step {index}/{total}: {step}")

    def on_progress(self, step: str, percent: float, message: str = "") -> None:
        # The .0% format multiplies by 100, so percent is expected as a 0-1 fraction.
        print(f"  {step}: {percent:.0%} {message}")
```

Pass the callback to `process_single_video()`:

```python
from video_processor.pipeline import process_single_video

manifest = process_single_video(
    input_path="recording.mp4",
    output_dir="./output",
    progress_callback=MyCallback(),
)
```

The callback methods are called within a try/except wrapper, so a failing callback never interrupts the pipeline. If a callback method raises an exception, a warning is logged and processing continues.

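
A wrapper of that kind might look like the sketch below. The `notify` helper, the logger name, and `BrokenCallback` are hypothetical; only the callback method names and the log-and-continue behavior come from the description above.

```python
import logging

logger = logging.getLogger("pipeline_sketch")


def notify(callback, method: str, *args, **kwargs) -> None:
    """Call `callback.<method>(...)`, logging any exception instead of raising."""
    handler = getattr(callback, method, None)
    if handler is None:  # no callback given, or method not implemented
        return
    try:
        handler(*args, **kwargs)
    except Exception as exc:
        logger.warning("Progress callback %s failed: %s", method, exc)


class BrokenCallback:
    """Callback that always fails, used to show the pipeline keeps going."""

    def on_step_start(self, step: str, index: int, total: int) -> None:
        raise RuntimeError("dashboard unreachable")


steps = ["Extract frames", "Filter people frames"]
completed = []
for index, step in enumerate(steps, start=1):
    notify(BrokenCallback(), "on_step_start", step, index, len(steps))
    completed.append(step)  # still reached despite the failing callback

print(completed)
```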
