# Processing Pipeline

PlanOpticon has four main pipelines: **video analysis**, **document ingestion**, **source connector**, and **export**. Each pipeline can operate independently, and they connect through the shared knowledge graph.

---

## Single video pipeline

The core video analysis pipeline processes a single video file through eight sequential steps with checkpoint/resume support.

```mermaid
sequenceDiagram
    participant CLI
    participant Pipeline
    participant FrameExtractor
    participant AudioExtractor
    participant Provider
    participant DiagramAnalyzer
    participant KnowledgeGraph
    participant Exporter

    CLI->>Pipeline: process_single_video()

    Note over Pipeline: Step 1: Extract frames
    Pipeline->>FrameExtractor: extract_frames()
    Note over FrameExtractor: Change detection + periodic capture (every 30s)
    FrameExtractor-->>Pipeline: frame_paths[]

    Note over Pipeline: Step 2: Filter people frames
    Pipeline->>Pipeline: filter_people_frames()
    Note over Pipeline: OpenCV face detection removes webcam/people frames

    Note over Pipeline: Step 3: Extract + transcribe audio
    Pipeline->>AudioExtractor: extract_audio()
    Pipeline->>Provider: transcribe_audio()
    Note over Provider: Supports speaker hints via --speakers flag

    Note over Pipeline: Step 4: Analyze visuals
    Pipeline->>DiagramAnalyzer: process_frames()
    loop Each frame (up to 10 standard / 20 comprehensive)
        DiagramAnalyzer->>Provider: classify (vision)
        alt High confidence diagram
            DiagramAnalyzer->>Provider: full analysis
            Note over Provider: Extract description, text, mermaid, chart data
        else Medium confidence
            DiagramAnalyzer-->>Pipeline: screengrab fallback
        end
    end

    Note over Pipeline: Step 5: Build knowledge graph
    Pipeline->>KnowledgeGraph: register_source()
    Pipeline->>KnowledgeGraph: process_transcript()
    Pipeline->>KnowledgeGraph: process_diagrams()
    Note over KnowledgeGraph: Writes knowledge_graph.db (SQLite) + .json

    Note over Pipeline: Step 6: Extract key points + action items
    Pipeline->>Provider: extract key points
    Pipeline->>Provider: extract action items

    Note over Pipeline: Step 7: Generate report
    Pipeline->>Pipeline: generate markdown report
    Note over Pipeline: Includes mermaid diagrams, tables, cross-references

    Note over Pipeline: Step 8: Export formats
    Pipeline->>Exporter: export_all_formats()
    Note over Exporter: HTML report, PDF, SVG/PNG renderings, chart reproductions

    Pipeline-->>CLI: VideoManifest
```

### Pipeline steps in detail

| Step | Name | Checkpointable | Description |
|------|------|----------------|-------------|
| 1 | Extract frames | Yes | Change detection + periodic capture. Skipped if `frames/frame_*.jpg` exist on disk. |
| 2 | Filter people frames | No | Inline with step 1. OpenCV face detection removes webcam frames. |
| 3 | Extract + transcribe audio | Yes | Skipped if `transcript/transcript.json` exists. Speaker hints passed if `--speakers` provided. |
| 4 | Analyze visuals | Yes | Skipped if `diagrams/` is populated. Evenly samples frames (not just first N). |
| 5 | Build knowledge graph | Yes | Skipped if `results/knowledge_graph.db` exists. Registers source, processes transcript and diagrams. |
| 6 | Extract key points + actions | Yes | Skipped if `results/key_points.json` and `results/action_items.json` exist. |
| 7 | Generate report | Yes | Skipped if `results/analysis.md` exists. |
| 8 | Export formats | No | Always runs. Renders mermaid to SVG/PNG, reproduces charts, generates HTML/PDF. |
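
The checkpoint rules in the table can be pictured as a file-existence check per step. The sketch below is illustrative only: the `CHECKPOINT_FILES` mapping, the step keys, and the `is_checkpointed()` helper are hypothetical names, not PlanOpticon's actual API. Only the marker paths come from the table.

```python
import tempfile
from pathlib import Path

# Hypothetical mapping: step key -> glob patterns that must all match
# for the step to count as already done. Paths are taken from the table;
# the step keys themselves are made up for this sketch.
CHECKPOINT_FILES = {
    "extract_frames": ["frames/frame_*.jpg"],
    "transcribe_audio": ["transcript/transcript.json"],
    "analyze_visuals": ["diagrams/*"],
    "build_knowledge_graph": ["results/knowledge_graph.db"],
    "extract_key_points": ["results/key_points.json", "results/action_items.json"],
    "generate_report": ["results/analysis.md"],
}


def is_checkpointed(output_dir: Path, step: str) -> bool:
    """Return True when every marker pattern for the step matches a file."""
    patterns = CHECKPOINT_FILES.get(step)
    if not patterns:
        return False  # steps without markers (2 and 8) always run
    return all(any(output_dir.glob(pattern)) for pattern in patterns)


with tempfile.TemporaryDirectory() as tmp:
    out = Path(tmp)
    (out / "results").mkdir()
    (out / "results" / "key_points.json").write_text("[]")
    # Only one of the two required files exists, so step 6 would still run.
    partial = is_checkpointed(out, "extract_key_points")
    (out / "results" / "action_items.json").write_text("[]")
    # Both files exist now, so step 6 would be skipped on resume.
    complete = is_checkpointed(out, "extract_key_points")

print(partial, complete)
```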

---

## Batch pipeline

The batch pipeline wraps the single-video pipeline and adds cross-video knowledge graph merging.

```mermaid
flowchart TD
    A[Scan input directory] --> B[Match video files by pattern]
    B --> C{For each video}
    C --> D[process_single_video]
    D --> E{Success?}
    E -->|Yes| F[Collect manifest + KG]
    E -->|No| G[Log error, continue]
    F --> H[Next video]
    G --> H
    H --> C
    C -->|All done| I[Merge knowledge graphs]
    I --> J[Fuzzy matching + conflict resolution]
    J --> K[Generate batch summary]
    K --> L[Write batch manifest]
    L --> M[batch_manifest.json + batch_summary.md + merged KG]
```
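
The "log error, continue" branch of the flowchart amounts to wrapping each per-video call in a `try`/`except`. A minimal sketch, assuming a hypothetical `run_batch()` helper and made-up result fields (`completed`, `failed`); the real batch manifest layout is not shown in this document:

```python
import logging
from typing import Callable, Dict, Iterable, List

logger = logging.getLogger("batch_sketch")


def run_batch(videos: Iterable[str], process: Callable[[str], dict]) -> Dict[str, List[dict]]:
    """Process each video, recording failures instead of aborting the batch."""
    completed: List[dict] = []
    failed: List[dict] = []
    for video in videos:
        try:
            completed.append(process(video))
        except Exception as exc:  # one bad video must not stop the rest
            logger.error("Failed to process %s: %s", video, exc)
            failed.append({"video": video, "error": str(exc)})
    return {"completed": completed, "failed": failed}


def fake_process(path: str) -> dict:
    """Stand-in for process_single_video(); fails on one specific input."""
    if path.endswith("broken.mp4"):
        raise RuntimeError("cannot decode")
    return {"video": path, "status": "completed"}


summary = run_batch(["a.mp4", "broken.mp4", "b.mp4"], fake_process)
print(len(summary["completed"]), len(summary["failed"]))
```

Catching the broad `Exception` is deliberate here: the loop cannot know in advance which stage of a single video will fail, and the error details are kept for the manifest.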

### Knowledge graph merge strategy

During batch merging, `KnowledgeGraph.merge()` applies:

1. **Case-insensitive exact matching** for entity names
2. **Fuzzy matching** via `SequenceMatcher` (threshold >= 0.85) for near-duplicates
3. **Type conflict resolution** using a specificity ranking (e.g., `technology` > `concept`)
4. **Description union** across all sources
5. **Relationship deduplication** by (source, target, type) tuple
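
The matching and deduplication rules above can be sketched with the standard library's `difflib.SequenceMatcher`. This is not the body of `KnowledgeGraph.merge()`: the helper names, the dictionary-based relationship shape, the choice to lowercase names before fuzzy comparison, and the two-entry `TYPE_SPECIFICITY` ranking are all assumptions made for illustration.

```python
from difflib import SequenceMatcher
from typing import Dict, List

FUZZY_THRESHOLD = 0.85  # threshold stated in the list above

# Assumed ranking; only `technology` > `concept` is given in the text.
TYPE_SPECIFICITY = {"concept": 0, "technology": 1}


def names_match(a: str, b: str, threshold: float = FUZZY_THRESHOLD) -> bool:
    """Case-insensitive exact match first, then a fuzzy ratio check."""
    a_norm, b_norm = a.strip().lower(), b.strip().lower()
    if a_norm == b_norm:
        return True
    return SequenceMatcher(None, a_norm, b_norm).ratio() >= threshold


def resolve_type(a: str, b: str) -> str:
    """Keep the more specific of two conflicting entity types."""
    return a if TYPE_SPECIFICITY.get(a, 0) >= TYPE_SPECIFICITY.get(b, 0) else b


def dedupe_relationships(relationships: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Drop relationships that repeat a (source, target, type) tuple."""
    seen = set()
    unique = []
    for rel in relationships:
        key = (rel["source"], rel["target"], rel["type"])
        if key not in seen:
            seen.add(key)
            unique.append(rel)
    return unique


rels = [
    {"source": "API", "target": "PostgreSQL", "type": "uses"},
    {"source": "API", "target": "PostgreSQL", "type": "uses"},
    {"source": "API", "target": "Redis", "type": "uses"},
]
print(names_match("Kubernetes", "Kubernets"), len(dedupe_relationships(rels)))
```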

---

## Document ingestion pipeline

The document ingestion pipeline processes files (Markdown, plaintext, PDF) into knowledge graphs without video analysis.

```mermaid
flowchart TD
    A[Input: file or directory] --> B{File or directory?}
    B -->|File| C[get_processor by extension]
    B -->|Directory| D[Glob for supported extensions]
    D --> E{Recursive?}
    E -->|Yes| F[rglob all files]
    E -->|No| G[glob top-level only]
    F --> H[For each file]
    G --> H
    H --> C
    C --> I[DocumentProcessor.process]
    I --> J[DocumentChunk list]
    J --> K[Register source in KG]
    K --> L[Add chunks as content]
    L --> M[KG extracts entities + relationships]
    M --> N[knowledge_graph.db]
```

### Supported document types

| Extension | Processor | Notes |
|-----------|-----------|-------|
| `.md` | `MarkdownProcessor` | Splits by headings into sections |
| `.txt` | `PlaintextProcessor` | Splits into fixed-size chunks |
| `.pdf` | `PdfProcessor` | Requires `pymupdf` or `pdfplumber`. Falls back gracefully between libraries. |
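
Choosing a processor by extension is essentially a dictionary lookup. The sketch below uses empty stand-in classes with the names from the table and a hypothetical `get_processor()` signature; the actual function may take different arguments. It raises `ValueError` listing the supported extensions when nothing matches.

```python
from pathlib import Path


# Empty stand-ins so the sketch runs on its own; the real processors do the parsing.
class MarkdownProcessor:
    pass


class PlaintextProcessor:
    pass


class PdfProcessor:
    pass


# Extension-to-processor registry, mirroring the table above.
PROCESSORS = {
    ".md": MarkdownProcessor,
    ".txt": PlaintextProcessor,
    ".pdf": PdfProcessor,
}


def get_processor(path: str):
    """Return a processor instance for the file's extension (assumed signature)."""
    ext = Path(path).suffix.lower()
    processor_cls = PROCESSORS.get(ext)
    if processor_cls is None:
        supported = ", ".join(sorted(PROCESSORS))
        raise ValueError(f"No processor for '{ext}'. Supported extensions: {supported}")
    return processor_cls()


print(type(get_processor("notes/SPEC.MD")).__name__)
```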

### Adding documents to an existing graph

The `--db-path` flag lets you ingest documents into an existing knowledge graph:

```bash
planopticon ingest spec.md --db-path existing.db
planopticon ingest ./docs/ -o ./output --recursive
```

---

## Source connector pipeline

Source connectors fetch content from cloud services, note-taking apps, and web sources. Each source implements the `BaseSource` ABC with three methods: `authenticate()`, `list_videos()`, and `download()`.

```mermaid
flowchart TD
    A[Source command] --> B[Authenticate with provider]
    B --> C{Auth success?}
    C -->|No| D[Error: check credentials]
    C -->|Yes| E[List files in folder]
    E --> F[Filter by pattern / type]
    F --> G[Download to local path]
    G --> H{Analyze or ingest?}
    H -->|Video| I[process_single_video / batch]
    H -->|Document| J[ingest_file / ingest_directory]
    I --> K[Knowledge graph]
    J --> K
```
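
To make the three-method contract concrete, here is a minimal sketch of what such an ABC and one connector could look like. The class below is a stand-in that reuses the `BaseSource` name for readability; the method signatures, return types, and the `LocalFolderSource` connector are assumptions, since this document only names the three methods.

```python
import shutil
import tempfile
from abc import ABC, abstractmethod
from pathlib import Path
from typing import List


class BaseSource(ABC):
    """Stand-in for the real ABC; signatures are assumed."""

    @abstractmethod
    def authenticate(self) -> bool:
        """Return True on success, False on failure."""

    @abstractmethod
    def list_videos(self) -> List[str]:
        """Return identifiers of the items available for download."""

    @abstractmethod
    def download(self, item_id: str, destination: Path) -> Path:
        """Fetch one item and return its local path."""


class LocalFolderSource(BaseSource):
    """Hypothetical connector that treats a local folder as the remote."""

    def __init__(self, folder: Path, pattern: str = "*.mp4") -> None:
        self.folder = folder
        self.pattern = pattern

    def authenticate(self) -> bool:
        return self.folder.is_dir()

    def list_videos(self) -> List[str]:
        return sorted(p.name for p in self.folder.glob(self.pattern))

    def download(self, item_id: str, destination: Path) -> Path:
        destination.mkdir(parents=True, exist_ok=True)
        target = destination / item_id
        shutil.copyfile(self.folder / item_id, target)
        return target


with tempfile.TemporaryDirectory() as tmp:
    remote = Path(tmp) / "remote"
    remote.mkdir()
    (remote / "standup.mp4").write_bytes(b"fake video bytes")
    (remote / "notes.txt").write_text("not a video")

    source = LocalFolderSource(remote)
    authenticated = source.authenticate()
    listed = source.list_videos()
    local_copy = source.download(listed[0], Path(tmp) / "downloads")
    copy_exists = local_copy.exists()
    missing_ok = LocalFolderSource(Path(tmp) / "missing").authenticate()

print(authenticated, listed, copy_exists, missing_ok)
```

Returning `False` from `authenticate()` rather than raising keeps the CLI in charge of how the failure is reported.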

### Available sources

PlanOpticon includes connectors for:

| Category | Sources |
|----------|---------|
| Cloud storage | Google Drive, S3, Dropbox |
| Meeting recordings | Zoom, Google Meet, Microsoft Teams |
| Productivity suites | Google Workspace (Docs/Sheets/Slides), Microsoft 365 (SharePoint/OneDrive/OneNote) |
| Note-taking apps | Obsidian, Logseq, Apple Notes, Google Keep, Notion |
| Web sources | YouTube, Web (URL), RSS, Podcasts |
| Developer platforms | GitHub, arXiv |
| Social media | Reddit, Twitter/X, Hacker News |

Each source authenticates via environment variables (API keys, OAuth tokens) specific to the provider.

---

## Planning agent pipeline

The planning agent consumes a knowledge graph and uses registered skills to generate planning artifacts.

```mermaid
flowchart TD
    A[Knowledge graph] --> B[Load into AgentContext]
    B --> C[GraphQueryEngine]
    C --> D[Taxonomy classification]
    D --> E[Agent orchestrator]
    E --> F{Select skill}
    F --> G[ProjectPlan skill]
    F --> H[PRD skill]
    F --> I[Roadmap skill]
    F --> J[TaskBreakdown skill]
    F --> K[DocGenerator skill]
    F --> L[WikiGenerator skill]
    F --> M[NotesExport skill]
    F --> N[ArtifactExport skill]
    F --> O[GitHubIntegration skill]
    F --> P[RequirementsChat skill]
    G --> Q[Artifact output]
    H --> Q
    I --> Q
    J --> Q
    K --> Q
    L --> Q
    M --> Q
    N --> Q
    O --> Q
    P --> Q
    Q --> R[Write to disk / push to service]
```

### Skill execution flow

1. The `AgentContext` is populated with the knowledge graph, query engine, provider manager, and any planning entities from taxonomy classification.
2. Each `Skill` checks `can_execute()` against the context (at minimum, a knowledge graph and a provider manager are required).
3. The skill's `execute()` method generates an `Artifact` with a name, content, type, and format.
4. Artifacts are collected and can be exported to disk or pushed to external services (GitHub issues, wiki pages, etc.).
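
The four steps above can be sketched as a small set of classes. The field names on `AgentContext` and `Artifact`, the `EntityListSkill` example, and the `run_skills()` helper are hypothetical; only the class names `AgentContext`, `Skill`, and `Artifact` and the methods `can_execute()` and `execute()` come from the text.

```python
from dataclasses import dataclass
from typing import Any, List, Optional


@dataclass
class Artifact:
    name: str
    content: str
    artifact_type: str  # assumed field name for the artifact "type"
    format: str


@dataclass
class AgentContext:
    knowledge_graph: Optional[Any] = None
    provider_manager: Optional[Any] = None


class Skill:
    """Base behaviour: run only when a graph and a provider are present."""

    def can_execute(self, context: AgentContext) -> bool:
        return context.knowledge_graph is not None and context.provider_manager is not None

    def execute(self, context: AgentContext) -> Artifact:
        raise NotImplementedError


class EntityListSkill(Skill):
    """Hypothetical skill that lists entity names as a Markdown artifact."""

    def execute(self, context: AgentContext) -> Artifact:
        lines = [f"- {name}" for name in context.knowledge_graph["entities"]]
        return Artifact("entity-list", "\n".join(lines), "document", "markdown")


def run_skills(skills: List[Skill], context: AgentContext) -> List[Artifact]:
    """Collect artifacts from every skill whose preconditions are met."""
    return [skill.execute(context) for skill in skills if skill.can_execute(context)]


ready = AgentContext(knowledge_graph={"entities": ["API", "Redis"]}, provider_manager=object())
not_ready = AgentContext(knowledge_graph={"entities": ["API"]})

artifacts = run_skills([EntityListSkill()], ready)
skipped = run_skills([EntityListSkill()], not_ready)
print(artifacts[0].content, len(skipped))
```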

---

## Export pipeline

The export pipeline converts knowledge graphs and analysis artifacts into various output formats.

```mermaid
flowchart TD
    A[knowledge_graph.db] --> B{Export command}
    B --> C[export markdown]
    B --> D[export obsidian]
    B --> E[export notion]
    B --> F[export exchange]
    B --> G[wiki generate]
    B --> H[kg convert]
    C --> I[7 document types + entity briefs + CSV]
    D --> J[Obsidian vault with frontmatter + wiki-links]
    E --> K[Notion-compatible markdown + CSV database]
    F --> L[PlanOpticonExchange JSON payload]
    G --> M[GitHub wiki pages + sidebar + home]
    H --> N[Convert between .db / .json / .graphml / .csv]
```

All export commands accept a `knowledge_graph.db` (or `.json`) path as input. No API key is required for template-based exports (markdown, obsidian, notion, wiki, exchange, convert). Only the planning agent skills that generate new content require a provider.

---

## How pipelines connect

```mermaid
flowchart LR
    V[Video files] --> VP[Video Pipeline]
    D[Documents] --> DI[Document Ingestion]
    S[Cloud Sources] --> SC[Source Connectors]
    SC --> V
    SC --> D
    VP --> KG[(knowledge_graph.db)]
    DI --> KG
    KG --> QE[Query Engine]
    KG --> EP[Export Pipeline]
    KG --> PA[Planning Agent]
    PA --> AR[Artifacts]
    AR --> EP
```

All pipelines converge on the knowledge graph as the central data store. The knowledge graph is the shared interface between ingestion (video or document), querying, exporting, and planning.

---

## Error handling

Error handling follows consistent patterns across all pipelines:

| Scenario | Behavior |
|----------|----------|
| Video fails in batch | Batch continues. Failed video recorded in manifest with error details. |
| Diagram analysis fails | Falls back to screengrab (captioned screenshot). |
| LLM extraction fails | Returns empty results gracefully. Key points and action items will be empty arrays. |
| Document processor not found | Raises `ValueError` with list of supported extensions. |
| Source authentication fails | Returns `False` from `authenticate()`. CLI prints error message. |
| Checkpoint file found | Step is skipped entirely and results are loaded from disk. |
| Progress callback fails | Warning logged. Pipeline continues without progress updates. |

---

## Progress callback system

The pipeline supports a `ProgressCallback` protocol for real-time progress tracking. The CLI's progress bars use it, and external integrations (web UIs, CI systems, etc.) can implement it.

```python
from video_processor.models import ProgressCallback  # protocol that MyCallback conforms to


class MyCallback:
    """Prints pipeline progress to stdout."""

    def on_step_start(self, step: str, index: int, total: int) -> None:
        print(f"Starting step {index}/{total}: {step}")

    def on_step_complete(self, step: str, index: int, total: int) -> None:
        print(f"Completed step {index}/{total}: {step}")

    def on_progress(self, step: str, percent: float, message: str = "") -> None:
        # The .0% format multiplies by 100, so this treats percent as a fraction (0.5 -> 50%).
        print(f"  {step}: {percent:.0%} {message}")
```

Pass the callback to `process_single_video()`:

```python
from video_processor.pipeline import process_single_video

manifest = process_single_video(
    input_path="recording.mp4",
    output_dir="./output",
    progress_callback=MyCallback(),
)
```

Callback methods are invoked inside a try/except wrapper, so a failing callback never interrupts the pipeline. If a callback method raises an exception, a warning is logged and processing continues.
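
A wrapper of that kind might look like the sketch below. The `notify()` helper, the `run_steps()` driver, and the logger name are hypothetical and stand in for whatever the pipeline does internally; the behaviour shown (log a warning, keep going) is the one described above.

```python
import logging
from typing import Any, List, Optional

logger = logging.getLogger("progress_sketch")


def notify(callback: Optional[Any], method: str, *args: Any, **kwargs: Any) -> None:
    """Call callback.<method>(...) if present; log and swallow any exception."""
    if callback is None:
        return
    handler = getattr(callback, method, None)
    if handler is None:
        return
    try:
        handler(*args, **kwargs)
    except Exception as exc:
        logger.warning("Progress callback %s failed: %s", method, exc)


def run_steps(steps: List[str], callback: Optional[Any] = None) -> List[str]:
    """Toy pipeline driver: every step completes even if the callback raises."""
    completed: List[str] = []
    for index, step in enumerate(steps, start=1):
        notify(callback, "on_step_start", step, index, len(steps))
        completed.append(step)  # the real work for the step would happen here
        notify(callback, "on_step_complete", step, index, len(steps))
    return completed


class BrokenCallback:
    def on_step_start(self, step: str, index: int, total: int) -> None:
        raise RuntimeError("UI connection lost")


finished = run_steps(["extract_frames", "transcribe_audio"], BrokenCallback())
print(finished)
```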