# Processing Pipeline

PlanOpticon has four main pipelines: **video analysis**, **document ingestion**, **source connector**, and **export**. Each pipeline can operate independently, and they connect through the shared knowledge graph.

---

## Single video pipeline

The core video analysis pipeline processes a single video file through eight sequential steps with checkpoint/resume support.

```mermaid
sequenceDiagram
    participant CLI
    participant Pipeline
    participant FrameExtractor
    participant AudioExtractor
    participant Provider
    participant DiagramAnalyzer
    participant KnowledgeGraph
    participant Exporter

    CLI->>Pipeline: process_single_video()

    Note over Pipeline: Step 1: Extract frames
    Pipeline->>FrameExtractor: extract_frames()
    Note over FrameExtractor: Change detection + periodic capture (every 30s)
    FrameExtractor-->>Pipeline: frame_paths[]

    Note over Pipeline: Step 2: Filter people frames
    Pipeline->>Pipeline: filter_people_frames()
    Note over Pipeline: OpenCV face detection removes webcam/people frames

    Note over Pipeline: Step 3: Extract + transcribe audio
    Pipeline->>AudioExtractor: extract_audio()
    Pipeline->>Provider: transcribe_audio()
    Note over Provider: Supports speaker hints via --speakers flag

    Note over Pipeline: Step 4: Analyze visuals
    Pipeline->>DiagramAnalyzer: process_frames()
    loop Each frame (up to 10 standard / 20 comprehensive)
        DiagramAnalyzer->>Provider: classify (vision)
        alt High confidence diagram
            DiagramAnalyzer->>Provider: full analysis
            Note over Provider: Extract description, text, mermaid, chart data
        else Medium confidence
            DiagramAnalyzer-->>Pipeline: screengrab fallback
        end
    end

    Note over Pipeline: Step 5: Build knowledge graph
    Pipeline->>KnowledgeGraph: register_source()
    Pipeline->>KnowledgeGraph: process_transcript()
    Pipeline->>KnowledgeGraph: process_diagrams()
    Note over KnowledgeGraph: Writes knowledge_graph.db (SQLite) + .json

    Note over Pipeline: Step 6: Extract key points + action items
    Pipeline->>Provider: extract key points
    Pipeline->>Provider: extract action items

    Note over Pipeline: Step 7: Generate report
    Pipeline->>Pipeline: generate markdown report
    Note over Pipeline: Includes mermaid diagrams, tables, cross-references

    Note over Pipeline: Step 8: Export formats
    Pipeline->>Exporter: export_all_formats()
    Note over Exporter: HTML report, PDF, SVG/PNG renderings, chart reproductions

    Pipeline-->>CLI: VideoManifest
```
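
The Step 1 sampling strategy (change detection plus a periodic capture every 30 seconds) can be sketched as a pure decision rule. This is an illustrative sketch, not the actual API: `should_capture`, `select_frames`, and the `0.15` change threshold are hypothetical names and values; only the 30-second interval comes from the diagram above.

```python
# Illustrative sketch of the Step 1 sampling rule: keep a frame when the
# scene changes noticeably, or when 30 s have passed since the last capture.
# `should_capture`, `select_frames`, and the threshold are hypothetical.

PERIODIC_INTERVAL_S = 30.0

def should_capture(change_score: float, t: float, last_capture_t: float,
                   change_threshold: float = 0.15) -> bool:
    """Return True if the frame at time t should be kept."""
    if change_score >= change_threshold:   # scene changed (e.g. slide flip)
        return True
    return t - last_capture_t >= PERIODIC_INTERVAL_S  # periodic fallback

def select_frames(scores: list[tuple[float, float]]) -> list[float]:
    """scores: (timestamp, change_score) pairs; returns kept timestamps."""
    kept: list[float] = []
    last = float("-inf")
    for t, score in scores:
        if should_capture(score, t, last):
            kept.append(t)
            last = t
    return kept
```

Combining both triggers this way guarantees coverage of long static stretches (periodic capture) without missing fast slide transitions (change detection).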

### Pipeline steps in detail

| Step | Name | Checkpointable | Description |
|------|------|----------------|-------------|
| 1 | Extract frames | Yes | Change detection + periodic capture. Skipped if `frames/frame_*.jpg` exist on disk. |
| 2 | Filter people frames | No | Inline with step 1. OpenCV face detection removes webcam frames. |
| 3 | Extract + transcribe audio | Yes | Skipped if `transcript/transcript.json` exists. Speaker hints passed if `--speakers` provided. |
| 4 | Analyze visuals | Yes | Skipped if `diagrams/` is populated. Evenly samples frames (not just first N). |
| 5 | Build knowledge graph | Yes | Skipped if `results/knowledge_graph.db` exists. Registers source, processes transcript and diagrams. |
| 6 | Extract key points + actions | Yes | Skipped if `results/key_points.json` and `results/action_items.json` exist. |
| 7 | Generate report | Yes | Skipped if `results/analysis.md` exists. |
| 8 | Export formats | No | Always runs. Renders mermaid to SVG/PNG, reproduces charts, generates HTML/PDF. |
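
The checkpoint behaviour in the table amounts to a per-step file-existence probe under the output directory. A minimal sketch, assuming the paths from the table; `step_outputs` and `is_step_done` are illustrative names, not the pipeline's real internals:

```python
# Sketch of the checkpoint/resume rule: a checkpointable step is skipped
# when its output file(s) already exist under output_dir. Names here are
# illustrative; the checkpoint paths come from the table above.
from pathlib import Path

# Checkpointable steps and the on-disk outputs that mark them complete.
# (Step 1 instead checks a frames/frame_*.jpg glob, omitted for brevity.)
step_outputs: dict[str, list[str]] = {
    "transcribe": ["transcript/transcript.json"],
    "knowledge_graph": ["results/knowledge_graph.db"],
    "key_points": ["results/key_points.json", "results/action_items.json"],
    "report": ["results/analysis.md"],
}

def is_step_done(output_dir: Path, step: str) -> bool:
    """True if every checkpoint file for `step` already exists."""
    return all((output_dir / rel).exists() for rel in step_outputs[step])
```

Note that step 6 only counts as done when *both* JSON files exist, which is why `all()` rather than `any()` is the right combinator.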

---

## Batch pipeline

The batch pipeline wraps the single-video pipeline and adds cross-video knowledge graph merging.

```mermaid
flowchart TD
    A[Scan input directory] --> B[Match video files by pattern]
    B --> C{For each video}
    C --> D[process_single_video]
    D --> E{Success?}
    E -->|Yes| F[Collect manifest + KG]
    E -->|No| G[Log error, continue]
    F --> H[Next video]
    G --> H
    H --> C
    C -->|All done| I[Merge knowledge graphs]
    I --> J[Fuzzy matching + conflict resolution]
    J --> K[Generate batch summary]
    K --> L[Write batch manifest]
    L --> M[batch_manifest.json + batch_summary.md + merged KG]
```
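
The per-video error isolation in the flowchart ("Log error, continue") is the usual collect-failures loop. A sketch under assumed names — `run_batch` and the result shape are illustrative stand-ins for the real batch entry point:

```python
# Sketch of the batch loop: one video failing never aborts the run;
# failures are recorded alongside the successful manifests.
# `run_batch` and the result dict shape are illustrative, not the real API.
import logging
from typing import Any, Callable

logger = logging.getLogger("batch")

def run_batch(videos: list[str],
              process: Callable[[str], dict[str, Any]]) -> dict[str, Any]:
    manifests, failures = [], []
    for video in videos:
        try:
            manifests.append(process(video))   # per-video pipeline
        except Exception as exc:               # isolate the failure
            logger.error("Failed %s: %s", video, exc)
            failures.append({"video": video, "error": str(exc)})
    return {"manifests": manifests, "failures": failures}
```

The broad `except Exception` is deliberate here: in a long overnight batch, any single video's failure mode (codec, API timeout, disk) should become a manifest entry, not a crash.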

### Knowledge graph merge strategy

During batch merging, `KnowledgeGraph.merge()` applies:

1. **Case-insensitive exact matching** for entity names
2. **Fuzzy matching** via `SequenceMatcher` (threshold >= 0.85) for near-duplicates
3. **Type conflict resolution** using a specificity ranking (e.g., `technology` > `concept`)
4. **Description union** across all sources
5. **Relationship deduplication** by (source, target, type) tuple
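
Steps 1–2 can be sketched with the standard library's `difflib.SequenceMatcher`. Only the `SequenceMatcher` usage and the 0.85 threshold come from the list above; `find_match` is a hypothetical helper name:

```python
# Sketch of merge steps 1-2: exact (case-insensitive) match first, then
# fuzzy match at ratio >= 0.85. `find_match` is a hypothetical helper.
from difflib import SequenceMatcher
from typing import Optional

FUZZY_THRESHOLD = 0.85

def find_match(name: str, existing: list[str]) -> Optional[str]:
    """Return the existing entity name that `name` should merge into."""
    lowered = name.lower()
    for candidate in existing:                 # step 1: exact match
        if candidate.lower() == lowered:
            return candidate
    for candidate in existing:                 # step 2: fuzzy match
        ratio = SequenceMatcher(None, lowered, candidate.lower()).ratio()
        if ratio >= FUZZY_THRESHOLD:
            return candidate
    return None
```

Doing the exact pass before the fuzzy pass matters: an exact duplicate should never be claimed by a merely similar entity earlier in the list.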

---

## Document ingestion pipeline

The document ingestion pipeline processes files (Markdown, plaintext, PDF) into knowledge graphs without video analysis.

```mermaid
flowchart TD
    A[Input: file or directory] --> B{File or directory?}
    B -->|File| C[get_processor by extension]
    B -->|Directory| D[Glob for supported extensions]
    D --> E{Recursive?}
    E -->|Yes| F[rglob all files]
    E -->|No| G[glob top-level only]
    F --> H[For each file]
    G --> H
    H --> C
    C --> I[DocumentProcessor.process]
    I --> J[DocumentChunk list]
    J --> K[Register source in KG]
    K --> L[Add chunks as content]
    L --> M[KG extracts entities + relationships]
    M --> N[knowledge_graph.db]
```

### Supported document types

| Extension | Processor | Notes |
|-----------|-----------|-------|
| `.md` | `MarkdownProcessor` | Splits by headings into sections |
| `.txt` | `PlaintextProcessor` | Splits into fixed-size chunks |
| `.pdf` | `PdfProcessor` | Requires `pymupdf` or `pdfplumber`. Falls back gracefully between libraries. |
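
The `get_processor` dispatch from the flowchart presumably keys on the extension table above. A minimal sketch — the registry layout and error message are assumptions; the `ValueError`-with-supported-extensions behaviour matches this page's error-handling table:

```python
# Sketch of extension-based processor dispatch. The processor names come
# from the table above; the registry layout and message text are assumed.
import pathlib

PROCESSORS = {
    ".md": "MarkdownProcessor",
    ".txt": "PlaintextProcessor",
    ".pdf": "PdfProcessor",
}

def get_processor(filename: str) -> str:
    """Look up the processor for a file, or raise listing what is supported."""
    ext = pathlib.Path(filename).suffix.lower()
    if ext not in PROCESSORS:
        supported = ", ".join(sorted(PROCESSORS))
        raise ValueError(f"No processor for '{ext}'. Supported: {supported}")
    return PROCESSORS[ext]
```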

### Adding documents to an existing graph

The `--db-path` flag lets you ingest documents into an existing knowledge graph:

```bash
planopticon ingest spec.md --db-path existing.db
planopticon ingest ./docs/ -o ./output --recursive
```

---

## Source connector pipeline

Source connectors fetch content from cloud services, note-taking apps, and web sources. Each source implements the `BaseSource` ABC with three methods: `authenticate()`, `list_videos()`, and `download()`.
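
The three-method contract can be sketched as an abstract base class. The method names come from the text above; the parameter and return shapes are illustrative guesses, and `InMemorySource` is a toy stand-in:

```python
# Sketch of the BaseSource contract. Method names are from the text above;
# signatures, and the toy InMemorySource, are assumptions.
from abc import ABC, abstractmethod
from typing import Any

class BaseSource(ABC):
    @abstractmethod
    def authenticate(self) -> bool:
        """Return True on success; False signals a credential problem."""

    @abstractmethod
    def list_videos(self, folder: str = "") -> list[dict[str, Any]]:
        """List available files, optionally scoped to a folder."""

    @abstractmethod
    def download(self, file_id: str, dest: str) -> str:
        """Download one file and return its local path."""

class InMemorySource(BaseSource):
    """Toy implementation showing the shape of a connector."""
    def authenticate(self) -> bool:
        return True
    def list_videos(self, folder: str = "") -> list[dict[str, Any]]:
        return [{"id": "demo", "name": "standup.mp4"}]
    def download(self, file_id: str, dest: str) -> str:
        return f"{dest}/{file_id}.mp4"
```

Because downstream code only depends on these three methods, adding a new provider means implementing one subclass, with credentials read from that provider's environment variables.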

```mermaid
flowchart TD
    A[Source command] --> B[Authenticate with provider]
    B --> C{Auth success?}
    C -->|No| D[Error: check credentials]
    C -->|Yes| E[List files in folder]
    E --> F[Filter by pattern / type]
    F --> G[Download to local path]
    G --> H{Analyze or ingest?}
    H -->|Video| I[process_single_video / batch]
    H -->|Document| J[ingest_file / ingest_directory]
    I --> K[Knowledge graph]
    J --> K
```

### Available sources

PlanOpticon includes connectors for:

| Category | Sources |
|----------|---------|
| Cloud storage | Google Drive, S3, Dropbox |
| Meeting recordings | Zoom, Google Meet, Microsoft Teams |
| Productivity suites | Google Workspace (Docs/Sheets/Slides), Microsoft 365 (SharePoint/OneDrive/OneNote) |
| Note-taking apps | Obsidian, Logseq, Apple Notes, Google Keep, Notion |
| Web sources | YouTube, Web (URL), RSS, Podcasts |
| Developer platforms | GitHub, arXiv |
| Social media | Reddit, Twitter/X, Hacker News |

Each source authenticates via environment variables (API keys, OAuth tokens) specific to the provider.

---

## Planning agent pipeline

The planning agent consumes a knowledge graph and uses registered skills to generate planning artifacts.

```mermaid
flowchart TD
    A[Knowledge graph] --> B[Load into AgentContext]
    B --> C[GraphQueryEngine]
    C --> D[Taxonomy classification]
    D --> E[Agent orchestrator]
    E --> F{Select skill}
    F --> G[ProjectPlan skill]
    F --> H[PRD skill]
    F --> I[Roadmap skill]
    F --> J[TaskBreakdown skill]
    F --> K[DocGenerator skill]
    F --> L[WikiGenerator skill]
    F --> M[NotesExport skill]
    F --> N[ArtifactExport skill]
    F --> O[GitHubIntegration skill]
    F --> P[RequirementsChat skill]
    G --> Q[Artifact output]
    H --> Q
    I --> Q
    J --> Q
    K --> Q
    L --> Q
    M --> Q
    N --> Q
    O --> Q
    P --> Q
    Q --> R[Write to disk / push to service]
```

### Skill execution flow

1. The `AgentContext` is populated with the knowledge graph, query engine, provider manager, and any planning entities from taxonomy classification.
2. Each `Skill` checks `can_execute()` against the context (at minimum this requires a knowledge graph and a provider manager).
3. The skill's `execute()` method generates an `Artifact` with a name, content, type, and format.
4. Artifacts are collected and can be exported to disk or pushed to external services (GitHub issues, wiki pages, etc.).
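
The steps above can be sketched with minimal stand-ins. `AgentContext`, `can_execute()`, `execute()`, and `Artifact` are named in the text; the field layouts and the toy skill are assumptions:

```python
# Sketch of the skill contract from the steps above. Class/method names
# appear in the text; field layouts and this toy skill are assumed.
from dataclasses import dataclass, field
from typing import Any, Optional

@dataclass
class Artifact:
    name: str
    content: str
    type: str
    format: str

@dataclass
class AgentContext:
    knowledge_graph: Optional[Any] = None
    provider_manager: Optional[Any] = None
    planning_entities: list = field(default_factory=list)

class ProjectPlanSkill:
    def can_execute(self, ctx: AgentContext) -> bool:
        # Step 2: requires at minimum a graph and a provider manager.
        return (ctx.knowledge_graph is not None
                and ctx.provider_manager is not None)

    def execute(self, ctx: AgentContext) -> Artifact:
        # Step 3: a real skill would query the graph and call the provider.
        return Artifact("project_plan", "# Project Plan", "plan", "markdown")
```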

---

## Export pipeline

The export pipeline converts knowledge graphs and analysis artifacts into various output formats.

```mermaid
flowchart TD
    A[knowledge_graph.db] --> B{Export command}
    B --> C[export markdown]
    B --> D[export obsidian]
    B --> E[export notion]
    B --> F[export exchange]
    B --> G[wiki generate]
    B --> H[kg convert]
    C --> I[7 document types + entity briefs + CSV]
    D --> J[Obsidian vault with frontmatter + wiki-links]
    E --> K[Notion-compatible markdown + CSV database]
    F --> L[PlanOpticonExchange JSON payload]
    G --> M[GitHub wiki pages + sidebar + home]
    H --> N[Convert between .db / .json / .graphml / .csv]
```

All export commands accept a `knowledge_graph.db` (or `.json`) path as input. No API key is required for template-based exports (markdown, obsidian, notion, wiki, exchange, convert); only the planning agent skills that generate new content require a provider.

---

## How pipelines connect

```mermaid
flowchart LR
    V[Video files] --> VP[Video Pipeline]
    D[Documents] --> DI[Document Ingestion]
    S[Cloud Sources] --> SC[Source Connectors]
    SC --> V
    SC --> D
    VP --> KG[(knowledge_graph.db)]
    DI --> KG
    KG --> QE[Query Engine]
    KG --> EP[Export Pipeline]
    KG --> PA[Planning Agent]
    PA --> AR[Artifacts]
    AR --> EP
```

All pipelines converge on the knowledge graph as the central data store: it is the shared interface between ingestion (video or document), querying, exporting, and planning.

---

## Error handling

Error handling follows consistent patterns across all pipelines:

| Scenario | Behavior |
|----------|----------|
| Video fails in batch | Batch continues. Failed video recorded in manifest with error details. |
| Diagram analysis fails | Falls back to screengrab (captioned screenshot). |
| LLM extraction fails | Returns empty results gracefully. Key points and action items will be empty arrays. |
| Document processor not found | Raises `ValueError` with list of supported extensions. |
| Source authentication fails | Returns `False` from `authenticate()`. CLI prints error message. |
| Checkpoint file found | Step is skipped entirely and results are loaded from disk. |
| Progress callback fails | Warning logged. Pipeline continues without progress updates. |
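
The "returns empty results gracefully" row describes a swallow-and-default pattern around provider calls. A sketch with assumed names — `safe_extract` and the provider callable are illustrative, not the pipeline's real internals:

```python
# Sketch of the graceful-degradation row above: if the LLM call or its
# JSON parsing fails, return an empty list instead of crashing the run.
# `safe_extract` and the provider callable are illustrative names.
import json
import logging
from typing import Callable

logger = logging.getLogger("pipeline")

def safe_extract(provider_call: Callable[[], str]) -> list:
    """Call the provider and parse a JSON array; empty list on any failure."""
    try:
        result = json.loads(provider_call())
        return result if isinstance(result, list) else []
    except Exception as exc:
        logger.warning("Extraction failed, returning empty results: %s", exc)
        return []
```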

---

## Progress callback system

The pipeline supports a `ProgressCallback` protocol for real-time progress tracking. This is used by the CLI's progress bars and can be implemented by external integrations (web UIs, CI systems, etc.).

```python
from video_processor.models import ProgressCallback


class MyCallback:
    def on_step_start(self, step: str, index: int, total: int) -> None:
        print(f"Starting step {index}/{total}: {step}")

    def on_step_complete(self, step: str, index: int, total: int) -> None:
        print(f"Completed step {index}/{total}: {step}")

    def on_progress(self, step: str, percent: float, message: str = "") -> None:
        print(f"  {step}: {percent:.0%} {message}")
```

Pass the callback to `process_single_video()`:

```python
from video_processor.pipeline import process_single_video

manifest = process_single_video(
    input_path="recording.mp4",
    output_dir="./output",
    progress_callback=MyCallback(),
)
```

The callback methods are called within a try/except wrapper, so a failing callback never interrupts the pipeline. If a callback method raises an exception, a warning is logged and processing continues.