# Analyzers API Reference

::: video_processor.analyzers.diagram_analyzer

::: video_processor.analyzers.content_analyzer

::: video_processor.analyzers.action_detector

---

## Overview

The analyzers module contains the core content extraction logic for PlanOpticon. These analyzers process video frames and transcripts to extract structured knowledge: diagrams, key points, action items, and cross-referenced entities.

All analyzers accept an optional `ProviderManager` instance. When one is provided, they use LLM capabilities for richer extraction; without one, they fall back to heuristic/pattern-based methods where possible.

---

## DiagramAnalyzer
```python
from video_processor.analyzers.diagram_analyzer import DiagramAnalyzer
```

Vision model-based diagram detection and analysis. Classifies video frames as diagrams, slides, screenshots, or other content, then performs full extraction on high-confidence frames.

### Constructor

```python
def __init__(
    self,
    provider_manager: Optional[ProviderManager] = None,
    confidence_threshold: float = 0.3,
)
```

| Parameter | Type | Default | Description |
|---|---|---|---|
| `provider_manager` | `Optional[ProviderManager]` | `None` | LLM provider (creates a default if not provided) |
| `confidence_threshold` | `float` | `0.3` | Minimum confidence to process a frame at all |

### classify_frame()

```python
def classify_frame(self, image_path: Union[str, Path]) -> dict
```

Classify a single frame using a vision model. Determines whether the frame contains a diagram, slide, or other visual content worth extracting.

**Parameters:**

| Parameter | Type | Description |
|---|---|---|
| `image_path` | `Union[str, Path]` | Path to the frame image file |

**Returns:** `dict` with the following keys:

| Key | Type | Description |
|---|---|---|
| `is_diagram` | `bool` | Whether the frame contains extractable content |
| `diagram_type` | `str` | One of: `flowchart`, `sequence`, `architecture`, `whiteboard`, `chart`, `table`, `slide`, `screenshot`, `unknown` |
| `confidence` | `float` | Detection confidence from 0.0 to 1.0 |
| `content_type` | `str` | Content category: `slide`, `diagram`, `document`, `screen_share`, `whiteboard`, `chart`, `person`, `other` |
| `brief_description` | `str` | One-sentence description of the frame content |

**Important:** Frames showing people, webcam feeds, or video conference participant views return `confidence: 0.0`. The classifier is tuned to detect only shared/presented content.

```python
analyzer = DiagramAnalyzer()
result = analyzer.classify_frame("/path/to/frame_042.jpg")
if result["confidence"] >= 0.7:
    print(f"Diagram detected: {result['diagram_type']}")
```
### analyze_diagram_single_pass()

```python
def analyze_diagram_single_pass(self, image_path: Union[str, Path]) -> dict
```

Full single-pass diagram analysis. Extracts description, text content, elements, relationships, Mermaid syntax, and chart data in a single LLM call.

**Returns:** `dict` with the following keys:

| Key | Type | Description |
|---|---|---|
| `diagram_type` | `str` | Diagram classification |
| `description` | `str` | Detailed description of the visual content |
| `text_content` | `str` | All visible text, preserving structure |
| `elements` | `list[str]` | Identified elements/components |
| `relationships` | `list[str]` | Relationships in `"A -> B: label"` format |
| `mermaid` | `str` | Valid Mermaid diagram syntax |
| `chart_data` | `dict \| None` | Chart data with `labels`, `values`, `chart_type` (only for data charts) |

Returns an empty `dict` on failure.
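Because failure is signaled by an empty `dict`, callers should check truthiness before indexing into the result. A minimal sketch of result handling (the `result` dict below is a hand-written stand-in shaped like the documented keys, not real analyzer output):

```python
# Hypothetical stand-in shaped like analyze_diagram_single_pass() output;
# a real call returns {} on failure.
result = {
    "diagram_type": "flowchart",
    "description": "Deployment pipeline from commit to production",
    "text_content": "CI\nRegistry\nProd",
    "elements": ["CI", "Registry", "Prod"],
    "relationships": ["CI -> Registry: push image", "Registry -> Prod: deploy"],
    "mermaid": "graph LR\n    CI --> Registry\n    Registry --> Prod",
    "chart_data": None,
}

if not result:
    summary = "analysis failed"
elif result.get("chart_data"):
    # Only data charts populate chart_data
    summary = f"chart with {len(result['chart_data']['values'])} points"
else:
    summary = f"{result['diagram_type']} with {len(result['elements'])} elements"
```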
### caption_frame()

```python
def caption_frame(self, image_path: Union[str, Path]) -> str
```

Get a brief 1-2 sentence caption for a frame. Used as a fallback when full diagram analysis is not warranted.

**Returns:** `str` -- a brief description of the frame content.
### process_frames()

```python
def process_frames(
    self,
    frame_paths: List[Union[str, Path]],
    diagrams_dir: Optional[Path] = None,
    captures_dir: Optional[Path] = None,
) -> Tuple[List[DiagramResult], List[ScreenCapture]]
```

Process a batch of extracted video frames through the full classification and analysis pipeline.

**Parameters:**

| Parameter | Type | Default | Description |
|---|---|---|---|
| `frame_paths` | `List[Union[str, Path]]` | *required* | Paths to frame images |
| `diagrams_dir` | `Optional[Path]` | `None` | Output directory for diagram files (images, Mermaid, JSON) |
| `captures_dir` | `Optional[Path]` | `None` | Output directory for screengrab fallback files |

**Returns:** `Tuple[List[DiagramResult], List[ScreenCapture]]`

**Confidence thresholds:**

| Confidence Range | Action |
|---|---|
| >= 0.7 | Full diagram analysis -- extracts elements, relationships, Mermaid syntax |
| 0.3 to 0.7 | Screengrab fallback -- saves frame with a brief caption |
| < 0.3 | Skipped entirely |

**Output files (when directories are provided):**

For diagrams (`diagrams_dir`):

- `diagram_N.jpg` -- original frame image
- `diagram_N.mermaid` -- Mermaid source (if generated)
- `diagram_N.json` -- full `DiagramResult` as JSON

For screen captures (`captures_dir`):

- `capture_N.jpg` -- original frame image
- `capture_N.json` -- `ScreenCapture` metadata as JSON

```python
from pathlib import Path
from video_processor.analyzers.diagram_analyzer import DiagramAnalyzer
from video_processor.providers.manager import ProviderManager

analyzer = DiagramAnalyzer(
    provider_manager=ProviderManager(),
    confidence_threshold=0.3,
)

frame_paths = list(Path("output/frames").glob("*.jpg"))
diagrams, captures = analyzer.process_frames(
    frame_paths,
    diagrams_dir=Path("output/diagrams"),
    captures_dir=Path("output/captures"),
)

print(f"Found {len(diagrams)} diagrams, {len(captures)} screengrabs")
for d in diagrams:
    print(f"  [{d.diagram_type.value}] {d.description}")
```
---

## ContentAnalyzer

```python
from video_processor.analyzers.content_analyzer import ContentAnalyzer
```

Cross-references transcript and diagram entities for richer knowledge extraction. Merges entities found in different sources and enriches key points with diagram links.

### Constructor

```python
def __init__(self, provider_manager: Optional[ProviderManager] = None)
```

| Parameter | Type | Default | Description |
|---|---|---|---|
| `provider_manager` | `Optional[ProviderManager]` | `None` | Required for LLM-based fuzzy matching |

### cross_reference()

```python
def cross_reference(
    self,
    transcript_entities: List[Entity],
    diagram_entities: List[Entity],
) -> List[Entity]
```

Merge entities from transcripts and diagrams into a unified list with source attribution.

**Merge strategy:**

1. Index all transcript entities by lowercase name, marked with `source="transcript"`
2. Merge diagram entities: if a name matches, set `source="both"` and combine descriptions/occurrences; otherwise add as `source="diagram"`
3. If a `ProviderManager` is available, use LLM fuzzy matching to find additional matches among unmatched entities (e.g., "PostgreSQL" from transcript matching "Postgres" from diagram)

**Parameters:**

| Parameter | Type | Description |
|---|---|---|
| `transcript_entities` | `List[Entity]` | Entities extracted from transcript |
| `diagram_entities` | `List[Entity]` | Entities extracted from diagrams |

**Returns:** `List[Entity]` -- merged entity list with `source` attribution.

```python
from video_processor.analyzers.content_analyzer import ContentAnalyzer
from video_processor.models import Entity
from video_processor.providers.manager import ProviderManager

analyzer = ContentAnalyzer(provider_manager=ProviderManager())

transcript_entities = [
    Entity(name="PostgreSQL", type="technology"),
    Entity(name="Alice", type="person"),
]
diagram_entities = [
    Entity(name="Postgres", type="technology"),
    Entity(name="Redis", type="technology"),
]

merged = analyzer.cross_reference(transcript_entities, diagram_entities)
# "PostgreSQL" and "Postgres" may be fuzzy-matched and merged
```
### enrich_key_points()

```python
def enrich_key_points(
    self,
    key_points: List[KeyPoint],
    diagrams: list,
    transcript_text: str,
) -> List[KeyPoint]
```

Link key points to relevant diagrams by entity overlap. Examines word overlap between key point text and diagram elements/text content.

**Parameters:**

| Parameter | Type | Description |
|---|---|---|
| `key_points` | `List[KeyPoint]` | Key points to enrich |
| `diagrams` | `list` | List of `DiagramResult` objects or dicts |
| `transcript_text` | `str` | Full transcript text (reserved for future use) |

**Returns:** `List[KeyPoint]` -- key points with `related_diagrams` indices populated.

A key point is linked to a diagram when they share 2 or more words (excluding short words) between the key point text/details and the diagram's elements/text content.
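As an illustration, the overlap test can be approximated with a standalone sketch (a simplified re-implementation, not the library's actual code; the 4-character cutoff for "short words" is an assumption):

```python
def shared_words(key_point_text: str, diagram_text: str, min_len: int = 4) -> set:
    """Substantive words appearing in both texts (short words excluded)."""
    kp = {w.lower() for w in key_point_text.split() if len(w) >= min_len}
    dg = {w.lower() for w in diagram_text.split() if len(w) >= min_len}
    return kp & dg

kp_text = "Cache session tokens in Redis"
diagram_text = "API Gateway -> Redis cache -> Postgres"
overlap = shared_words(kp_text, diagram_text)
linked = len(overlap) >= 2  # the documented 2-word threshold
```

Here "Redis" and "cache" overlap, so the key point would be linked to the diagram.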
---

## ActionDetector

```python
from video_processor.analyzers.action_detector import ActionDetector
```

Detects action items from transcripts and diagram content using LLM extraction with a regex pattern fallback.

### Constructor

```python
def __init__(self, provider_manager: Optional[ProviderManager] = None)
```

| Parameter | Type | Default | Description |
|---|---|---|---|
| `provider_manager` | `Optional[ProviderManager]` | `None` | Required for LLM-based extraction |

### detect_from_transcript()

```python
def detect_from_transcript(
    self,
    text: str,
    segments: Optional[List[TranscriptSegment]] = None,
) -> List[ActionItem]
```

Detect action items from transcript text.

**Parameters:**

| Parameter | Type | Default | Description |
|---|---|---|---|
| `text` | `str` | *required* | Transcript text to analyze |
| `segments` | `Optional[List[TranscriptSegment]]` | `None` | Transcript segments for timestamp attachment |

**Returns:** `List[ActionItem]` -- detected action items with `source="transcript"`.

**Extraction modes:**

- **LLM mode** (when `provider_manager` is set): Sends the transcript to the LLM with a structured extraction prompt. Extracts action, assignee, deadline, priority, and context.
- **Pattern mode** (fallback): Matches sentences against regex patterns for action-oriented language.
**Pattern matching** detects sentences containing:

- "need/needs to", "should/must/shall"
- "will/going to", "action item/todo/follow-up"
- "assigned to/responsible for", "deadline/due by"
- "let's/let us", "make sure/ensure"
- "can you/could you/please"

**Timestamp attachment:** When `segments` are provided, each action item is matched to the most relevant transcript segment (by word overlap, minimum 3 matching words), and a timestamp is added to `context`.
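The segment-matching step can be sketched as follows (an illustrative approximation; real code receives `TranscriptSegment` objects, plain dicts are used here for brevity):

```python
def best_segment(action_text, segments, min_overlap=3):
    """Return the segment sharing the most words with the action,
    or None when fewer than min_overlap words match."""
    action_words = set(action_text.lower().split())
    best, best_count = None, 0
    for seg in segments:
        count = len(action_words & set(seg["text"].lower().split()))
        if count > best_count:
            best, best_count = seg, count
    return best if best_count >= min_overlap else None

segments = [
    {"start": 12.0, "text": "so Alice needs to update the API docs"},
    {"start": 84.5, "text": "next up is sprint planning"},
]
match = best_segment("Alice needs to update the API docs", segments)
# match["start"] can then be formatted into the action item's context
```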
### detect_from_diagrams()

```python
def detect_from_diagrams(self, diagrams: list) -> List[ActionItem]
```

Extract action items from diagram text content and elements. Processes each diagram's combined text using either LLM or pattern extraction.

**Parameters:**

| Parameter | Type | Description |
|---|---|---|
| `diagrams` | `list` | List of `DiagramResult` objects or dicts |

**Returns:** `List[ActionItem]` -- action items with `source="diagram"`.

### merge_action_items()

```python
def merge_action_items(
    self,
    transcript_items: List[ActionItem],
    diagram_items: List[ActionItem],
) -> List[ActionItem]
```

Merge action items from multiple sources, deduplicating by action text (case-insensitive, whitespace-normalized).

**Returns:** `List[ActionItem]` -- deduplicated merged list.
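The deduplication key can be illustrated with a minimal sketch (action items reduced to plain strings for brevity; real items are `ActionItem` objects keyed by their action text):

```python
def merged_actions(transcript_actions, diagram_actions):
    """First occurrence wins; keys are lowercased with whitespace collapsed."""
    seen = set()
    merged = []
    for action in transcript_actions + diagram_actions:
        key = " ".join(action.lower().split())
        if key not in seen:
            seen.add(key)
            merged.append(action)
    return merged

result = merged_actions(
    ["Update the API docs", "Review the PR"],
    ["UPDATE  the API docs", "Ship release notes"],  # duplicate modulo case/spacing
)
```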
### Usage example

```python
from video_processor.analyzers.action_detector import ActionDetector
from video_processor.providers.manager import ProviderManager

detector = ActionDetector(provider_manager=ProviderManager())

# From transcript (transcript_segments comes from an earlier transcription step)
transcript_items = detector.detect_from_transcript(
    text="Alice needs to update the API docs by Friday. "
         "Bob should review the PR before merging.",
    segments=transcript_segments,
)

# From diagrams (diagram_results comes from DiagramAnalyzer.process_frames())
diagram_items = detector.detect_from_diagrams(diagram_results)

# Merge and deduplicate
all_items = detector.merge_action_items(transcript_items, diagram_items)

for item in all_items:
    print(f"[{item.priority or 'unset'}] {item.action}")
    if item.assignee:
        print(f"  Assignee: {item.assignee}")
    if item.deadline:
        print(f"  Deadline: {item.deadline}")
```

### Pattern fallback (no LLM)

```python
# Works without any API keys
detector = ActionDetector()  # No provider_manager
items = detector.detect_from_transcript(
    "We need to finalize the database schema. "
    "Please update the deployment scripts."
)
# Returns ActionItems matched by regex patterns
```