# Analyzers API Reference

::: video_processor.analyzers.diagram_analyzer

::: video_processor.analyzers.content_analyzer

::: video_processor.analyzers.action_detector

---

## Overview

The analyzers module contains the core content extraction logic for PlanOpticon. These analyzers process video frames and transcripts to extract structured knowledge: diagrams, key points, action items, and cross-referenced entities.

All analyzers accept an optional `ProviderManager` instance. When provided, they use LLM capabilities for richer extraction. Without one, they fall back to heuristic/pattern-based methods where possible.

---

## DiagramAnalyzer

```python
from video_processor.analyzers.diagram_analyzer import DiagramAnalyzer
```

Vision model-based diagram detection and analysis. Classifies video frames as diagrams, slides, screenshots, or other content, then performs full extraction on high-confidence frames.

### Constructor

```python
def __init__(
    self,
    provider_manager: Optional[ProviderManager] = None,
    confidence_threshold: float = 0.3,
)
```

| Parameter | Type | Default | Description |
|---|---|---|---|
| `provider_manager` | `Optional[ProviderManager]` | `None` | LLM provider (creates a default if not provided) |
| `confidence_threshold` | `float` | `0.3` | Minimum confidence to process a frame at all |

### classify_frame()

```python
def classify_frame(self, image_path: Union[str, Path]) -> dict
```

Classify a single frame using a vision model. Determines whether the frame contains a diagram, slide, or other visual content worth extracting.

**Parameters:**

| Parameter | Type | Description |
|---|---|---|
| `image_path` | `Union[str, Path]` | Path to the frame image file |

**Returns:** `dict` with the following keys:

| Key | Type | Description |
|---|---|---|
| `is_diagram` | `bool` | Whether the frame contains extractable content |
| `diagram_type` | `str` | One of: `flowchart`, `sequence`, `architecture`, `whiteboard`, `chart`, `table`, `slide`, `screenshot`, `unknown` |
| `confidence` | `float` | Detection confidence from 0.0 to 1.0 |
| `content_type` | `str` | Content category: `slide`, `diagram`, `document`, `screen_share`, `whiteboard`, `chart`, `person`, `other` |
| `brief_description` | `str` | One-sentence description of the frame content |

**Important:** Frames showing people, webcam feeds, or video conference participant views return `confidence: 0.0`. The classifier is tuned to detect only shared/presented content.

```python
analyzer = DiagramAnalyzer()
result = analyzer.classify_frame("/path/to/frame_042.jpg")
if result["confidence"] >= 0.7:
    print(f"Diagram detected: {result['diagram_type']}")
```

### analyze_diagram_single_pass()

```python
def analyze_diagram_single_pass(self, image_path: Union[str, Path]) -> dict
```

Full single-pass diagram analysis. Extracts description, text content, elements, relationships, Mermaid syntax, and chart data in a single LLM call.

**Returns:** `dict` with the following keys:

| Key | Type | Description |
|---|---|---|
| `diagram_type` | `str` | Diagram classification |
| `description` | `str` | Detailed description of the visual content |
| `text_content` | `str` | All visible text, preserving structure |
| `elements` | `list[str]` | Identified elements/components |
| `relationships` | `list[str]` | Relationships in `"A -> B: label"` format |
| `mermaid` | `str` | Valid Mermaid diagram syntax |
| `chart_data` | `dict \| None` | Chart data with `labels`, `values`, `chart_type` (only for data charts) |

Returns an empty `dict` on failure.

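As a rough illustration of consuming the `relationships` strings, the sketch below splits the `"A -> B: label"` format into its parts. `parse_relationship` and `Relationship` are hypothetical helpers, not part of PlanOpticon. The sketch assumes the label is optional and that element names contain neither `->` nor `:`.

```python
from typing import NamedTuple, Optional


class Relationship(NamedTuple):
    source: str
    target: str
    label: Optional[str]


def parse_relationship(text: str) -> Optional[Relationship]:
    """Split an "A -> B: label" string; return None if it does not match."""
    source, arrow, rest = text.partition("->")
    if not arrow:
        return None  # no arrow, so not a relationship string
    target, _, label = rest.partition(":")
    source, target, label = source.strip(), target.strip(), label.strip()
    if not source or not target:
        return None
    # An empty label is reported as None rather than "".
    return Relationship(source, target, label or None)


print(parse_relationship("API Gateway -> Auth Service: validates token"))
```

Strings that do not follow the format yield `None`, so callers can skip them instead of failing.
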
### caption_frame()

```python
def caption_frame(self, image_path: Union[str, Path]) -> str
```

Get a brief 1-2 sentence caption for a frame. Used as a fallback when full diagram analysis is not warranted.

**Returns:** `str` -- a brief description of the frame content.

### process_frames()

```python
def process_frames(
    self,
    frame_paths: List[Union[str, Path]],
    diagrams_dir: Optional[Path] = None,
    captures_dir: Optional[Path] = None,
) -> Tuple[List[DiagramResult], List[ScreenCapture]]
```

Process a batch of extracted video frames through the full classification and analysis pipeline.

**Parameters:**

| Parameter | Type | Default | Description |
|---|---|---|---|
| `frame_paths` | `List[Union[str, Path]]` | *required* | Paths to frame images |
| `diagrams_dir` | `Optional[Path]` | `None` | Output directory for diagram files (images, mermaid, JSON) |
| `captures_dir` | `Optional[Path]` | `None` | Output directory for screengrab fallback files |

**Returns:** `Tuple[List[DiagramResult], List[ScreenCapture]]`

**Confidence thresholds:**

| Confidence Range | Action |
|---|---|
| >= 0.7 | Full diagram analysis -- extracts elements, relationships, Mermaid syntax |
| >= 0.3 and < 0.7 | Screengrab fallback -- saves frame with a brief caption |
| < 0.3 | Skipped entirely |

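The threshold table can be read as a three-way routing decision. The sketch below is illustrative only: `route_frame` is a hypothetical helper, not part of the `DiagramAnalyzer` API. The 0.7 cut-off is taken from the table; exposing it as a parameter is an assumption made for the example.

```python
def route_frame(
    confidence: float,
    threshold: float = 0.3,         # mirrors the confidence_threshold default
    full_analysis_at: float = 0.7,  # cut-off from the table above
) -> str:
    """Return which branch of the pipeline a frame would take."""
    if confidence >= full_analysis_at:
        return "diagram"  # full diagram analysis
    if confidence >= threshold:
        return "capture"  # screengrab fallback with a brief caption
    return "skip"         # ignored entirely


for score in (0.9, 0.5, 0.1):
    print(score, route_frame(score))
```
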
**Output files (when directories are provided):**

For diagrams (`diagrams_dir`):

- `diagram_N.jpg` -- original frame image
- `diagram_N.mermaid` -- Mermaid source (if generated)
- `diagram_N.json` -- full DiagramResult as JSON

For screen captures (`captures_dir`):

- `capture_N.jpg` -- original frame image
- `capture_N.json` -- ScreenCapture metadata as JSON

```python
from pathlib import Path
from video_processor.analyzers.diagram_analyzer import DiagramAnalyzer
from video_processor.providers.manager import ProviderManager

analyzer = DiagramAnalyzer(
    provider_manager=ProviderManager(),
    confidence_threshold=0.3,
)

frame_paths = list(Path("output/frames").glob("*.jpg"))
diagrams, captures = analyzer.process_frames(
    frame_paths,
    diagrams_dir=Path("output/diagrams"),
    captures_dir=Path("output/captures"),
)

print(f"Found {len(diagrams)} diagrams, {len(captures)} screengrabs")
for d in diagrams:
    print(f"  [{d.diagram_type.value}] {d.description}")
```

---

## ContentAnalyzer

```python
from video_processor.analyzers.content_analyzer import ContentAnalyzer
```

Cross-references transcript and diagram entities for richer knowledge extraction. Merges entities found in different sources and enriches key points with diagram links.

### Constructor

```python
def __init__(self, provider_manager: Optional[ProviderManager] = None)
```

| Parameter | Type | Default | Description |
|---|---|---|---|
| `provider_manager` | `Optional[ProviderManager]` | `None` | Required for LLM-based fuzzy matching |

### cross_reference()

```python
def cross_reference(
    self,
    transcript_entities: List[Entity],
    diagram_entities: List[Entity],
) -> List[Entity]
```

Merge entities from transcripts and diagrams into a unified list with source attribution.

**Merge strategy:**

1. Index all transcript entities by lowercase name, marked with `source="transcript"`
2. Merge diagram entities: if a name matches, set `source="both"` and combine descriptions/occurrences; otherwise add as `source="diagram"`
3. If a `ProviderManager` is available, use LLM fuzzy matching to find additional matches among unmatched entities (e.g., "PostgreSQL" from transcript matching "Postgres" from diagram)

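Steps 1 and 2 of the merge strategy can be sketched as follows. This is a minimal illustration, not the library's implementation: `SimpleEntity` is a stand-in for the real `Entity` model with assumed field names, `merge_by_name` is a hypothetical function, and the LLM fuzzy-matching step is left out.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class SimpleEntity:
    """Stand-in for the real Entity model; field names are assumptions."""
    name: str
    descriptions: List[str] = field(default_factory=list)
    source: str = ""


def merge_by_name(
    transcript: List[SimpleEntity],
    diagram: List[SimpleEntity],
) -> List[SimpleEntity]:
    merged: Dict[str, SimpleEntity] = {}
    # Step 1: index transcript entities by lowercase name.
    # (In this sketch a later duplicate name replaces an earlier one.)
    for ent in transcript:
        merged[ent.name.lower()] = SimpleEntity(
            ent.name, list(ent.descriptions), "transcript"
        )
    # Step 2: fold in diagram entities, marking exact name matches as "both".
    for ent in diagram:
        key = ent.name.lower()
        if key in merged:
            merged[key].source = "both"
            merged[key].descriptions.extend(ent.descriptions)
        else:
            merged[key] = SimpleEntity(ent.name, list(ent.descriptions), "diagram")
    return list(merged.values())


result = merge_by_name(
    [SimpleEntity("Redis", ["cache layer"]), SimpleEntity("Alice")],
    [SimpleEntity("redis", ["session store"]), SimpleEntity("Postgres")],
)
for ent in result:
    print(ent.name, ent.source)
```
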
**Parameters:**

| Parameter | Type | Description |
|---|---|---|
| `transcript_entities` | `List[Entity]` | Entities extracted from transcript |
| `diagram_entities` | `List[Entity]` | Entities extracted from diagrams |

**Returns:** `List[Entity]` -- merged entity list with `source` attribution.

```python
from video_processor.analyzers.content_analyzer import ContentAnalyzer
from video_processor.models import Entity
from video_processor.providers.manager import ProviderManager

pm = ProviderManager()
analyzer = ContentAnalyzer(provider_manager=pm)

transcript_entities = [
    Entity(name="PostgreSQL", type="technology"),
    Entity(name="Alice", type="person"),
]
diagram_entities = [
    Entity(name="Postgres", type="technology"),
    Entity(name="Redis", type="technology"),
]

merged = analyzer.cross_reference(transcript_entities, diagram_entities)
# "PostgreSQL" and "Postgres" may be fuzzy-matched and merged
```

### enrich_key_points()

```python
def enrich_key_points(
    self,
    key_points: List[KeyPoint],
    diagrams: list,
    transcript_text: str,
) -> List[KeyPoint]
```

Link key points to relevant diagrams by entity overlap. Examines word overlap between key point text and diagram elements/text content.

**Parameters:**

| Parameter | Type | Description |
|---|---|---|
| `key_points` | `List[KeyPoint]` | Key points to enrich |
| `diagrams` | `list` | List of `DiagramResult` objects or dicts |
| `transcript_text` | `str` | Full transcript text (reserved for future use) |

**Returns:** `List[KeyPoint]` -- key points with `related_diagrams` indices populated.

A key point is linked to a diagram when they share 2 or more words (excluding short words) between the key point text/details and the diagram's elements/text content.

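The linking rule can be approximated with a set intersection over words. The sketch below works on plain strings rather than `KeyPoint` and `DiagramResult` objects. `significant_words` and `related_diagram_indices` are hypothetical names, and the 4-character cut-off for "short words" is an assumption, since the exact length is not stated here.

```python
import re
from typing import List, Set


def significant_words(text: str, min_length: int = 4) -> Set[str]:
    """Lowercased words of at least min_length characters (assumed cut-off)."""
    return {w for w in re.findall(r"[a-z0-9]+", text.lower()) if len(w) >= min_length}


def related_diagram_indices(
    key_point_text: str,
    diagram_texts: List[str],
    min_shared: int = 2,
) -> List[int]:
    """Indices of diagrams sharing at least min_shared words with the key point."""
    kp_words = significant_words(key_point_text)
    return [
        index
        for index, text in enumerate(diagram_texts)
        if len(kp_words & significant_words(text)) >= min_shared
    ]


links = related_diagram_indices(
    "The API gateway forwards requests to the auth service",
    ["API Gateway -> Auth Service", "Monthly revenue chart"],
)
print(links)
```
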
---

## ActionDetector

```python
from video_processor.analyzers.action_detector import ActionDetector
```

Detects action items from transcripts and diagram content using LLM extraction with a regex pattern fallback.

### Constructor

```python
def __init__(self, provider_manager: Optional[ProviderManager] = None)
```

| Parameter | Type | Default | Description |
|---|---|---|---|
| `provider_manager` | `Optional[ProviderManager]` | `None` | Required for LLM-based extraction |

### detect_from_transcript()

```python
def detect_from_transcript(
    self,
    text: str,
    segments: Optional[List[TranscriptSegment]] = None,
) -> List[ActionItem]
```

Detect action items from transcript text.

**Parameters:**

| Parameter | Type | Default | Description |
|---|---|---|---|
| `text` | `str` | *required* | Transcript text to analyze |
| `segments` | `Optional[List[TranscriptSegment]]` | `None` | Transcript segments for timestamp attachment |

**Returns:** `List[ActionItem]` -- detected action items with `source="transcript"`.

**Extraction modes:**

- **LLM mode** (when `provider_manager` is set): Sends the transcript to the LLM with a structured extraction prompt. Extracts action, assignee, deadline, priority, and context.
- **Pattern mode** (fallback): Matches sentences against regex patterns for action-oriented language.

**Pattern matching** detects sentences containing:

- "need/needs to", "should/must/shall"
- "will/going to", "action item/todo/follow-up"
- "assigned to/responsible for", "deadline/due by"
- "let's/let us", "make sure/ensure"
- "can you/could you/please"

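A pattern-mode pass along these lines might look like the sketch below. The regular expression is assembled from the phrases listed above and is not the expression used by `ActionDetector`. `ACTION_PATTERN` and `find_action_sentences` are hypothetical names, and the sentence splitter is deliberately naive.

```python
import re
from typing import List

# One alternation built from the phrase list above (illustrative only).
ACTION_PATTERN = re.compile(
    r"\b("
    r"needs? to|should|must|shall|will|going to|"
    r"action item|todo|follow-up|"
    r"assigned to|responsible for|deadline|due by|"
    r"let's|let us|make sure|ensure|"
    r"can you|could you|please"
    r")\b",
    re.IGNORECASE,
)


def find_action_sentences(text: str) -> List[str]:
    """Return the sentences that contain action-oriented language."""
    # Naive split on whitespace that follows ., ! or ?
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    return [s for s in sentences if ACTION_PATTERN.search(s)]


matches = find_action_sentences(
    "We need to finalize the database schema. "
    "The weather was nice. "
    "Please update the deployment scripts."
)
for sentence in matches:
    print(sentence)
```
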
**Timestamp attachment:** When `segments` are provided, each action item is matched to the most relevant transcript segment (by word overlap, minimum 3 matching words), and a timestamp is added to `context`.

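The matching step can be pictured as picking the segment with the largest word overlap and discarding it when fewer than 3 words match. In the sketch below, `Segment` is a stand-in for `TranscriptSegment` with assumed field names, and `best_segment` is a hypothetical helper. How the library tokenizes text or breaks ties is not covered here.

```python
import re
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class Segment:
    """Stand-in for TranscriptSegment; field names are assumptions."""
    start: float
    text: str


def _words(text: str) -> Set[str]:
    return set(re.findall(r"[a-z0-9']+", text.lower()))


def best_segment(
    action: str,
    segments: List[Segment],
    min_overlap: int = 3,
) -> Optional[Segment]:
    """Return the segment sharing the most words with the action, if enough match."""
    action_words = _words(action)
    best, best_score = None, 0
    for seg in segments:
        score = len(action_words & _words(seg.text))
        if score > best_score:  # the first segment wins a tie
            best, best_score = seg, score
    return best if best_score >= min_overlap else None


segments = [
    Segment(0.0, "Welcome everyone to the weekly sync"),
    Segment(42.5, "Alice needs to update the API docs by Friday"),
]
match = best_segment("Update the API docs", segments)
print(match.start if match else None)
```
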
### detect_from_diagrams()

```python
def detect_from_diagrams(self, diagrams: list) -> List[ActionItem]
```

Extract action items from diagram text content and elements. Processes each diagram's combined text using either LLM or pattern extraction.

**Parameters:**

| Parameter | Type | Description |
|---|---|---|
| `diagrams` | `list` | List of `DiagramResult` objects or dicts |

**Returns:** `List[ActionItem]` -- action items with `source="diagram"`.

### merge_action_items()

```python
def merge_action_items(
    self,
    transcript_items: List[ActionItem],
    diagram_items: List[ActionItem],
) -> List[ActionItem]
```

Merge action items from multiple sources, deduplicating by action text (case-insensitive, whitespace-normalized).

**Returns:** `List[ActionItem]` -- deduplicated merged list.

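The deduplication key can be illustrated on plain strings. The sketch below is not the `ActionDetector` code: `normalize` and `dedupe_actions` are hypothetical names. Keeping the first occurrence of a duplicate is an assumption, since the description above does not say which copy survives.

```python
from typing import Iterable, List


def normalize(action: str) -> str:
    """Lowercase and collapse runs of whitespace into single spaces."""
    return " ".join(action.lower().split())


def dedupe_actions(*sources: Iterable[str]) -> List[str]:
    """Concatenate sources in order, keeping the first copy of each action."""
    seen = set()
    merged = []
    for source in sources:
        for action in source:
            key = normalize(action)
            if key not in seen:
                seen.add(key)
                merged.append(action)
    return merged


combined = dedupe_actions(
    ["Update the API docs", "Review the PR"],
    ["update   the api docs ", "Deploy to staging"],
)
print(combined)
```
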
### Usage example

```python
from video_processor.analyzers.action_detector import ActionDetector
from video_processor.providers.manager import ProviderManager

detector = ActionDetector(provider_manager=ProviderManager())

# From transcript
# transcript_segments: a List[TranscriptSegment] obtained earlier (not shown)
transcript_items = detector.detect_from_transcript(
    text="Alice needs to update the API docs by Friday. "
    "Bob should review the PR before merging.",
    segments=transcript_segments,
)

# From diagrams
# diagram_results: DiagramResult objects or dicts obtained earlier (not shown)
diagram_items = detector.detect_from_diagrams(diagram_results)

# Merge and deduplicate
all_items = detector.merge_action_items(transcript_items, diagram_items)

for item in all_items:
    print(f"[{item.priority or 'unset'}] {item.action}")
    if item.assignee:
        print(f"  Assignee: {item.assignee}")
    if item.deadline:
        print(f"  Deadline: {item.deadline}")
```

### Pattern fallback (no LLM)

```python
# Works without any API keys
detector = ActionDetector()  # No provider_manager
items = detector.detect_from_transcript(
    "We need to finalize the database schema. "
    "Please update the deployment scripts."
)
# Returns ActionItems matched by regex patterns
```
