# Analyzers API Reference

::: video_processor.analyzers.diagram_analyzer

::: video_processor.analyzers.content_analyzer

::: video_processor.analyzers.action_detector

---

## Overview

The analyzers module contains the core content extraction logic for PlanOpticon. These analyzers process video frames and transcripts to extract structured knowledge: diagrams, key points, action items, and cross-referenced entities.

All analyzers accept an optional `ProviderManager` instance. When provided, they use LLM capabilities for richer extraction. Without one, they fall back to heuristic/pattern-based methods where possible.

---

## DiagramAnalyzer

```python
from video_processor.analyzers.diagram_analyzer import DiagramAnalyzer
```

Vision model-based diagram detection and analysis. Classifies video frames as diagrams, slides, screenshots, or other content, then performs full extraction on high-confidence frames.

### Constructor

```python
def __init__(
    self,
    provider_manager: Optional[ProviderManager] = None,
    confidence_threshold: float = 0.3,
)
```

| Parameter | Type | Default | Description |
|---|---|---|---|
| `provider_manager` | `Optional[ProviderManager]` | `None` | LLM provider (creates a default if not provided) |
| `confidence_threshold` | `float` | `0.3` | Minimum confidence to process a frame at all |

### classify_frame()

```python
def classify_frame(self, image_path: Union[str, Path]) -> dict
```

Classify a single frame using a vision model. Determines whether the frame contains a diagram, slide, or other visual content worth extracting.

Parameters:

| Parameter | Type | Description |
|---|---|---|
| `image_path` | `Union[str, Path]` | Path to the frame image file |

Returns: `dict` with the following keys:

| Key | Type | Description |
|---|---|---|
| `is_diagram` | `bool` | Whether the frame contains extractable content |
| `diagram_type` | `str` | One of: `flowchart`, `sequence`, `architecture`, `whiteboard`, `chart`, `table`, `slide`, `screenshot`, `unknown` |
| `confidence` | `float` | Detection confidence from 0.0 to 1.0 |
| `content_type` | `str` | Content category: `slide`, `diagram`, `document`, `screen_share`, `whiteboard`, `chart`, `person`, `other` |
| `brief_description` | `str` | One-sentence description of the frame content |

Important: Frames showing people, webcam feeds, or video conference participant views return `confidence: 0.0`. The classifier is tuned to detect only shared/presented content.

```python
analyzer = DiagramAnalyzer()
result = analyzer.classify_frame("/path/to/frame_042.jpg")
if result["confidence"] >= 0.7:
    print(f"Diagram detected: {result['diagram_type']}")
```

### analyze_diagram_single_pass()

```python
def analyze_diagram_single_pass(self, image_path: Union[str, Path]) -> dict
```

Full single-pass diagram analysis. Extracts description, text content, elements, relationships, Mermaid syntax, and chart data in a single LLM call.

Returns: `dict` with the following keys:

| Key | Type | Description |
|---|---|---|
| `diagram_type` | `str` | Diagram classification |
| `description` | `str` | Detailed description of the visual content |
| `text_content` | `str` | All visible text, preserving structure |
| `elements` | `list[str]` | Identified elements/components |
| `relationships` | `list[str]` | Relationships in `"A -> B: label"` format |
| `mermaid` | `str` | Valid Mermaid diagram syntax |
| `chart_data` | `dict \| None` | Chart data with `labels`, `values`, `chart_type` (only for data charts) |

Returns an empty `dict` on failure.

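The `relationships` entries are plain strings in `"A -> B: label"` format, so callers may want to split them into parts. The following is a minimal sketch of one way to do that. `Relationship` and `parse_relationship` are hypothetical helpers, not part of PlanOpticon, and the sketch assumes the label is optional.

```python
from typing import NamedTuple, Optional


class Relationship(NamedTuple):
    """Hypothetical container for one parsed relationship string."""
    source: str
    target: str
    label: Optional[str]


def parse_relationship(text: str) -> Optional[Relationship]:
    """Split an "A -> B: label" string; return None if it has no usable arrow."""
    source, arrow, rest = text.partition("->")
    if not arrow:
        return None
    # Everything after the first colon is treated as the label.
    target, _, label = rest.partition(":")
    source, target, label = source.strip(), target.strip(), label.strip()
    if not source or not target:
        return None
    return Relationship(source, target, label or None)


print(parse_relationship("API Gateway -> Auth Service: validates token"))
print(parse_relationship("Client -> Server"))
```

Returning `None` for malformed strings keeps the caller in control of how to handle LLM output that does not follow the documented format.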
### caption_frame()

```python
def caption_frame(self, image_path: Union[str, Path]) -> str
```

Get a brief 1-2 sentence caption for a frame. Used as a fallback when full diagram analysis is not warranted.

Returns: `str` -- a brief description of the frame content.

### process_frames()

```python
def process_frames(
    self,
    frame_paths: List[Union[str, Path]],
    diagrams_dir: Optional[Path] = None,
    captures_dir: Optional[Path] = None,
) -> Tuple[List[DiagramResult], List[ScreenCapture]]
```

Process a batch of extracted video frames through the full classification and analysis pipeline.

Parameters:

| Parameter | Type | Default | Description |
|---|---|---|---|
| `frame_paths` | `List[Union[str, Path]]` | required | Paths to frame images |
| `diagrams_dir` | `Optional[Path]` | `None` | Output directory for diagram files (images, mermaid, JSON) |
| `captures_dir` | `Optional[Path]` | `None` | Output directory for screengrab fallback files |

Returns: `Tuple[List[DiagramResult], List[ScreenCapture]]`

Confidence thresholds:

| Confidence Range | Action |
|---|---|
| >= 0.7 | Full diagram analysis -- extracts elements, relationships, Mermaid syntax |
| >= 0.3 and < 0.7 | Screengrab fallback -- saves frame with a brief caption |
| < 0.3 | Skipped entirely |

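The threshold table can be read as a three-way routing decision. Here is a minimal sketch of that decision, assuming each lower bound is inclusive. `route_frame` and its parameter names are hypothetical and only illustrate the table; they are not the library's internals.

```python
def route_frame(confidence: float, skip_below: float = 0.3, full_at: float = 0.7) -> str:
    """Return the action for a frame given its classification confidence."""
    if confidence >= full_at:
        return "diagram"      # full diagram analysis
    if confidence >= skip_below:
        return "screengrab"   # saved with a brief caption
    return "skip"             # frame is not processed further


for score in (0.92, 0.7, 0.45, 0.3, 0.1):
    print(f"{score:.2f} -> {route_frame(score)}")
```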
Output files (when directories are provided):

For diagrams (`diagrams_dir`):

- `diagram_N.jpg` -- original frame image
- `diagram_N.mermaid` -- Mermaid source (if generated)
- `diagram_N.json` -- full DiagramResult as JSON

For screen captures (`captures_dir`):

- `capture_N.jpg` -- original frame image
- `capture_N.json` -- ScreenCapture metadata as JSON

```python
from pathlib import Path
from video_processor.analyzers.diagram_analyzer import DiagramAnalyzer
from video_processor.providers.manager import ProviderManager

analyzer = DiagramAnalyzer(
    provider_manager=ProviderManager(),
    confidence_threshold=0.3,
)

frame_paths = list(Path("output/frames").glob("*.jpg"))
diagrams, captures = analyzer.process_frames(
    frame_paths,
    diagrams_dir=Path("output/diagrams"),
    captures_dir=Path("output/captures"),
)

print(f"Found {len(diagrams)} diagrams, {len(captures)} screengrabs")
for d in diagrams:
    print(f" [{d.diagram_type.value}] {d.description}")
```

---

## ContentAnalyzer

```python
from video_processor.analyzers.content_analyzer import ContentAnalyzer
```

Cross-references transcript and diagram entities for richer knowledge extraction. Merges entities found in different sources and enriches key points with diagram links.

### Constructor

```python
def __init__(self, provider_manager: Optional[ProviderManager] = None)
```

| Parameter | Type | Default | Description |
|---|---|---|---|
| `provider_manager` | `Optional[ProviderManager]` | `None` | Required for LLM-based fuzzy matching |

### cross_reference()

```python
def cross_reference(
    self,
    transcript_entities: List[Entity],
    diagram_entities: List[Entity],
) -> List[Entity]
```

Merge entities from transcripts and diagrams into a unified list with source attribution.

Merge strategy:

1. Index all transcript entities by lowercase name, marked with `source="transcript"`
2. Merge diagram entities: if a name matches, set `source="both"` and combine descriptions/occurrences; otherwise add as `source="diagram"`
3. If a `ProviderManager` is available, use LLM fuzzy matching to find additional matches among unmatched entities (e.g., "PostgreSQL" from transcript matching "Postgres" from diagram)

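Steps 1 and 2 of the merge strategy can be illustrated without any LLM. This is a minimal sketch that uses a stand-in `SketchEntity` dataclass, since the real `Entity` fields are not shown here. The class, its field names, and `merge_by_name` are assumptions for illustration, and the fuzzy-matching step 3 is left out.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class SketchEntity:
    """Stand-in for the real Entity model; field names are assumptions."""
    name: str
    descriptions: List[str] = field(default_factory=list)
    source: str = ""


def merge_by_name(transcript: List[SketchEntity], diagram: List[SketchEntity]) -> List[SketchEntity]:
    merged: Dict[str, SketchEntity] = {}
    # Step 1: index transcript entities by lowercase name.
    for ent in transcript:
        merged[ent.name.lower()] = SketchEntity(ent.name, list(ent.descriptions), "transcript")
    # Step 2: fold in diagram entities; exact name matches become "both".
    for ent in diagram:
        key = ent.name.lower()
        if key in merged:
            merged[key].source = "both"
            merged[key].descriptions.extend(ent.descriptions)
        else:
            merged[key] = SketchEntity(ent.name, list(ent.descriptions), "diagram")
    return list(merged.values())


result = merge_by_name(
    [SketchEntity("Redis", ["cache mentioned in standup"]), SketchEntity("Alice")],
    [SketchEntity("redis", ["cache node"]), SketchEntity("Postgres")],
)
for ent in result:
    print(ent.name, ent.source)
```

In this sketch "PostgreSQL" and "Postgres" would stay separate, because only exact lowercase names match; closing that gap is what the LLM step is described as doing.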
Parameters:

| Parameter | Type | Description |
|---|---|---|
| `transcript_entities` | `List[Entity]` | Entities extracted from transcript |
| `diagram_entities` | `List[Entity]` | Entities extracted from diagrams |

Returns: `List[Entity]` -- merged entity list with `source` attribution.

```python
from video_processor.analyzers.content_analyzer import ContentAnalyzer
from video_processor.models import Entity
from video_processor.providers.manager import ProviderManager

# A ProviderManager enables the LLM fuzzy-matching step
pm = ProviderManager()
analyzer = ContentAnalyzer(provider_manager=pm)

transcript_entities = [
    Entity(name="PostgreSQL", type="technology"),
    Entity(name="Alice", type="person"),
]
diagram_entities = [
    Entity(name="Postgres", type="technology"),
    Entity(name="Redis", type="technology"),
]

merged = analyzer.cross_reference(transcript_entities, diagram_entities)
# "PostgreSQL" and "Postgres" may be fuzzy-matched and merged
```

### enrich_key_points()

```python
def enrich_key_points(
    self,
    key_points: List[KeyPoint],
    diagrams: list,
    transcript_text: str,
) -> List[KeyPoint]
```

Link key points to relevant diagrams by entity overlap. Examines word overlap between key point text and diagram elements/text content.

Parameters:

| Parameter | Type | Description |
|---|---|---|
| `key_points` | `List[KeyPoint]` | Key points to enrich |
| `diagrams` | `list` | List of `DiagramResult` objects or dicts |
| `transcript_text` | `str` | Full transcript text (reserved for future use) |

Returns: `List[KeyPoint]` -- key points with `related_diagrams` indices populated.

A key point is linked to a diagram when they share 2 or more words (excluding short words) between the key point text/details and the diagram's elements/text content.

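The linking rule can be sketched as a set intersection over filtered words. The cutoff for "short words" is not stated here, so this sketch assumes words under 4 characters are ignored. `significant_words` and `is_linked` are hypothetical names used only for illustration.

```python
import re
from typing import Set


def significant_words(text: str, min_length: int = 4) -> Set[str]:
    """Lowercase word set, dropping words shorter than min_length (assumed cutoff)."""
    return {w for w in re.findall(r"[a-z0-9]+", text.lower()) if len(w) >= min_length}


def is_linked(key_point_text: str, diagram_text: str, min_shared: int = 2) -> bool:
    """True when the two texts share at least min_shared significant words."""
    shared = significant_words(key_point_text) & significant_words(diagram_text)
    return len(shared) >= min_shared


diagram_text = "billing service -> postgresql cluster"
print(is_linked("Migrate the billing service to PostgreSQL", diagram_text))
print(is_linked("Schedule the next review", diagram_text))
```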
---

## ActionDetector

```python
from video_processor.analyzers.action_detector import ActionDetector
```

Detects action items from transcripts and diagram content using LLM extraction with a regex pattern fallback.

### Constructor

```python
def __init__(self, provider_manager: Optional[ProviderManager] = None)
```

| Parameter | Type | Default | Description |
|---|---|---|---|
| `provider_manager` | `Optional[ProviderManager]` | `None` | Required for LLM-based extraction |

### detect_from_transcript()

```python
def detect_from_transcript(
    self,
    text: str,
    segments: Optional[List[TranscriptSegment]] = None,
) -> List[ActionItem]
```

Detect action items from transcript text.

Parameters:

| Parameter | Type | Default | Description |
|---|---|---|---|
| `text` | `str` | required | Transcript text to analyze |
| `segments` | `Optional[List[TranscriptSegment]]` | `None` | Transcript segments for timestamp attachment |

Returns: `List[ActionItem]` -- detected action items with `source="transcript"`.

Extraction modes:

- LLM mode (when `provider_manager` is set): Sends the transcript to the LLM with a structured extraction prompt. Extracts action, assignee, deadline, priority, and context.
- Pattern mode (fallback): Matches sentences against regex patterns for action-oriented language.

Pattern matching detects sentences containing:

- "need/needs to", "should/must/shall"
- "will/going to", "action item/todo/follow-up"
- "assigned to/responsible for", "deadline/due by"
- "let's/let us", "make sure/ensure"
- "can you/could you/please"

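To make the pattern mode concrete, here is a minimal sketch of sentence-level matching built from the phrases listed above. The regexes, `ACTION_RE`, and `find_action_sentences` are assumptions written for this example; the patterns in `ActionDetector` may differ in wording and in how sentences are split.

```python
import re
from typing import List

# Assumed patterns modelled on the listed phrases; not the library's actual regexes.
ACTION_PATTERNS = [
    r"\b(?:needs?\s+to|should|must|shall)\b",
    r"\b(?:will|going\s+to)\b",
    r"\b(?:action\s+item|todo|follow[- ]up)\b",
    r"\b(?:assigned\s+to|responsible\s+for)\b",
    r"\b(?:deadline|due\s+by)\b",
    r"\blet(?:'s|\s+us)\b",
    r"\b(?:make\s+sure|ensure)\b",
    r"\b(?:can\s+you|could\s+you|please)\b",
]
ACTION_RE = re.compile("|".join(ACTION_PATTERNS), re.IGNORECASE)


def find_action_sentences(text: str) -> List[str]:
    """Split on sentence-ending punctuation and keep sentences that match a pattern."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    return [s for s in sentences if s and ACTION_RE.search(s)]


sample = (
    "We need to finalize the database schema. "
    "The demo went well. "
    "Please update the deployment scripts."
)
print(find_action_sentences(sample))
```

Broad patterns such as "will" match many non-actionable sentences, which is one reason a pattern mode like this works best as a fallback.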
Timestamp attachment: When `segments` are provided, each action item is matched to the most relevant transcript segment (by word overlap, minimum 3 matching words), and a timestamp is added to `context`.

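The matching described above can be sketched as picking the segment with the largest word overlap and rejecting it when fewer than 3 words match. This is a simplified illustration: `Segment` is a plain `(start, text)` tuple standing in for `TranscriptSegment`, and `best_segment_start` is a hypothetical helper.

```python
import re
from typing import List, Optional, Set, Tuple

# (start time in seconds, text); an assumed stand-in for TranscriptSegment
Segment = Tuple[float, str]


def words(text: str) -> Set[str]:
    """Lowercase word set used for overlap counting."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))


def best_segment_start(action: str, segments: List[Segment], min_overlap: int = 3) -> Optional[float]:
    """Return the start time of the segment sharing the most words with the action."""
    action_words = words(action)
    best_start, best_score = None, 0
    for start, text in segments:
        score = len(action_words & words(text))
        if score > best_score:
            best_start, best_score = start, score
    # Reject weak matches that fall below the minimum overlap.
    return best_start if best_score >= min_overlap else None


segments = [
    (12.0, "so the demo went fine last week"),
    (47.5, "alice needs to update the api docs by friday"),
]
print(best_segment_start("Alice needs to update the API docs", segments))
print(best_segment_start("Order lunch", segments))
```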
### detect_from_diagrams()

```python
def detect_from_diagrams(self, diagrams: list) -> List[ActionItem]
```

Extract action items from diagram text content and elements. Processes each diagram's combined text using either LLM or pattern extraction.

Parameters:

| Parameter | Type | Description |
|---|---|---|
| `diagrams` | `list` | List of `DiagramResult` objects or dicts |

Returns: `List[ActionItem]` -- action items with `source="diagram"`.

### merge_action_items()

```python
def merge_action_items(
    self,
    transcript_items: List[ActionItem],
    diagram_items: List[ActionItem],
) -> List[ActionItem]
```

Merge action items from multiple sources, deduplicating by action text (case-insensitive, whitespace-normalized).

Returns: `List[ActionItem]` -- deduplicated merged list.

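The deduplication key can be sketched as a lowercased, whitespace-collapsed copy of the action text. The example below works on plain strings instead of `ActionItem` objects and assumes the first occurrence is the one kept, which this page does not state. `normalize` and `merge_actions` are hypothetical names.

```python
from typing import Dict, List


def normalize(action: str) -> str:
    """Lowercase and collapse runs of whitespace into single spaces."""
    return " ".join(action.lower().split())


def merge_actions(transcript_actions: List[str], diagram_actions: List[str]) -> List[str]:
    seen: Dict[str, str] = {}
    for action in transcript_actions + diagram_actions:
        # Keep the first occurrence for each normalized key (an assumption).
        seen.setdefault(normalize(action), action)
    return list(seen.values())


print(merge_actions(
    ["Update the API docs", "Review the PR"],
    ["update  the api docs", "Provision staging database"],
))
```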
### Usage example

```python
from video_processor.analyzers.action_detector import ActionDetector
from video_processor.providers.manager import ProviderManager

detector = ActionDetector(provider_manager=ProviderManager())

# From transcript
# `transcript_segments` is assumed to be a List[TranscriptSegment] from an earlier transcription step
transcript_items = detector.detect_from_transcript(
    text="Alice needs to update the API docs by Friday. "
    "Bob should review the PR before merging.",
    segments=transcript_segments,
)

# From diagrams
# `diagram_results` is assumed to be the DiagramResult list returned by DiagramAnalyzer.process_frames()
diagram_items = detector.detect_from_diagrams(diagram_results)

# Merge and deduplicate
all_items = detector.merge_action_items(transcript_items, diagram_items)

for item in all_items:
    print(f"[{item.priority or 'unset'}] {item.action}")
    if item.assignee:
        print(f" Assignee: {item.assignee}")
    if item.deadline:
        print(f" Deadline: {item.deadline}")
```

### Pattern fallback (no LLM)

```python
# Works without any API keys
detector = ActionDetector()  # No provider_manager
items = detector.detect_from_transcript(
    "We need to finalize the database schema. "
    "Please update the deployment scripts."
)
# Returns ActionItems matched by regex patterns
```
