PlanOpticon

docs: comprehensive v0.4.0 documentation — 27 pages, use cases, FAQ

New pages (11):

- guide/companion.md — Interactive Companion REPL
- guide/planning-agent.md — Planning Agent and 11 skills
- guide/knowledge-graphs.md — KG storage, querying, taxonomy, viewer
- guide/authentication.md — OAuth setup for 6 services
- guide/document-ingestion.md — PDF, Markdown, plaintext ingestion
- guide/export.md — 7 markdown doc types, Obsidian, Notion, Wiki, Exchange
- api/agent.md — PlanningAgent, AgentContext, Skills API
- api/sources.md — BaseSource, 21 source connectors
- api/auth.md — AuthConfig, OAuthManager API
- use-cases.md — 10 real-world workflows with full commands
- faq.md — FAQ and troubleshooting guide

Updated pages (10):

- guide/output-formats.md — all output formats including SQLite KG, Exchange
- guide/single-video.md — taxonomy, --speakers, --output-format, post-analysis
- guide/batch.md — fuzzy merge, querying results, incremental processing
- architecture/pipeline.md — 5 mermaid diagrams, all pipelines
- contributing.md — ruff, ProviderRegistry, skills, processors, exporters
- getting-started/configuration.md — full .env example with OAuth walkthroughs
- api/models.md — all 17+ Pydantic models documented
- api/providers.md — BaseProvider, ProviderRegistry, ProviderManager
- api/analyzers.md — DiagramAnalyzer, ContentAnalyzer, ActionDetector
- mkdocs.yml — nav updated with all new pages

Also fixes check-yaml pre-commit hook to handle mkdocs.yml Python tags.

lmata 2026-03-08 00:17 trunk
Commit 3da1f8f9af3d2ae023942b141853a68b784a7fd986e8a1e92f650e182fe78dbd
--- .pre-commit-config.yaml
+++ .pre-commit-config.yaml
@@ -9,9 +9,10 @@
     rev: v5.0.0
     hooks:
       - id: trailing-whitespace
       - id: end-of-file-fixer
      - id: check-yaml
+        args: [--unsafe]
       - id: check-added-large-files
         args: [--maxkb=500]
       - id: check-merge-conflict
       - id: detect-private-key
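For context on the fix above: by default `check-yaml` fully loads each file (equivalent to `yaml.safe_load`), which fails on custom tags, while `--unsafe` switches the hook to a syntax-only parse. An illustrative `mkdocs.yml` fragment of the kind that needs this (a common Material for MkDocs emoji setup, not necessarily this project's exact config):

```yaml
# Custom !!python/name: tags fail yaml.safe_load, so the default
# check-yaml hook rejects them; --unsafe parses for syntax only.
markdown_extensions:
  - pymdownx.emoji:
      emoji_index: !!python/name:material.extensions.emoji.twemoji
      emoji_generator: !!python/name:material.extensions.emoji.to_svg
```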
ADDED docs/api/agent.md
--- a/docs/api/agent.md
+++ b/docs/api/agent.md
@@ -0,0 +1,407 @@
+# Agent API Reference
+
+::: video_processor.agent.agent_loop
+
+::: video_processor.agent.skills.base
+
+::: video_processor.agent.kb_context
+
+---
+
+## Overview
+
+The agent module implements a planning agent that synthesizes knowledge from processed video content into actionable artifacts such as project plans, PRDs, task breakdowns, and roadmaps. The agent operates on knowledge graphs loaded via `KBContext` and uses a skill-based architecture for extensibility.
+
+**Key components:**
+
+- **`PlanningAgent`** -- orchestrates skill selection and execution based on user requests
+- **`AgentContext`** -- shared state passed between skills during execution
+- **`Skill`** (ABC) -- base class for pluggable agent capabilities
+- **`Artifact`** -- output produced by skill execution
+- **`KBContext`** -- loads and merges multiple knowledge graph sources
+
+---
+
+## PlanningAgent
+
+```python
+from video_processor.agent.agent_loop import PlanningAgent
+```
+
+AI agent that synthesizes knowledge into planning artifacts. Uses an LLM to select which skills to execute for a given request, or falls back to keyword matching when no LLM is available.
+
+### Constructor
+
+```python
+def __init__(self, context: AgentContext)
+```
+
+| Parameter | Type | Description |
+|---|---|---|
+| `context` | `AgentContext` | Shared context containing knowledge graph, query engine, and provider |
+
+### from_kb_paths()
+
+```python
+@classmethod
+def from_kb_paths(
+    cls,
+    kb_paths: List[Path],
+    provider_manager=None,
+) -> PlanningAgent
+```
+
+Factory method that creates an agent from one or more knowledge base file paths. Handles loading and merging knowledge graphs automatically.
+
+**Parameters:**
+
+| Parameter | Type | Default | Description |
+|---|---|---|---|
+| `kb_paths` | `List[Path]` | *required* | Paths to `.db` or `.json` knowledge graph files, or directories to search |
+| `provider_manager` | `ProviderManager` | `None` | LLM provider for agent operations |
+
+**Returns:** `PlanningAgent` -- configured agent with loaded knowledge base.
+
+```python
+from pathlib import Path
+from video_processor.agent.agent_loop import PlanningAgent
+from video_processor.providers.manager import ProviderManager
+
+agent = PlanningAgent.from_kb_paths(
+    kb_paths=[Path("results/knowledge_graph.db")],
+    provider_manager=ProviderManager(),
+)
+```
+
+### execute()
+
+```python
+def execute(self, request: str) -> List[Artifact]
+```
+
+Execute a user request by selecting and running appropriate skills.
+
+**Process:**
+
+1. Build a context summary from the knowledge base statistics
+2. Format available skills with their descriptions
+3. Ask the LLM to select skills and parameters (or use keyword matching as fallback)
+4. Execute selected skills in order, accumulating artifacts
+
+**Parameters:**
+
+| Parameter | Type | Description |
+|---|---|---|
+| `request` | `str` | Natural language request (e.g., "Generate a project plan") |
+
+**Returns:** `List[Artifact]` -- generated artifacts from skill execution.
+
+**LLM mode:** The LLM receives the knowledge base summary, available skills, and user request, then returns a JSON array of `{"skill": "name", "params": {}}` objects to execute.
+
+**Keyword fallback:** Without an LLM, skills are matched by splitting the skill name into words and checking if any appear in the request text.
+
+```python
+artifacts = agent.execute("Create a PRD and task breakdown")
+for artifact in artifacts:
+    print(f"--- {artifact.name} ({artifact.artifact_type}) ---")
+    print(artifact.content[:500])
+```
+
+### chat()
+
+```python
+def chat(self, message: str) -> str
+```
+
+Interactive chat mode. Maintains conversation history and provides contextual responses about the loaded knowledge base.
+
+**Parameters:**
+
+| Parameter | Type | Description |
+|---|---|---|
+| `message` | `str` | User message |
+
+**Returns:** `str` -- assistant response.
+
+The chat mode provides the LLM with:
+
+- Knowledge base statistics (entity counts, relationship counts)
+- List of previously generated artifacts
+- Full conversation history
+- Available REPL commands (e.g., `/entities`, `/search`, `/plan`, `/export`)
+
+**Requires** a configured `provider_manager`. Returns a static error message if no LLM is available.
+
+```python
+response = agent.chat("What technologies were discussed in the meetings?")
+print(response)
+
+response = agent.chat("Which of those have the most dependencies?")
+print(response)
+```
+
+---
+
+## AgentContext
+
+```python
+from video_processor.agent.skills.base import AgentContext
+```
+
+Shared state dataclass passed to all skills during execution. Accumulates artifacts and conversation history across the agent session.
+
+| Field | Type | Default | Description |
+|---|---|---|---|
+| `knowledge_graph` | `Any` | `None` | `KnowledgeGraph` instance |
+| `query_engine` | `Any` | `None` | `GraphQueryEngine` instance for querying the KG |
+| `provider_manager` | `Any` | `None` | `ProviderManager` instance for LLM calls |
+| `planning_entities` | `List[Any]` | `[]` | Extracted `PlanningEntity` instances |
+| `user_requirements` | `Dict[str, Any]` | `{}` | User-specified requirements and constraints |
+| `conversation_history` | `List[Dict[str, str]]` | `[]` | Chat message history (`role`, `content` dicts) |
+| `artifacts` | `List[Artifact]` | `[]` | Previously generated artifacts |
+| `config` | `Dict[str, Any]` | `{}` | Additional configuration |
+
+```python
+from video_processor.agent.skills.base import AgentContext
+
+context = AgentContext(
+    knowledge_graph=kg,
+    query_engine=engine,
+    provider_manager=pm,
+    config={"output_format": "markdown"},
+)
+```
+
+---
+
+## Skill (ABC)
+
+```python
+from video_processor.agent.skills.base import Skill
+```
+
+Base class for agent skills. Each skill represents a discrete capability that produces an artifact from the agent context.
+
+**Class attributes:**
+
+| Attribute | Type | Description |
+|---|---|---|
+| `name` | `str` | Skill identifier (e.g., `"project_plan"`, `"prd"`) |
+| `description` | `str` | Human-readable description shown to the LLM for skill selection |
+
+### execute()
+
+```python
+@abstractmethod
+def execute(self, context: AgentContext, **kwargs) -> Artifact
+```
+
+Execute this skill and return an artifact. Receives the shared agent context and any parameters selected by the LLM planner.
+
+### can_execute()
+
+```python
+def can_execute(self, context: AgentContext) -> bool
+```
+
+Check if this skill can execute given the current context. The default implementation requires both `knowledge_graph` and `provider_manager` to be set. Override for skills with different requirements.
+
+**Returns:** `bool`
+
+### Implementing a custom skill
+
+```python
+from video_processor.agent.skills.base import Skill, Artifact, AgentContext, register_skill
+
+class SummarySkill(Skill):
+    name = "summary"
+    description = "Generate a concise summary of the knowledge base"
+
+    def execute(self, context: AgentContext, **kwargs) -> Artifact:
+        stats = context.query_engine.stats()
+        prompt = f"Summarize this knowledge base:\n{stats.to_text()}"
+        content = context.provider_manager.chat(
+            [{"role": "user", "content": prompt}]
+        )
+        return Artifact(
+            name="Knowledge Base Summary",
+            content=content,
+            artifact_type="document",
+            format="markdown",
+        )
+
+    def can_execute(self, context: AgentContext) -> bool:
+        return context.query_engine is not None and context.provider_manager is not None
+
+# Register the skill so the agent can discover it
+register_skill(SummarySkill())
+```
+
+---
+
+## Artifact
+
+```python
+from video_processor.agent.skills.base import Artifact
+```
+
+Dataclass representing the output of a skill execution.
+
+| Field | Type | Default | Description |
+|---|---|---|---|
+| `name` | `str` | *required* | Human-readable artifact name |
+| `content` | `str` | *required* | Generated content (Markdown, JSON, Mermaid, etc.) |
+| `artifact_type` | `str` | *required* | Type: `"project_plan"`, `"prd"`, `"roadmap"`, `"task_list"`, `"document"`, `"issues"` |
+| `format` | `str` | `"markdown"` | Content format: `"markdown"`, `"json"`, `"mermaid"` |
+| `metadata` | `Dict[str, Any]` | `{}` | Additional metadata |
+
+---
+
+## Skill Registry Functions
+
+### register_skill()
+
+```python
+def register_skill(skill: Skill) -> None
+```
+
+Register a skill instance in the global registry. Skills must be registered before the agent can discover and execute them.
+
+### get_skill()
+
+```python
+def get_skill(name: str) -> Optional[Skill]
+```
+
+Look up a registered skill by name.
+
+**Returns:** `Optional[Skill]` -- the skill instance, or `None` if not found.
+
+### list_skills()
+
+```python
+def list_skills() -> List[Skill]
+```
+
+Return all registered skill instances.
+
+---
+
+## KBContext
+
+```python
+from video_processor.agent.kb_context import KBContext
+```
+
+Loads and merges multiple knowledge graph sources into a unified context for agent consumption. Supports both FalkorDB (`.db`) and JSON (`.json`) formats, and can auto-discover graphs in a directory tree.
+
+### Constructor
+
+```python
+def __init__(self)
+```
+
+Creates an empty context. Use `add_source()` to add knowledge graph paths, then `load()` to initialize.
+
+### add_source()
+
+```python
+def add_source(self, path) -> None
+```
+
+Add a knowledge graph source.
+
+**Parameters:**
+
+| Parameter | Type | Description |
+|---|---|---|
+| `path` | `str \| Path` | Path to a `.db` file, `.json` file, or directory to search for knowledge graphs |
+
+If `path` is a directory, it is searched recursively for knowledge graph files using `find_knowledge_graphs()`.
+
+**Raises:** `FileNotFoundError` if the path does not exist.
+
+### load()
+
+```python
+def load(self, provider_manager=None) -> KBContext
+```
+
+Load and merge all added sources into a single knowledge graph and query engine.
+
+**Parameters:**
+
+| Parameter | Type | Default | Description |
+|---|---|---|---|
+| `provider_manager` | `ProviderManager` | `None` | LLM provider for the knowledge graph and query engine |
+
+**Returns:** `KBContext` -- self, for method chaining.
+
+### Properties
+
+| Property | Type | Description |
+|---|---|---|
+| `knowledge_graph` | `KnowledgeGraph` | The merged knowledge graph (raises `RuntimeError` if not loaded) |
+| `query_engine` | `GraphQueryEngine` | Query engine for the merged graph (raises `RuntimeError` if not loaded) |
+| `sources` | `List[Path]` | List of resolved source paths |
+
+### summary()
+
+```python
+def summary(self) -> str
+```
+
+Generate a brief text summary of the loaded knowledge base, including entity counts by type and relationship counts.
+
+**Returns:** `str` -- multi-line summary text.
+
+### auto_discover()
+
+```python
+@classmethod
+def auto_discover(
+    cls,
+    start_dir: Optional[Path] = None,
+    provider_manager=None,
+) -> KBContext
+```
+
+Factory method that creates a `KBContext` by auto-discovering knowledge graphs near `start_dir` (defaults to current directory).
+
+**Returns:** `KBContext` -- loaded context (may have zero sources if none found).
+
+### Usage examples
+
+```python
+from pathlib import Path
+from video_processor.agent.kb_context import KBContext
+
+# Manual source management
+kb = KBContext()
+kb.add_source(Path("project_a/knowledge_graph.db"))
+kb.add_source(Path("project_b/results/"))  # searches directory
+kb.load(provider_manager=pm)
+
+print(kb.summary())
+# Knowledge base: 3 source(s)
+# Entities: 142
+# Relationships: 89
+# Entity types:
+#   technology: 45
+#   person: 23
+#   concept: 74
+
+# Auto-discover from current directory
+kb = KBContext.auto_discover()
+
+# Use with the agent
+from video_processor.agent.agent_loop import PlanningAgent
+from video_processor.agent.skills.base import AgentContext
+
+context = AgentContext(
+    knowledge_graph=kb.knowledge_graph,
+    query_engine=kb.query_engine,
+    provider_manager=pm,
+)
+agent = PlanningAgent(context)
+```
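The keyword fallback documented for `execute()` above can be sketched in a few lines of Python. This is a hypothetical standalone helper (`match_skills_by_keyword` is not part of the API; the real selection logic lives inside `PlanningAgent`):

```python
def match_skills_by_keyword(request: str, skill_names: list[str]) -> list[str]:
    """Select skills whose name words appear in the request text.

    Mirrors the documented fallback: split each skill name (e.g.
    "project_plan") into words and keep the skill if any word
    occurs in the lowercased request.
    """
    request_lower = request.lower()
    selected = []
    for name in skill_names:
        words = name.replace("_", " ").split()
        if any(word in request_lower for word in words):
            selected.append(name)
    return selected

# "Create a PRD and task breakdown" matches "prd" and "task_list"
matches = match_skills_by_keyword(
    "Create a PRD and task breakdown",
    ["project_plan", "prd", "roadmap", "task_list"],
)
```

Note the substring test is deliberately loose: "plan" would also match inside "planning", which is the trade-off of a zero-dependency fallback.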
--- docs/api/analyzers.md
+++ docs/api/analyzers.md
@@ -3,5 +3,387 @@
33
::: video_processor.analyzers.diagram_analyzer
44
55
::: video_processor.analyzers.content_analyzer
66
77
::: video_processor.analyzers.action_detector
8
+
9
+---
10
+
11
+## Overview
12
+
13
+The analyzers module contains the core content extraction logic for PlanOpticon. These analyzers process video frames and transcripts to extract structured knowledge: diagrams, key points, action items, and cross-referenced entities.
14
+
15
+All analyzers accept an optional `ProviderManager` instance. When provided, they use LLM capabilities for richer extraction. Without one, they fall back to heuristic/pattern-based methods where possible.
16
+
17
+---
18
+
19
+## DiagramAnalyzer
20
+
21
+```python
22
+from video_processor.analyzers.diagram_analyzer import DiagramAnalyzer
23
+```
24
+
25
+Vision model-based diagram detection and analysis. Classifies video frames as diagrams, slides, screenshots, or other content, then performs full extraction on high-confidence frames.
+
+### Constructor
+
+```python
+def __init__(
+    self,
+    provider_manager: Optional[ProviderManager] = None,
+    confidence_threshold: float = 0.3,
+)
+```
+
+| Parameter | Type | Default | Description |
+|---|---|---|---|
+| `provider_manager` | `Optional[ProviderManager]` | `None` | LLM provider (creates a default if not provided) |
+| `confidence_threshold` | `float` | `0.3` | Minimum confidence to process a frame at all |
+
+### classify_frame()
+
+```python
+def classify_frame(self, image_path: Union[str, Path]) -> dict
+```
+
+Classify a single frame using a vision model. Determines whether the frame contains a diagram, slide, or other visual content worth extracting.
+
+**Parameters:**
+
+| Parameter | Type | Description |
+|---|---|---|
+| `image_path` | `Union[str, Path]` | Path to the frame image file |
+
+**Returns:** `dict` with the following keys:
+
+| Key | Type | Description |
+|---|---|---|
+| `is_diagram` | `bool` | Whether the frame contains extractable content |
+| `diagram_type` | `str` | One of: `flowchart`, `sequence`, `architecture`, `whiteboard`, `chart`, `table`, `slide`, `screenshot`, `unknown` |
+| `confidence` | `float` | Detection confidence from 0.0 to 1.0 |
+| `content_type` | `str` | Content category: `slide`, `diagram`, `document`, `screen_share`, `whiteboard`, `chart`, `person`, `other` |
+| `brief_description` | `str` | One-sentence description of the frame content |
+
+**Important:** Frames showing people, webcam feeds, or video conference participant views return `confidence: 0.0`. The classifier is tuned to detect only shared/presented content.
+
+```python
+analyzer = DiagramAnalyzer()
+result = analyzer.classify_frame("/path/to/frame_042.jpg")
+if result["confidence"] >= 0.7:
+    print(f"Diagram detected: {result['diagram_type']}")
+```
+
+### analyze_diagram_single_pass()
+
+```python
+def analyze_diagram_single_pass(self, image_path: Union[str, Path]) -> dict
+```
+
+Full single-pass diagram analysis. Extracts description, text content, elements, relationships, Mermaid syntax, and chart data in a single LLM call.
+
+**Returns:** `dict` with the following keys:
+
+| Key | Type | Description |
+|---|---|---|
+| `diagram_type` | `str` | Diagram classification |
+| `description` | `str` | Detailed description of the visual content |
+| `text_content` | `str` | All visible text, preserving structure |
+| `elements` | `list[str]` | Identified elements/components |
+| `relationships` | `list[str]` | Relationships in `"A -> B: label"` format |
+| `mermaid` | `str` | Valid Mermaid diagram syntax |
+| `chart_data` | `dict \| None` | Chart data with `labels`, `values`, `chart_type` (only for data charts) |
+
+Returns an empty `dict` on failure.
+
+### caption_frame()
+
+```python
+def caption_frame(self, image_path: Union[str, Path]) -> str
+```
+
+Get a brief 1-2 sentence caption for a frame. Used as a fallback when full diagram analysis is not warranted.
+
+**Returns:** `str` -- a brief description of the frame content.
+
+### process_frames()
+
+```python
+def process_frames(
+    self,
+    frame_paths: List[Union[str, Path]],
+    diagrams_dir: Optional[Path] = None,
+    captures_dir: Optional[Path] = None,
+) -> Tuple[List[DiagramResult], List[ScreenCapture]]
+```
+
+Process a batch of extracted video frames through the full classification and analysis pipeline.
+
+**Parameters:**
+
+| Parameter | Type | Default | Description |
+|---|---|---|---|
+| `frame_paths` | `List[Union[str, Path]]` | *required* | Paths to frame images |
+| `diagrams_dir` | `Optional[Path]` | `None` | Output directory for diagram files (images, mermaid, JSON) |
+| `captures_dir` | `Optional[Path]` | `None` | Output directory for screengrab fallback files |
+
+**Returns:** `Tuple[List[DiagramResult], List[ScreenCapture]]`
+
+**Confidence thresholds:**
+
+| Confidence Range | Action |
+|---|---|
+| >= 0.7 | Full diagram analysis -- extracts elements, relationships, Mermaid syntax |
+| 0.3 to 0.7 | Screengrab fallback -- saves frame with a brief caption |
+| < 0.3 | Skipped entirely |
+
+**Output files (when directories are provided):**
+
+For diagrams (`diagrams_dir`):
+
+- `diagram_N.jpg` -- original frame image
+- `diagram_N.mermaid` -- Mermaid source (if generated)
+- `diagram_N.json` -- full DiagramResult as JSON
+
+For screen captures (`captures_dir`):
+
+- `capture_N.jpg` -- original frame image
+- `capture_N.json` -- ScreenCapture metadata as JSON
+
+```python
+from pathlib import Path
+from video_processor.analyzers.diagram_analyzer import DiagramAnalyzer
+from video_processor.providers.manager import ProviderManager
+
+analyzer = DiagramAnalyzer(
+    provider_manager=ProviderManager(),
+    confidence_threshold=0.3,
+)
+
+frame_paths = list(Path("output/frames").glob("*.jpg"))
+diagrams, captures = analyzer.process_frames(
+    frame_paths,
+    diagrams_dir=Path("output/diagrams"),
+    captures_dir=Path("output/captures"),
+)
+
+print(f"Found {len(diagrams)} diagrams, {len(captures)} screengrabs")
+for d in diagrams:
+    print(f"  [{d.diagram_type.value}] {d.description}")
+```
+
+---
+
+## ContentAnalyzer
+
+```python
+from video_processor.analyzers.content_analyzer import ContentAnalyzer
+```
+
+Cross-references transcript and diagram entities for richer knowledge extraction. Merges entities found in different sources and enriches key points with diagram links.
+
+### Constructor
+
+```python
+def __init__(self, provider_manager: Optional[ProviderManager] = None)
+```
+
+| Parameter | Type | Default | Description |
+|---|---|---|---|
+| `provider_manager` | `Optional[ProviderManager]` | `None` | Required for LLM-based fuzzy matching |
+
+### cross_reference()
+
+```python
+def cross_reference(
+    self,
+    transcript_entities: List[Entity],
+    diagram_entities: List[Entity],
+) -> List[Entity]
+```
+
+Merge entities from transcripts and diagrams into a unified list with source attribution.
+
+**Merge strategy:**
+
+1. Index all transcript entities by lowercase name, marked with `source="transcript"`
+2. Merge diagram entities: if a name matches, set `source="both"` and combine descriptions/occurrences; otherwise add as `source="diagram"`
+3. If a `ProviderManager` is available, use LLM fuzzy matching to find additional matches among unmatched entities (e.g., "PostgreSQL" from transcript matching "Postgres" from diagram)
+
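Steps 1 and 2 of this strategy are essentially a name-keyed dictionary merge. A minimal standalone sketch (plain dicts stand in for the real `Entity` model, and the description/occurrence combining is omitted):

```python
def merge_entities(transcript_entities, diagram_entities):
    # Step 1: index transcript entities by lowercase name
    merged = {}
    for e in transcript_entities:
        merged[e["name"].lower()] = {**e, "source": "transcript"}
    # Step 2: exact-name matches become source="both";
    # everything else is added as diagram-only
    for e in diagram_entities:
        key = e["name"].lower()
        if key in merged:
            merged[key]["source"] = "both"
        else:
            merged[key] = {**e, "source": "diagram"}
    return list(merged.values())

entities = merge_entities(
    [{"name": "Redis"}, {"name": "Alice"}],
    [{"name": "redis"}, {"name": "Kafka"}],
)
# "Redis"/"redis" collapse to source="both"; "Kafka" stays diagram-only
```

Step 3 (LLM fuzzy matching) only runs over the entities left unmatched after this exact-name pass.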
+**Parameters:**
+
+| Parameter | Type | Description |
+|---|---|---|
+| `transcript_entities` | `List[Entity]` | Entities extracted from transcript |
+| `diagram_entities` | `List[Entity]` | Entities extracted from diagrams |
+
+**Returns:** `List[Entity]` -- merged entity list with `source` attribution.
+
+```python
+from video_processor.analyzers.content_analyzer import ContentAnalyzer
+from video_processor.models import Entity
+
+analyzer = ContentAnalyzer(provider_manager=pm)
+
+transcript_entities = [
+    Entity(name="PostgreSQL", type="technology"),
+    Entity(name="Alice", type="person"),
+]
+diagram_entities = [
+    Entity(name="Postgres", type="technology"),
+    Entity(name="Redis", type="technology"),
+]
+
+merged = analyzer.cross_reference(transcript_entities, diagram_entities)
+# "PostgreSQL" and "Postgres" may be fuzzy-matched and merged
+```
+
+### enrich_key_points()
+
+```python
+def enrich_key_points(
+    self,
+    key_points: List[KeyPoint],
+    diagrams: list,
+    transcript_text: str,
+) -> List[KeyPoint]
+```
+
+Link key points to relevant diagrams by entity overlap. Examines word overlap between key point text and diagram elements/text content.
+
+**Parameters:**
+
+| Parameter | Type | Description |
+|---|---|---|
+| `key_points` | `List[KeyPoint]` | Key points to enrich |
+| `diagrams` | `list` | List of `DiagramResult` objects or dicts |
+| `transcript_text` | `str` | Full transcript text (reserved for future use) |
+
+**Returns:** `List[KeyPoint]` -- key points with `related_diagrams` indices populated.
+
+A key point is linked to a diagram when they share 2 or more words (excluding short words) between the key point text/details and the diagram's elements/text content.
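That overlap rule can be approximated in a few lines (a simplified sketch; the exact tokenization and short-word cutoff in the implementation may differ):

```python
def significant_words(text, min_len=4):
    # Drop short words so "the"/"and" cannot create spurious links
    return {w.lower() for w in text.split() if len(w) >= min_len}

def related_diagram_indices(key_point_text, diagram_texts, min_overlap=2):
    # Link the key point to every diagram sharing >= min_overlap words
    kp_words = significant_words(key_point_text)
    return [
        i for i, text in enumerate(diagram_texts)
        if len(kp_words & significant_words(text)) >= min_overlap
    ]

related_diagram_indices(
    "Deploy the ingestion service behind the gateway",
    ["ingestion service gateway diagram", "billing database schema"],
)
# -> [0]
```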
+
+---
+
+## ActionDetector
+
+```python
+from video_processor.analyzers.action_detector import ActionDetector
+```
+
+Detects action items from transcripts and diagram content using LLM extraction with a regex pattern fallback.
+
+### Constructor
+
+```python
+def __init__(self, provider_manager: Optional[ProviderManager] = None)
+```
+
+| Parameter | Type | Default | Description |
+|---|---|---|---|
+| `provider_manager` | `Optional[ProviderManager]` | `None` | Required for LLM-based extraction |
+
+### detect_from_transcript()
+
+```python
+def detect_from_transcript(
+    self,
+    text: str,
+    segments: Optional[List[TranscriptSegment]] = None,
+) -> List[ActionItem]
+```
+
+Detect action items from transcript text.
+
+**Parameters:**
+
+| Parameter | Type | Default | Description |
+|---|---|---|---|
+| `text` | `str` | *required* | Transcript text to analyze |
+| `segments` | `Optional[List[TranscriptSegment]]` | `None` | Transcript segments for timestamp attachment |
+
+**Returns:** `List[ActionItem]` -- detected action items with `source="transcript"`.
+
+**Extraction modes:**
+
+- **LLM mode** (when `provider_manager` is set): Sends the transcript to the LLM with a structured extraction prompt. Extracts action, assignee, deadline, priority, and context.
+- **Pattern mode** (fallback): Matches sentences against regex patterns for action-oriented language.
+
+**Pattern matching** detects sentences containing:
+
+- "need/needs to", "should/must/shall"
+- "will/going to", "action item/todo/follow-up"
+- "assigned to/responsible for", "deadline/due by"
+- "let's/let us", "make sure/ensure"
+- "can you/could you/please"
+
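A rough regex equivalent of these triggers (illustrative only; the actual patterns in the source may be structured and anchored differently):

```python
import re

# Hypothetical approximation of the fallback trigger phrases listed above
ACTION_PATTERN = re.compile(
    r"\b(?:needs? to|should|must|shall|will|going to|action item|todo|"
    r"follow-up|assigned to|responsible for|deadline|due by|let's|let us|"
    r"make sure|ensure|can you|could you|please)\b",
    re.IGNORECASE,
)

def looks_like_action(sentence):
    # A sentence is a candidate action item if any trigger phrase appears
    return bool(ACTION_PATTERN.search(sentence))

looks_like_action("Bob should review the PR")  # -> True
looks_like_action("The demo went well")        # -> False
```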
+**Timestamp attachment:** When `segments` are provided, each action item is matched to the most relevant transcript segment (by word overlap, minimum 3 matching words), and a timestamp is added to `context`.
+
+### detect_from_diagrams()
+
+```python
+def detect_from_diagrams(self, diagrams: list) -> List[ActionItem]
+```
+
+Extract action items from diagram text content and elements. Processes each diagram's combined text using either LLM or pattern extraction.
+
+**Parameters:**
+
+| Parameter | Type | Description |
+|---|---|---|
+| `diagrams` | `list` | List of `DiagramResult` objects or dicts |
+
+**Returns:** `List[ActionItem]` -- action items with `source="diagram"`.
+
+### merge_action_items()
+
+```python
+def merge_action_items(
+    self,
+    transcript_items: List[ActionItem],
+    diagram_items: List[ActionItem],
+) -> List[ActionItem]
+```
+
+Merge action items from multiple sources, deduplicating by action text (case-insensitive, whitespace-normalized).
+
+**Returns:** `List[ActionItem]` -- deduplicated merged list.
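The dedup key amounts to lowercasing and collapsing whitespace; sketched standalone here with plain strings standing in for `ActionItem` objects:

```python
def dedup_key(action_text):
    # Case-insensitive, whitespace-normalized comparison key
    return " ".join(action_text.lower().split())

def merge_actions(*item_lists):
    # Keep the first occurrence of each normalized action text
    seen, merged = set(), []
    for items in item_lists:
        for action in items:
            key = dedup_key(action)
            if key not in seen:
                seen.add(key)
                merged.append(action)
    return merged

merge_actions(
    ["Update the API docs", "Review the PR"],
    ["update  the api docs", "Ship v0.4.0"],
)
# -> ['Update the API docs', 'Review the PR', 'Ship v0.4.0']
```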
+
+### Usage example
+
+```python
+from video_processor.analyzers.action_detector import ActionDetector
+from video_processor.providers.manager import ProviderManager
+
+detector = ActionDetector(provider_manager=ProviderManager())
+
+# From transcript
+transcript_items = detector.detect_from_transcript(
+    text="Alice needs to update the API docs by Friday. "
+         "Bob should review the PR before merging.",
+    segments=transcript_segments,
+)
+
+# From diagrams
+diagram_items = detector.detect_from_diagrams(diagram_results)
+
+# Merge and deduplicate
+all_items = detector.merge_action_items(transcript_items, diagram_items)
+
+for item in all_items:
+    print(f"[{item.priority or 'unset'}] {item.action}")
+    if item.assignee:
+        print(f"  Assignee: {item.assignee}")
+    if item.deadline:
+        print(f"  Deadline: {item.deadline}")
+```
+
+### Pattern fallback (no LLM)
+
+```python
+# Works without any API keys
+detector = ActionDetector()  # No provider_manager
+items = detector.detect_from_transcript(
+    "We need to finalize the database schema. "
+    "Please update the deployment scripts."
+)
+# Returns ActionItems matched by regex patterns
+```
ADDED docs/api/auth.md
--- a/docs/api/auth.md
+++ b/docs/api/auth.md
@@ -0,0 +1,377 @@
+# Auth API Reference
+
+::: video_processor.auth
+
+---
+
+## Overview
+
+The `video_processor.auth` module provides a unified OAuth and authentication strategy for all PlanOpticon source connectors. It supports multiple authentication methods tried in a consistent order:
+
+1. **Saved token** -- load from disk, auto-refresh if expired
+2. **Client Credentials** -- server-to-server OAuth (e.g., Zoom S2S)
+3. **OAuth 2.0 PKCE** -- interactive Authorization Code flow with PKCE
+4. **API key fallback** -- environment variable lookup
+
+Tokens are persisted to `~/.planopticon/` and automatically refreshed on expiry.
+
+---
+
+## AuthConfig
+
+```python
+from video_processor.auth import AuthConfig
+```
+
+Dataclass configuring authentication for a specific service. Defines OAuth endpoints, client credentials, API key fallback, scopes, and token storage.
+
+### Fields
+
+| Field | Type | Default | Description |
+|---|---|---|---|
+| `service` | `str` | *required* | Service identifier (e.g., `"zoom"`, `"notion"`) |
+| `oauth_authorize_url` | `Optional[str]` | `None` | OAuth authorization endpoint URL |
+| `oauth_token_url` | `Optional[str]` | `None` | OAuth token exchange endpoint URL |
+| `client_id` | `Optional[str]` | `None` | OAuth client ID (direct value) |
+| `client_secret` | `Optional[str]` | `None` | OAuth client secret (direct value) |
+| `client_id_env` | `Optional[str]` | `None` | Environment variable for client ID |
+| `client_secret_env` | `Optional[str]` | `None` | Environment variable for client secret |
+| `api_key_env` | `Optional[str]` | `None` | Environment variable for API key fallback |
+| `scopes` | `List[str]` | `[]` | OAuth scopes to request |
+| `redirect_uri` | `str` | `"urn:ietf:wg:oauth:2.0:oob"` | Redirect URI for auth code flow |
+| `account_id` | `Optional[str]` | `None` | Account ID for client credentials grant (direct value) |
+| `account_id_env` | `Optional[str]` | `None` | Environment variable for account ID |
+| `token_path` | `Optional[Path]` | `None` | Custom token storage path |
+
+### Resolved Properties
+
+These properties resolve values by checking the direct field first, then falling back to the environment variable.
+
+| Property | Return Type | Description |
+|---|---|---|
+| `resolved_client_id` | `Optional[str]` | Client ID from `client_id` or `os.environ[client_id_env]` |
+| `resolved_client_secret` | `Optional[str]` | Client secret from `client_secret` or `os.environ[client_secret_env]` |
+| `resolved_api_key` | `Optional[str]` | API key from `os.environ[api_key_env]` |
+| `resolved_account_id` | `Optional[str]` | Account ID from `account_id` or `os.environ[account_id_env]` |
+| `resolved_token_path` | `Path` | Token file path: `token_path` or `~/.planopticon/{service}_token.json` |
+| `supports_oauth` | `bool` | `True` if both `oauth_authorize_url` and `oauth_token_url` are set |
+
+```python
+from video_processor.auth import AuthConfig
+
+config = AuthConfig(
+    service="notion",
+    oauth_authorize_url="https://api.notion.com/v1/oauth/authorize",
+    oauth_token_url="https://api.notion.com/v1/oauth/token",
+    client_id_env="NOTION_CLIENT_ID",
+    client_secret_env="NOTION_CLIENT_SECRET",
+    api_key_env="NOTION_API_KEY",
+    scopes=["read_content"],
+)
+
+# Check resolved values
+print(config.resolved_client_id)   # From NOTION_CLIENT_ID env var
+print(config.supports_oauth)       # True
+print(config.resolved_token_path)  # ~/.planopticon/notion_token.json
+```
+
+---
+
+## AuthResult
+
+```python
+from video_processor.auth import AuthResult
+```
+
+Dataclass representing the result of an authentication attempt.
+
+| Field | Type | Default | Description |
+|---|---|---|---|
+| `success` | `bool` | *required* | Whether authentication succeeded |
+| `access_token` | `Optional[str]` | `None` | The access token (if successful) |
+| `method` | `Optional[str]` | `None` | Auth method used: `"saved_token"`, `"oauth_pkce"`, `"client_credentials"`, `"api_key"` |
+| `expires_at` | `Optional[float]` | `None` | Token expiration as Unix timestamp |
+| `refresh_token` | `Optional[str]` | `None` | OAuth refresh token (if available) |
+| `error` | `Optional[str]` | `None` | Error message (if failed) |
+
+```python
+result = manager.authenticate()
+if result.success:
+    print(f"Authenticated via {result.method}")
+    print(f"Token: {result.access_token[:20]}...")
+    if result.expires_at:
+        import time
+        remaining = result.expires_at - time.time()
+        print(f"Expires in {remaining/60:.0f} minutes")
+else:
+    print(f"Auth failed: {result.error}")
+```
109
+
110
+---
111
+
112
+## OAuthManager
113
+
114
+```python
115
+from video_processor.auth import OAuthManager
116
+```
117
+
118
+Manages the full authentication lifecycle for a service. Tries auth methods in priority order and handles token persistence, refresh, and PKCE flow.
119
+
120
+### Constructor
121
+
122
+```python
123
+def __init__(self, config: AuthConfig)
124
+```
125
+
126
+| Parameter | Type | Description |
127
+|---|---|---|
128
+| `config` | `AuthConfig` | Authentication configuration for the target service |
129
+
130
+### authenticate()
131
+
132
+```python
133
+def authenticate(self) -> AuthResult
134
+```
135
+
136
+Run the full auth chain and return the result. Methods are tried in order:
137
+
138
+1. **Saved token** -- checks `~/.planopticon/{service}_token.json`, refreshes if expired
139
+2. **Client Credentials** -- if `account_id` is set and OAuth is configured, uses the client credentials grant (server-to-server)
140
+3. **OAuth PKCE** -- if OAuth is configured and client ID is available, opens a browser for interactive authorization with PKCE
141
+4. **API key** -- falls back to the environment variable specified in `api_key_env`
142
+
143
+**Returns:** `AuthResult` -- success/failure with token and method details.
144
+
145
+If all methods fail, returns an `AuthResult` with `success=False` and a helpful error message listing which environment variables to set.
146
+
147
+### get_token()
148
+
149
+```python
150
+def get_token(self) -> Optional[str]
151
+```
152
+
153
+Convenience method: run `authenticate()` and return just the access token string.
154
+
155
+**Returns:** `Optional[str]` -- the access token, or `None` if authentication failed.
156
+
157
+### clear_token()
158
+
159
+```python
160
+def clear_token(self) -> None
161
+```
162
+
163
+Remove the saved token file for this service (effectively a logout). The next `authenticate()` call will require re-authentication.
164
+
165
+---
166
+
167
+## Authentication Flows
168
+
169
+### Saved Token (auto-refresh)
170
+
171
+Tokens are saved to `~/.planopticon/{service}_token.json` as JSON. On each `authenticate()` call, the saved token is loaded and checked:
172
+
173
+- If the token has not expired (`time.time() < expires_at`), it is returned immediately
174
+- If expired but a refresh token is available, the manager attempts to refresh using the OAuth token endpoint
175
+- The refreshed token is saved back to disk
176
+
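The expiry check above is small enough to sketch with the standard library. A simplified, stand-alone illustration (the helper name `load_valid_token` is ours, not part of the module):

```python
import json
import time
from pathlib import Path
from typing import Optional

def load_valid_token(token_path: Path) -> Optional[str]:
    """Return the saved access token if the file exists and is unexpired."""
    if not token_path.exists():
        return None
    data = json.loads(token_path.read_text())
    expires_at = data.get("expires_at")
    if expires_at is not None and time.time() >= expires_at:
        return None  # Expired: the manager would attempt a refresh next
    return data.get("access_token")
```

Treating a missing `expires_at` as non-expiring is an assumption of this sketch.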
### Client Credentials Grant

Used for server-to-server authentication (e.g., Zoom Server-to-Server OAuth). Requires `account_id`, `client_id`, and `client_secret`. Sends a POST to the token endpoint with `grant_type=account_credentials`.

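The token request can be sketched as follows. Where the client credentials go (an HTTP Basic header, as Zoom expects, vs. the form body) varies by provider, so treat the header scheme here as an assumption; the helper name is illustrative:

```python
import base64
from urllib.parse import urlencode

def build_client_credentials_request(token_url: str, client_id: str,
                                     client_secret: str, account_id: str):
    """Build (url, headers, body) for an account-credentials token request."""
    body = urlencode({"grant_type": "account_credentials", "account_id": account_id})
    creds = base64.b64encode(f"{client_id}:{client_secret}".encode()).decode()
    headers = {
        "Authorization": f"Basic {creds}",
        "Content-Type": "application/x-www-form-urlencoded",
    }
    return token_url, headers, body
```

POST this body to the token endpoint and parse `access_token` and `expires_in` from the JSON response.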
### OAuth 2.0 Authorization Code with PKCE

Interactive flow for user authentication:

1. Generates a PKCE code verifier and S256 challenge
2. Constructs the authorization URL with client ID, redirect URI, scopes, and PKCE challenge
3. Opens the URL in the user's browser
4. Prompts the user to paste the authorization code
5. Exchanges the code for tokens at the token endpoint
6. Saves the tokens to disk

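Step 1 is fully specified by RFC 7636 and can be sketched with the standard library (the helper name is illustrative, not the module's actual function):

```python
import base64
import hashlib
import secrets

def make_pkce_pair():
    """Generate a PKCE code verifier and its S256 challenge (RFC 7636)."""
    # 32 random bytes, base64url-encoded without padding -> 43-char verifier
    verifier = base64.urlsafe_b64encode(secrets.token_bytes(32)).rstrip(b"=").decode("ascii")
    # S256 challenge: base64url(SHA-256(verifier)), again without padding
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    challenge = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return verifier, challenge
```

The challenge goes into the authorization URL (`code_challenge`, `code_challenge_method=S256`); the verifier is sent in step 5 when exchanging the code.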
### API Key Fallback

If no OAuth flow succeeds, falls back to checking the environment variable specified in `api_key_env`. Returns the value directly as the access token.

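A minimal sketch of the fallback (illustrative only; treating an empty variable as unset is an assumption of this sketch):

```python
import os
from typing import Optional

def api_key_fallback(api_key_env: Optional[str]) -> Optional[str]:
    """Return the configured API key env var's value, or None if unset/empty."""
    if not api_key_env:
        return None
    return os.environ.get(api_key_env) or None
```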
---

## KNOWN_CONFIGS

```python
from video_processor.auth import KNOWN_CONFIGS
```

Pre-built `AuthConfig` instances for supported services. These cover the most common cloud integrations and can be used directly or as templates for custom configurations.

| Service Key | Service | OAuth Endpoints | Client ID Env | API Key Env |
|---|---|---|---|---|
| `"zoom"` | Zoom | `zoom.us/oauth/...` | `ZOOM_CLIENT_ID` | -- |
| `"notion"` | Notion | `api.notion.com/v1/oauth/...` | `NOTION_CLIENT_ID` | `NOTION_API_KEY` |
| `"dropbox"` | Dropbox | `dropbox.com/oauth2/...` | `DROPBOX_APP_KEY` | `DROPBOX_ACCESS_TOKEN` |
| `"github"` | GitHub | `github.com/login/oauth/...` | `GITHUB_CLIENT_ID` | `GITHUB_TOKEN` |
| `"google"` | Google | `accounts.google.com/o/oauth2/...` | `GOOGLE_CLIENT_ID` | `GOOGLE_API_KEY` |
| `"microsoft"` | Microsoft | `login.microsoftonline.com/.../oauth2/...` | `MICROSOFT_CLIENT_ID` | -- |

### Zoom

Supports both Server-to-Server (via `ZOOM_ACCOUNT_ID`) and OAuth PKCE flows.

```bash
# Server-to-Server
export ZOOM_CLIENT_ID="..."
export ZOOM_CLIENT_SECRET="..."
export ZOOM_ACCOUNT_ID="..."

# Or interactive OAuth (omit ZOOM_ACCOUNT_ID)
export ZOOM_CLIENT_ID="..."
export ZOOM_CLIENT_SECRET="..."
```

### Google (Drive, Meet, Workspace)

Supports OAuth PKCE and API key fallback. Scopes include Drive and Docs read-only access.

```bash
export GOOGLE_CLIENT_ID="..."
export GOOGLE_CLIENT_SECRET="..."
# Or for API-key-only access:
export GOOGLE_API_KEY="..."
```

### GitHub

Supports OAuth PKCE and personal access tokens. Requests `repo` and `read:org` scopes.

```bash
# OAuth
export GITHUB_CLIENT_ID="..."
export GITHUB_CLIENT_SECRET="..."
# Or personal access token
export GITHUB_TOKEN="ghp_..."
```

---

## Helper Functions

### get_auth_config()

```python
def get_auth_config(service: str) -> Optional[AuthConfig]
```

Get a pre-built `AuthConfig` for a known service.

**Parameters:**

| Parameter | Type | Description |
|---|---|---|
| `service` | `str` | Service name (e.g., `"zoom"`, `"notion"`, `"github"`) |

**Returns:** `Optional[AuthConfig]` -- the config, or `None` if the service is not in `KNOWN_CONFIGS`.

### get_auth_manager()

```python
def get_auth_manager(service: str) -> Optional[OAuthManager]
```

Get an `OAuthManager` for a known service. Convenience wrapper that looks up the config and creates the manager in one call.

**Returns:** `Optional[OAuthManager]` -- the manager, or `None` if the service is not known.

---

## Usage Examples

### Quick authentication for a known service

```python
from video_processor.auth import get_auth_manager

manager = get_auth_manager("zoom")
if manager:
    result = manager.authenticate()
    if result.success:
        print(f"Authenticated via {result.method}")
        # Use result.access_token for API calls
    else:
        print(f"Failed: {result.error}")
```

### Custom service configuration

```python
from video_processor.auth import AuthConfig, OAuthManager

config = AuthConfig(
    service="my_service",
    oauth_authorize_url="https://my-service.com/oauth/authorize",
    oauth_token_url="https://my-service.com/oauth/token",
    client_id_env="MY_SERVICE_CLIENT_ID",
    client_secret_env="MY_SERVICE_CLIENT_SECRET",
    api_key_env="MY_SERVICE_API_KEY",
    scopes=["read", "write"],
)

manager = OAuthManager(config)
token = manager.get_token()  # Returns str or None
```

### Using auth in a custom source connector

```python
from pathlib import Path
from typing import List, Optional

from video_processor.auth import OAuthManager, AuthConfig
from video_processor.sources.base import BaseSource, SourceFile

class CustomSource(BaseSource):
    def __init__(self):
        self._config = AuthConfig(
            service="custom",
            api_key_env="CUSTOM_API_KEY",
        )
        self._manager = OAuthManager(self._config)
        self._token: Optional[str] = None

    def authenticate(self) -> bool:
        self._token = self._manager.get_token()
        return self._token is not None

    def list_videos(self, **kwargs) -> List[SourceFile]:
        # Use self._token to query the API
        ...

    def download(self, file: SourceFile, destination: Path) -> Path:
        # Use self._token for authenticated downloads
        ...
```

### Logout / clear saved token

```python
from video_processor.auth import get_auth_manager

manager = get_auth_manager("zoom")
if manager:
    manager.clear_token()
    print("Zoom token cleared")
```

### Token storage location

All tokens are stored under `~/.planopticon/`:

```
~/.planopticon/
    zoom_token.json
    notion_token.json
    github_token.json
    google_token.json
    microsoft_token.json
    dropbox_token.json
```

Each file contains a JSON object with `access_token`, `refresh_token` (if applicable), `expires_at`, and client credentials for refresh.
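A saved token file looks roughly like this (illustrative values; the exact fields present vary by service and auth method):

```json
{
  "access_token": "eyJhbGciOi...",
  "refresh_token": "rt_1a2b3c...",
  "expires_at": 1767225600.0,
  "client_id": "AbCdEf123",
  "client_secret": "s3cr3t"
}
```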
--- a/docs/api/auth.md
+++ b/docs/api/auth.md
@@ -0,0 +1,377 @@
--- docs/api/models.md
+++ docs/api/models.md
@@ -1,3 +1,501 @@
# Models API Reference

::: video_processor.models

---

## Overview

The `video_processor.models` module defines all Pydantic data models used throughout PlanOpticon for structured output, serialization, and validation. These models represent everything from individual transcript segments to complete batch processing manifests.

All models inherit from `pydantic.BaseModel` and support JSON serialization via `.model_dump_json()` and deserialization via `.model_validate_json()`.

---

## Enumerations

### DiagramType

Types of visual content detected in video frames.

```python
from video_processor.models import DiagramType
```

| Value | Description |
|---|---|
| `flowchart` | Process flow or decision tree diagrams |
| `sequence` | Sequence or interaction diagrams |
| `architecture` | System architecture diagrams |
| `whiteboard` | Whiteboard drawings or sketches |
| `chart` | Data charts (bar, line, pie, scatter) |
| `table` | Tabular data |
| `slide` | Presentation slides |
| `screenshot` | Application screenshots or screen shares |
| `unknown` | Unclassified visual content |

### OutputFormat

Available output formats for processing results.

| Value | Description |
|---|---|
| `markdown` | Markdown text |
| `json` | JSON data |
| `html` | HTML document |
| `pdf` | PDF document |
| `svg` | SVG vector graphic |
| `png` | PNG raster image |

### PlanningEntityType

Classification types for entities in a planning taxonomy.

| Value | Description |
|---|---|
| `goal` | Project goals or objectives |
| `requirement` | Functional or non-functional requirements |
| `constraint` | Limitations or constraints |
| `decision` | Decisions made during planning |
| `risk` | Identified risks |
| `assumption` | Planning assumptions |
| `dependency` | External or internal dependencies |
| `milestone` | Project milestones |
| `task` | Actionable tasks |
| `feature` | Product features |

### PlanningRelationshipType

Relationship types within a planning taxonomy.

| Value | Description |
|---|---|
| `requires` | Entity A requires entity B |
| `blocked_by` | Entity A is blocked by entity B |
| `has_risk` | Entity A has an associated risk B |
| `depends_on` | Entity A depends on entity B |
| `addresses` | Entity A addresses entity B |
| `has_tradeoff` | Entity A involves a tradeoff with entity B |
| `delivers` | Entity A delivers entity B |
| `implements` | Entity A implements entity B |
| `parent_of` | Entity A is the parent of entity B |

---

## Protocols

### ProgressCallback

A runtime-checkable protocol for receiving pipeline progress updates. Implement this interface to integrate custom progress reporting (e.g., web UI, logging).

```python
from video_processor.models import ProgressCallback

class MyProgress:
    def on_step_start(self, step: str, index: int, total: int) -> None:
        print(f"Starting {step} ({index}/{total})")

    def on_step_complete(self, step: str, index: int, total: int) -> None:
        print(f"Completed {step} ({index}/{total})")

    def on_progress(self, step: str, percent: float, message: str = "") -> None:
        print(f"{step}: {percent:.0f}% {message}")

assert isinstance(MyProgress(), ProgressCallback)  # True
```

**Methods:**

| Method | Parameters | Description |
|---|---|---|
| `on_step_start` | `step: str`, `index: int`, `total: int` | Called when a pipeline step begins |
| `on_step_complete` | `step: str`, `index: int`, `total: int` | Called when a pipeline step finishes |
| `on_progress` | `step: str`, `percent: float`, `message: str` | Called with incremental progress updates |

---

## Transcript Models

### TranscriptSegment

A single segment of transcribed audio with timing and optional speaker identification.

| Field | Type | Default | Description |
|---|---|---|---|
| `start` | `float` | *required* | Start time in seconds |
| `end` | `float` | *required* | End time in seconds |
| `text` | `str` | *required* | Transcribed text content |
| `speaker` | `Optional[str]` | `None` | Speaker identifier (e.g., "Speaker 1") |
| `confidence` | `Optional[float]` | `None` | Transcription confidence score (0.0 to 1.0) |

```json
{
  "start": 12.5,
  "end": 15.3,
  "text": "We should migrate to the new API by next quarter.",
  "speaker": "Alice",
  "confidence": 0.95
}
```

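The serialization round-trip described in the Overview looks like this. To keep the snippet self-contained it defines a stand-in model with the same fields; in real code, import `TranscriptSegment` from `video_processor.models` instead:

```python
from typing import Optional

from pydantic import BaseModel

class TranscriptSegment(BaseModel):
    """Stand-in mirroring the documented fields of the real model."""
    start: float
    end: float
    text: str
    speaker: Optional[str] = None
    confidence: Optional[float] = None

seg = TranscriptSegment(start=12.5, end=15.3, text="We should migrate.", speaker="Alice")
payload = seg.model_dump_json()                     # serialize to a JSON string
restored = TranscriptSegment.model_validate_json(payload)  # parse it back
```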
---

## Content Extraction Models

### ActionItem

An action item extracted from transcript or diagram content.

| Field | Type | Default | Description |
|---|---|---|---|
| `action` | `str` | *required* | The action to be taken |
| `assignee` | `Optional[str]` | `None` | Person responsible for the action |
| `deadline` | `Optional[str]` | `None` | Deadline or timeframe |
| `priority` | `Optional[str]` | `None` | Priority level (e.g., "high", "medium", "low") |
| `context` | `Optional[str]` | `None` | Additional context or notes |
| `source` | `Optional[str]` | `None` | Where this was found: `"transcript"`, `"diagram"`, or `"both"` |

```json
{
  "action": "Migrate authentication service to OAuth 2.0",
  "assignee": "Bob",
  "deadline": "Q2 2026",
  "priority": "high",
  "context": "at 245s",
  "source": "transcript"
}
```

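Downstream code often orders extracted items by priority. A sketch using plain dicts with the documented fields (in the real pipeline these would be `ActionItem` instances):

```python
PRIORITY_ORDER = {"high": 0, "medium": 1, "low": 2}

def sort_by_priority(items):
    """Order action items high -> medium -> low; unknown or missing priority last."""
    return sorted(items, key=lambda item: PRIORITY_ORDER.get(item.get("priority"), 3))

items = [
    {"action": "Write docs", "priority": "low"},
    {"action": "Migrate auth service", "priority": "high", "assignee": "Bob"},
    {"action": "Review PR", "priority": None},
]
ordered = sort_by_priority(items)  # "Migrate auth service" first, "Review PR" last
```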
### KeyPoint

A key point extracted from content, optionally linked to diagrams.

| Field | Type | Default | Description |
|---|---|---|---|
| `point` | `str` | *required* | The key point text |
| `topic` | `Optional[str]` | `None` | Topic or category |
| `details` | `Optional[str]` | `None` | Supporting details |
| `timestamp` | `Optional[float]` | `None` | Timestamp in video (seconds) |
| `source` | `Optional[str]` | `None` | Where this was found |
| `related_diagrams` | `List[int]` | `[]` | Indices of related diagrams in the manifest |

```json
{
  "point": "Team decided to use FalkorDB for graph storage",
  "topic": "Architecture",
  "details": "Embedded database avoids infrastructure overhead for CLI use",
  "timestamp": 342.0,
  "source": "transcript",
  "related_diagrams": [0, 2]
}
```

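`related_diagrams` holds positions in the manifest's diagram list, so linking a key point to its diagrams is a plain index lookup (sketched here with dicts standing in for `DiagramResult` entries):

```python
# Hypothetical manifest data for illustration
diagrams = [
    {"frame_index": 5, "diagram_type": "architecture"},
    {"frame_index": 7, "diagram_type": "flowchart"},
    {"frame_index": 9, "diagram_type": "chart"},
]

key_point = {
    "point": "Team decided to use FalkorDB for graph storage",
    "related_diagrams": [0, 2],
}

# Resolve the indices into the actual diagram records
linked = [diagrams[i] for i in key_point["related_diagrams"]]
```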
---

## Diagram Models

### DiagramResult

Result from diagram extraction and analysis. Contains structured data extracted from visual content, along with paths to output files.

| Field | Type | Default | Description |
|---|---|---|---|
| `frame_index` | `int` | *required* | Index of the source frame |
| `timestamp` | `Optional[float]` | `None` | Timestamp in video (seconds) |
| `diagram_type` | `DiagramType` | `unknown` | Type of diagram detected |
| `confidence` | `float` | `0.0` | Detection confidence (0.0 to 1.0) |
| `description` | `Optional[str]` | `None` | Detailed description of the diagram |
| `text_content` | `Optional[str]` | `None` | All visible text, preserving structure |
| `elements` | `List[str]` | `[]` | Identified elements or components |
| `relationships` | `List[str]` | `[]` | Identified relationships (e.g., `"A -> B: connects"`) |
| `mermaid` | `Optional[str]` | `None` | Mermaid syntax representation |
| `chart_data` | `Optional[Dict[str, Any]]` | `None` | Extractable chart data (`labels`, `values`, `chart_type`) |
| `image_path` | `Optional[str]` | `None` | Relative path to original frame image |
| `svg_path` | `Optional[str]` | `None` | Relative path to rendered SVG |
| `png_path` | `Optional[str]` | `None` | Relative path to rendered PNG |
| `mermaid_path` | `Optional[str]` | `None` | Relative path to mermaid source file |

```json
{
  "frame_index": 5,
  "timestamp": 120.0,
  "diagram_type": "architecture",
  "confidence": 0.92,
  "description": "Microservices architecture showing API gateway, auth service, and database layer",
  "text_content": "API Gateway\nAuth Service\nUser DB\nPostgreSQL",
  "elements": ["API Gateway", "Auth Service", "User DB", "PostgreSQL"],
  "relationships": ["API Gateway -> Auth Service: authenticates", "Auth Service -> User DB: queries"],
  "mermaid": "graph LR\n  A[API Gateway] --> B[Auth Service]\n  B --> C[User DB]",
  "chart_data": null,
  "image_path": "diagrams/diagram_0.jpg",
  "svg_path": null,
  "png_path": null,
  "mermaid_path": "diagrams/diagram_0.mermaid"
}
```

### ScreenCapture

A screengrab fallback created when diagram extraction fails or confidence is too low for full analysis.

| Field | Type | Default | Description |
|---|---|---|---|
| `frame_index` | `int` | *required* | Index of the source frame |
| `timestamp` | `Optional[float]` | `None` | Timestamp in video (seconds) |
| `caption` | `Optional[str]` | `None` | Brief description of the content |
| `image_path` | `Optional[str]` | `None` | Relative path to screenshot image |
| `confidence` | `float` | `0.0` | Detection confidence that triggered fallback |

```json
{
  "frame_index": 8,
  "timestamp": 195.0,
  "caption": "Code editor showing a Python function definition",
  "image_path": "captures/capture_0.jpg",
  "confidence": 0.45
}
```

---

## Knowledge Graph Models

### Entity

An entity in the knowledge graph, representing a person, concept, technology, or other named item extracted from content.

| Field | Type | Default | Description |
|---|---|---|---|
| `name` | `str` | *required* | Entity name |
| `type` | `str` | `"concept"` | Entity type: `"person"`, `"concept"`, `"technology"`, `"time"`, `"diagram"` |
| `descriptions` | `List[str]` | `[]` | Accumulated descriptions of this entity |
| `source` | `Optional[str]` | `None` | Source attribution: `"transcript"`, `"diagram"`, or `"both"` |
| `occurrences` | `List[Dict[str, Any]]` | `[]` | Occurrences with source, timestamp, and text context |

```json
{
  "name": "FalkorDB",
  "type": "technology",
  "descriptions": ["Embedded graph database", "Supports Cypher queries"],
  "source": "both",
  "occurrences": [
    {"source": "transcript", "timestamp": 120.0, "text": "We chose FalkorDB for graph storage"},
283
+ {"source": "diagram", "text": "FalkorDB Lite"}
284
+ ]
285
+}
286
+```
287
+
288
+### Relationship
289
+
290
+A directed relationship between two entities in the knowledge graph.
291
+
292
+| Field | Type | Default | Description |
293
+|---|---|---|---|
294
+| `source` | `str` | *required* | Source entity name |
295
+| `target` | `str` | *required* | Target entity name |
296
+| `type` | `str` | `"related_to"` | Relationship type (e.g., `"uses"`, `"manages"`, `"related_to"`) |
297
+| `content_source` | `Optional[str]` | `None` | Content source identifier |
298
+| `timestamp` | `Optional[float]` | `None` | Timestamp in seconds |
299
+
300
+```json
301
+{
302
+ "source": "PlanOpticon",
303
+ "target": "FalkorDB",
304
+ "type": "uses",
305
+ "content_source": "transcript",
306
+ "timestamp": 125.0
307
+}
308
+```
309
+
310
+### SourceRecord
311
+
312
+A content source registered in the knowledge graph for provenance tracking.
313
+
314
+| Field | Type | Default | Description |
315
+|---|---|---|---|
316
+| `source_id` | `str` | *required* | Unique identifier for this source |
317
+| `source_type` | `str` | *required* | Source type: `"video"`, `"document"`, `"url"`, `"api"`, `"manual"` |
318
+| `title` | `str` | *required* | Human-readable title |
319
+| `path` | `Optional[str]` | `None` | Local file path |
320
+| `url` | `Optional[str]` | `None` | URL if applicable |
321
+| `mime_type` | `Optional[str]` | `None` | MIME type of the source |
322
+| `ingested_at` | `str` | *auto* | ISO format ingestion timestamp (auto-generated) |
323
+| `metadata` | `Dict[str, Any]` | `{}` | Additional source metadata |
324
+
325
+```json
326
+{
327
+ "source_id": "vid_abc123",
328
+ "source_type": "video",
329
+ "title": "Sprint Planning Meeting - Jan 15",
330
+ "path": "/recordings/sprint-planning.mp4",
331
+ "url": null,
332
+ "mime_type": "video/mp4",
333
+ "ingested_at": "2026-01-15T10:30:00",
334
+ "metadata": {"duration": 3600, "resolution": "1920x1080"}
335
+}
336
+```
337
+
338
+### KnowledgeGraphData
339
+
340
+Serializable knowledge graph data containing all nodes, relationships, and source provenance.
341
+
342
+| Field | Type | Default | Description |
343
+|---|---|---|---|
344
+| `nodes` | `List[Entity]` | `[]` | Graph nodes/entities |
345
+| `relationships` | `List[Relationship]` | `[]` | Graph relationships |
346
+| `sources` | `List[SourceRecord]` | `[]` | Content sources for provenance tracking |
347
+
348
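The `KnowledgeGraphData` shape round-trips cleanly through plain JSON. A minimal sketch using the stdlib `json` module, with plain dicts standing in for the Pydantic models (field names follow the tables above; the values are illustrative):

```python
import json

# Plain-dict stand-ins for Entity and Relationship; field names
# follow the model tables above, values are illustrative.
entity = {
    "name": "FalkorDB",
    "type": "technology",
    "descriptions": ["Embedded graph database"],
    "source": "transcript",
    "occurrences": [],
}
relationship = {
    "source": "PlanOpticon",
    "target": "FalkorDB",
    "type": "uses",
    "content_source": "transcript",
    "timestamp": 125.0,
}
kg = {"nodes": [entity], "relationships": [relationship], "sources": []}

# Round-trip through JSON, as a model_dump_json() / model_validate_json() pair would.
restored = json.loads(json.dumps(kg, indent=2))
print(restored["relationships"][0]["type"])  # uses
```
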
---

## Planning Models

### PlanningEntity

An entity classified for planning purposes, with priority and status tracking.

| Field | Type | Default | Description |
|---|---|---|---|
| `name` | `str` | *required* | Entity name |
| `planning_type` | `PlanningEntityType` | *required* | Planning classification |
| `description` | `str` | `""` | Detailed description |
| `priority` | `Optional[str]` | `None` | Priority: `"high"`, `"medium"`, `"low"` |
| `status` | `Optional[str]` | `None` | Status: `"identified"`, `"confirmed"`, `"resolved"` |
| `source_entities` | `List[str]` | `[]` | Names of source KG entities this was derived from |
| `metadata` | `Dict[str, Any]` | `{}` | Additional metadata |

```json
{
  "name": "Migrate to OAuth 2.0",
  "planning_type": "task",
  "description": "Replace custom auth with OAuth 2.0 across all services",
  "priority": "high",
  "status": "identified",
  "source_entities": ["OAuth", "Authentication Service"],
  "metadata": {}
}
```

---

## Processing and Metadata Models

### ProcessingStats

Statistics about a processing run, including model usage tracking.

| Field | Type | Default | Description |
|---|---|---|---|
| `start_time` | `Optional[str]` | `None` | ISO format start time |
| `end_time` | `Optional[str]` | `None` | ISO format end time |
| `duration_seconds` | `Optional[float]` | `None` | Total processing time |
| `frames_extracted` | `int` | `0` | Number of frames extracted from video |
| `people_frames_filtered` | `int` | `0` | Frames filtered out (contained people/webcam) |
| `diagrams_detected` | `int` | `0` | Number of diagrams detected |
| `screen_captures` | `int` | `0` | Number of screen captures saved |
| `transcript_duration_seconds` | `Optional[float]` | `None` | Duration of transcribed audio |
| `models_used` | `Dict[str, str]` | `{}` | Map of task to model used (e.g., `{"vision": "gpt-4o"}`) |

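Since `start_time` and `end_time` are ISO-format strings, `duration_seconds` can be recomputed with the stdlib alone; a quick sketch (the timestamps below are illustrative):

```python
from datetime import datetime

# ISO-format strings, as stored in ProcessingStats (illustrative values)
stats = {"start_time": "2026-01-15T10:30:00", "end_time": "2026-01-15T10:42:30"}

start = datetime.fromisoformat(stats["start_time"])
end = datetime.fromisoformat(stats["end_time"])
duration_seconds = (end - start).total_seconds()
print(duration_seconds)  # 750.0
```
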
### VideoMetadata

Metadata about the source video file.

| Field | Type | Default | Description |
|---|---|---|---|
| `title` | `str` | *required* | Video title |
| `source_path` | `Optional[str]` | `None` | Original video file path |
| `duration_seconds` | `Optional[float]` | `None` | Video duration in seconds |
| `resolution` | `Optional[str]` | `None` | Video resolution (e.g., `"1920x1080"`) |
| `processed_at` | `str` | *auto* | ISO format processing timestamp |

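Like the other models in this module, `VideoMetadata` serializes to a flat JSON object; a representative example (values are illustrative):

```json
{
  "title": "Sprint Planning Meeting - Jan 15",
  "source_path": "/recordings/sprint-planning.mp4",
  "duration_seconds": 3600.0,
  "resolution": "1920x1080",
  "processed_at": "2026-01-15T10:30:00"
}
```
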
---

## Manifest Models

### VideoManifest

The single source of truth for a video processing run. Contains all output paths, inline structured data, and processing statistics.

| Field | Type | Default | Description |
|---|---|---|---|
| `version` | `str` | `"1.0"` | Manifest schema version |
| `video` | `VideoMetadata` | *required* | Source video metadata |
| `stats` | `ProcessingStats` | *default* | Processing statistics |
| `transcript_json` | `Optional[str]` | `None` | Relative path to transcript JSON |
| `transcript_txt` | `Optional[str]` | `None` | Relative path to transcript text |
| `transcript_srt` | `Optional[str]` | `None` | Relative path to SRT subtitles |
| `analysis_md` | `Optional[str]` | `None` | Relative path to analysis Markdown |
| `analysis_html` | `Optional[str]` | `None` | Relative path to analysis HTML |
| `analysis_pdf` | `Optional[str]` | `None` | Relative path to analysis PDF |
| `knowledge_graph_json` | `Optional[str]` | `None` | Relative path to knowledge graph JSON |
| `knowledge_graph_db` | `Optional[str]` | `None` | Relative path to knowledge graph DB |
| `key_points_json` | `Optional[str]` | `None` | Relative path to key points JSON |
| `action_items_json` | `Optional[str]` | `None` | Relative path to action items JSON |
| `key_points` | `List[KeyPoint]` | `[]` | Inline key points data |
| `action_items` | `List[ActionItem]` | `[]` | Inline action items data |
| `diagrams` | `List[DiagramResult]` | `[]` | Inline diagram results |
| `screen_captures` | `List[ScreenCapture]` | `[]` | Inline screen captures |
| `frame_paths` | `List[str]` | `[]` | Relative paths to extracted frames |

```python
from pathlib import Path

from video_processor.models import VideoManifest, VideoMetadata

manifest = VideoManifest(
    video=VideoMetadata(title="Sprint Planning"),
    key_points=[...],
    action_items=[...],
    diagrams=[...],
)

# Serialize to JSON
manifest.model_dump_json(indent=2)

# Load from file
loaded = VideoManifest.model_validate_json(Path("manifest.json").read_text())
```

### BatchVideoEntry

Summary of a single video within a batch processing run.

| Field | Type | Default | Description |
|---|---|---|---|
| `video_name` | `str` | *required* | Video file name |
| `manifest_path` | `str` | *required* | Relative path to the video's manifest file |
| `status` | `str` | `"pending"` | Processing status: `"pending"`, `"completed"`, `"failed"` |
| `error` | `Optional[str]` | `None` | Error message if processing failed |
| `diagrams_count` | `int` | `0` | Number of diagrams detected |
| `action_items_count` | `int` | `0` | Number of action items extracted |
| `key_points_count` | `int` | `0` | Number of key points extracted |
| `duration_seconds` | `Optional[float]` | `None` | Processing duration |

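The batch-level totals in `BatchManifest` are simple aggregates over these per-video entries. A sketch with plain dicts standing in for `BatchVideoEntry` (file names and counts are illustrative):

```python
# Plain-dict stand-ins for BatchVideoEntry; field names follow the
# table above, values are illustrative.
entries = [
    {"video_name": "standup.mp4", "status": "completed", "diagrams_count": 3},
    {"video_name": "retro.mp4", "status": "completed", "diagrams_count": 1},
    {"video_name": "demo.mp4", "status": "failed", "diagrams_count": 0},
]

# Aggregates as they would appear on the batch manifest
completed_videos = sum(1 for e in entries if e["status"] == "completed")
failed_videos = sum(1 for e in entries if e["status"] == "failed")
total_diagrams = sum(e["diagrams_count"] for e in entries)
print(completed_videos, failed_videos, total_diagrams)  # 2 1 4
```
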
### BatchManifest

Manifest for a batch processing run across multiple videos.

| Field | Type | Default | Description |
|---|---|---|---|
| `version` | `str` | `"1.0"` | Manifest schema version |
| `title` | `str` | `"Batch Processing Results"` | Batch title |
| `processed_at` | `str` | *auto* | ISO format timestamp |
| `stats` | `ProcessingStats` | *default* | Aggregated processing statistics |
| `videos` | `List[BatchVideoEntry]` | `[]` | Per-video summaries |
| `total_videos` | `int` | `0` | Total number of videos in batch |
| `completed_videos` | `int` | `0` | Successfully processed videos |
| `failed_videos` | `int` | `0` | Videos that failed processing |
| `total_diagrams` | `int` | `0` | Total diagrams across all videos |
| `total_action_items` | `int` | `0` | Total action items across all videos |
| `total_key_points` | `int` | `0` | Total key points across all videos |
| `batch_summary_md` | `Optional[str]` | `None` | Relative path to batch summary Markdown |
| `merged_knowledge_graph_json` | `Optional[str]` | `None` | Relative path to merged KG JSON |
| `merged_knowledge_graph_db` | `Optional[str]` | `None` | Relative path to merged KG database |

```python
from video_processor.models import BatchManifest

batch = BatchManifest(
    title="Weekly Recordings",
    total_videos=5,
    completed_videos=4,
    failed_videos=1,
)
```

--- docs/api/models.md
+++ docs/api/models.md
@@ -1,3 +1,501 @@
1 # Models API Reference
2
3 ::: video_processor.models
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
4
--- docs/api/models.md
+++ docs/api/models.md
@@ -1,3 +1,501 @@
1 # Models API Reference
2
3 ::: video_processor.models
4
5 ---
6
7 ## Overview
8
9 The `video_processor.models` module defines all Pydantic data models used throughout PlanOpticon for structured output, serialization, and validation. These models represent everything from individual transcript segments to complete batch processing manifests.
10
11 All models inherit from `pydantic.BaseModel` and support JSON serialization via `.model_dump_json()` and deserialization via `.model_validate_json()`.
12
13 ---
14
15 ## Enumerations
16
17 ### DiagramType
18
19 Types of visual content detected in video frames.
20
21 ```python
22 from video_processor.models import DiagramType
23 ```
24
25 | Value | Description |
26 |---|---|
27 | `flowchart` | Process flow or decision tree diagrams |
28 | `sequence` | Sequence or interaction diagrams |
29 | `architecture` | System architecture diagrams |
30 | `whiteboard` | Whiteboard drawings or sketches |
31 | `chart` | Data charts (bar, line, pie, scatter) |
32 | `table` | Tabular data |
33 | `slide` | Presentation slides |
34 | `screenshot` | Application screenshots or screen shares |
35 | `unknown` | Unclassified visual content |
36
37 ### OutputFormat
38
39 Available output formats for processing results.
40
41 | Value | Description |
42 |---|---|
43 | `markdown` | Markdown text |
44 | `json` | JSON data |
45 | `html` | HTML document |
46 | `pdf` | PDF document |
47 | `svg` | SVG vector graphic |
48 | `png` | PNG raster image |
49
50 ### PlanningEntityType
51
52 Classification types for entities in a planning taxonomy.
53
54 | Value | Description |
55 |---|---|
56 | `goal` | Project goals or objectives |
57 | `requirement` | Functional or non-functional requirements |
58 | `constraint` | Limitations or constraints |
59 | `decision` | Decisions made during planning |
60 | `risk` | Identified risks |
61 | `assumption` | Planning assumptions |
62 | `dependency` | External or internal dependencies |
63 | `milestone` | Project milestones |
64 | `task` | Actionable tasks |
65 | `feature` | Product features |
66
67 ### PlanningRelationshipType
68
69 Relationship types within a planning taxonomy.
70
71 | Value | Description |
72 |---|---|
73 | `requires` | Entity A requires entity B |
74 | `blocked_by` | Entity A is blocked by entity B |
75 | `has_risk` | Entity A has an associated risk B |
76 | `depends_on` | Entity A depends on entity B |
77 | `addresses` | Entity A addresses entity B |
78 | `has_tradeoff` | Entity A involves a tradeoff with entity B |
79 | `delivers` | Entity A delivers entity B |
80 | `implements` | Entity A implements entity B |
81 | `parent_of` | Entity A is the parent of entity B |
82
83 ---
84
85 ## Protocols
86
87 ### ProgressCallback
88
89 A runtime-checkable protocol for receiving pipeline progress updates. Implement this interface to integrate custom progress reporting (e.g., web UI, logging).
90
91 ```python
92 from video_processor.models import ProgressCallback
93
94 class MyProgress:
95 def on_step_start(self, step: str, index: int, total: int) -> None:
96 print(f"Starting {step} ({index}/{total})")
97
98 def on_step_complete(self, step: str, index: int, total: int) -> None:
99 print(f"Completed {step} ({index}/{total})")
100
101 def on_progress(self, step: str, percent: float, message: str = "") -> None:
102 print(f"{step}: {percent:.0f}% {message}")
103
104 assert isinstance(MyProgress(), ProgressCallback) # True
105 ```
106
107 **Methods:**
108
109 | Method | Parameters | Description |
110 |---|---|---|
111 | `on_step_start` | `step: str`, `index: int`, `total: int` | Called when a pipeline step begins |
112 | `on_step_complete` | `step: str`, `index: int`, `total: int` | Called when a pipeline step finishes |
113 | `on_progress` | `step: str`, `percent: float`, `message: str` | Called with incremental progress updates |
114
115 ---
116
117 ## Transcript Models
118
119 ### TranscriptSegment
120
121 A single segment of transcribed audio with timing and optional speaker identification.
122
123 | Field | Type | Default | Description |
124 |---|---|---|---|
125 | `start` | `float` | *required* | Start time in seconds |
126 | `end` | `float` | *required* | End time in seconds |
127 | `text` | `str` | *required* | Transcribed text content |
128 | `speaker` | `Optional[str]` | `None` | Speaker identifier (e.g., "Speaker 1") |
129 | `confidence` | `Optional[float]` | `None` | Transcription confidence score (0.0 to 1.0) |
130
131 ```json
132 {
133 "start": 12.5,
134 "end": 15.3,
135 "text": "We should migrate to the new API by next quarter.",
136 "speaker": "Alice",
137 "confidence": 0.95
138 }
139 ```
140
141 ---
142
143 ## Content Extraction Models
144
145 ### ActionItem
146
147 An action item extracted from transcript or diagram content.
148
149 | Field | Type | Default | Description |
150 |---|---|---|---|
151 | `action` | `str` | *required* | The action to be taken |
152 | `assignee` | `Optional[str]` | `None` | Person responsible for the action |
153 | `deadline` | `Optional[str]` | `None` | Deadline or timeframe |
154 | `priority` | `Optional[str]` | `None` | Priority level (e.g., "high", "medium", "low") |
155 | `context` | `Optional[str]` | `None` | Additional context or notes |
156 | `source` | `Optional[str]` | `None` | Where this was found: `"transcript"`, `"diagram"`, or `"both"` |
157
158 ```json
159 {
160 "action": "Migrate authentication service to OAuth 2.0",
161 "assignee": "Bob",
162 "deadline": "Q2 2026",
163 "priority": "high",
164 "context": "at 245s",
165 "source": "transcript"
166 }
167 ```
168
169 ### KeyPoint
170
171 A key point extracted from content, optionally linked to diagrams.
172
173 | Field | Type | Default | Description |
174 |---|---|---|---|
175 | `point` | `str` | *required* | The key point text |
176 | `topic` | `Optional[str]` | `None` | Topic or category |
177 | `details` | `Optional[str]` | `None` | Supporting details |
178 | `timestamp` | `Optional[float]` | `None` | Timestamp in video (seconds) |
179 | `source` | `Optional[str]` | `None` | Where this was found |
180 | `related_diagrams` | `List[int]` | `[]` | Indices of related diagrams in the manifest |
181
182 ```json
183 {
184 "point": "Team decided to use FalkorDB for graph storage",
185 "topic": "Architecture",
186 "details": "Embedded database avoids infrastructure overhead for CLI use",
187 "timestamp": 342.0,
188 "source": "transcript",
189 "related_diagrams": [0, 2]
190 }
191 ```
192
193 ---
194
195 ## Diagram Models
196
197 ### DiagramResult
198
199 Result from diagram extraction and analysis. Contains structured data extracted from visual content, along with paths to output files.
200
201 | Field | Type | Default | Description |
202 |---|---|---|---|
203 | `frame_index` | `int` | *required* | Index of the source frame |
204 | `timestamp` | `Optional[float]` | `None` | Timestamp in video (seconds) |
205 | `diagram_type` | `DiagramType` | `unknown` | Type of diagram detected |
206 | `confidence` | `float` | `0.0` | Detection confidence (0.0 to 1.0) |
207 | `description` | `Optional[str]` | `None` | Detailed description of the diagram |
208 | `text_content` | `Optional[str]` | `None` | All visible text, preserving structure |
209 | `elements` | `List[str]` | `[]` | Identified elements or components |
210 | `relationships` | `List[str]` | `[]` | Identified relationships (e.g., `"A -> B: connects"`) |
211 | `mermaid` | `Optional[str]` | `None` | Mermaid syntax representation |
212 | `chart_data` | `Optional[Dict[str, Any]]` | `None` | Extractable chart data (`labels`, `values`, `chart_type`) |
213 | `image_path` | `Optional[str]` | `None` | Relative path to original frame image |
214 | `svg_path` | `Optional[str]` | `None` | Relative path to rendered SVG |
215 | `png_path` | `Optional[str]` | `None` | Relative path to rendered PNG |
216 | `mermaid_path` | `Optional[str]` | `None` | Relative path to mermaid source file |
217
218 ```json
219 {
220 "frame_index": 5,
221 "timestamp": 120.0,
222 "diagram_type": "architecture",
223 "confidence": 0.92,
224 "description": "Microservices architecture showing API gateway, auth service, and database layer",
225 "text_content": "API Gateway\nAuth Service\nUser DB\nPostgreSQL",
226 "elements": ["API Gateway", "Auth Service", "User DB", "PostgreSQL"],
227 "relationships": ["API Gateway -> Auth Service: authenticates", "Auth Service -> User DB: queries"],
228 "mermaid": "graph LR\n A[API Gateway] --> B[Auth Service]\n B --> C[User DB]",
229 "chart_data": null,
230 "image_path": "diagrams/diagram_0.jpg",
231 "svg_path": null,
232 "png_path": null,
233 "mermaid_path": "diagrams/diagram_0.mermaid"
234 }
235 ```
236
237 ### ScreenCapture
238
239 A screengrab fallback created when diagram extraction fails or confidence is too low for full analysis.
240
241 | Field | Type | Default | Description |
242 |---|---|---|---|
243 | `frame_index` | `int` | *required* | Index of the source frame |
244 | `timestamp` | `Optional[float]` | `None` | Timestamp in video (seconds) |
245 | `caption` | `Optional[str]` | `None` | Brief description of the content |
246 | `image_path` | `Optional[str]` | `None` | Relative path to screenshot image |
247 | `confidence` | `float` | `0.0` | Detection confidence that triggered fallback |
248
249 ```json
250 {
251 "frame_index": 8,
252 "timestamp": 195.0,
253 "caption": "Code editor showing a Python function definition",
254 "image_path": "captures/capture_0.jpg",
255 "confidence": 0.45
256 }
257 ```
258
259 ---
260
261 ## Knowledge Graph Models
262
263 ### Entity
264
265 An entity in the knowledge graph, representing a person, concept, technology, or other named item extracted from content.
266
267 | Field | Type | Default | Description |
268 |---|---|---|---|
269 | `name` | `str` | *required* | Entity name |
270 | `type` | `str` | `"concept"` | Entity type: `"person"`, `"concept"`, `"technology"`, `"time"`, `"diagram"` |
271 | `descriptions` | `List[str]` | `[]` | Accumulated descriptions of this entity |
272 | `source` | `Optional[str]` | `None` | Source attribution: `"transcript"`, `"diagram"`, or `"both"` |
273 | `occurrences` | `List[Dict[str, Any]]` | `[]` | Occurrences with source, timestamp, and text context |
274
275 ```json
276 {
277 "name": "FalkorDB",
278 "type": "technology",
279 "descriptions": ["Embedded graph database", "Supports Cypher queries"],
280 "source": "both",
281 "occurrences": [
282 {"source": "transcript", "timestamp": 120.0, "text": "We chose FalkorDB for graph storage"},
283 {"source": "diagram", "text": "FalkorDB Lite"}
284 ]
285 }
286 ```
287
288 ### Relationship
289
290 A directed relationship between two entities in the knowledge graph.
291
292 | Field | Type | Default | Description |
293 |---|---|---|---|
294 | `source` | `str` | *required* | Source entity name |
295 | `target` | `str` | *required* | Target entity name |
296 | `type` | `str` | `"related_to"` | Relationship type (e.g., `"uses"`, `"manages"`, `"related_to"`) |
297 | `content_source` | `Optional[str]` | `None` | Content source identifier |
298 | `timestamp` | `Optional[float]` | `None` | Timestamp in seconds |
299
300 ```json
301 {
302 "source": "PlanOpticon",
303 "target": "FalkorDB",
304 "type": "uses",
305 "content_source": "transcript",
306 "timestamp": 125.0
307 }
308 ```
309
310 ### SourceRecord
311
312 A content source registered in the knowledge graph for provenance tracking.
313
314 | Field | Type | Default | Description |
315 |---|---|---|---|
316 | `source_id` | `str` | *required* | Unique identifier for this source |
317 | `source_type` | `str` | *required* | Source type: `"video"`, `"document"`, `"url"`, `"api"`, `"manual"` |
318 | `title` | `str` | *required* | Human-readable title |
319 | `path` | `Optional[str]` | `None` | Local file path |
320 | `url` | `Optional[str]` | `None` | URL if applicable |
321 | `mime_type` | `Optional[str]` | `None` | MIME type of the source |
322 | `ingested_at` | `str` | *auto* | ISO format ingestion timestamp (auto-generated) |
323 | `metadata` | `Dict[str, Any]` | `{}` | Additional source metadata |
324
325 ```json
326 {
327 "source_id": "vid_abc123",
328 "source_type": "video",
329 "title": "Sprint Planning Meeting - Jan 15",
330 "path": "/recordings/sprint-planning.mp4",
331 "url": null,
332 "mime_type": "video/mp4",
333 "ingested_at": "2026-01-15T10:30:00",
334 "metadata": {"duration": 3600, "resolution": "1920x1080"}
335 }
336 ```
337
338 ### KnowledgeGraphData
339
340 Serializable knowledge graph data containing all nodes, relationships, and source provenance.
341
342 | Field | Type | Default | Description |
343 |---|---|---|---|
344 | `nodes` | `List[Entity]` | `[]` | Graph nodes/entities |
345 | `relationships` | `List[Relationship]` | `[]` | Graph relationships |
346 | `sources` | `List[SourceRecord]` | `[]` | Content sources for provenance tracking |
347
348 ---
349
350 ## Planning Models
351
352 ### PlanningEntity
353
354 An entity classified for planning purposes, with priority and status tracking.
355
356 | Field | Type | Default | Description |
357 |---|---|---|---|
358 | `name` | `str` | *required* | Entity name |
359 | `planning_type` | `PlanningEntityType` | *required* | Planning classification |
360 | `description` | `str` | `""` | Detailed description |
361 | `priority` | `Optional[str]` | `None` | Priority: `"high"`, `"medium"`, `"low"` |
362 | `status` | `Optional[str]` | `None` | Status: `"identified"`, `"confirmed"`, `"resolved"` |
363 | `source_entities` | `List[str]` | `[]` | Names of source KG entities this was derived from |
364 | `metadata` | `Dict[str, Any]` | `{}` | Additional metadata |
365
366 ```json
367 {
368 "name": "Migrate to OAuth 2.0",
369 "planning_type": "task",
370 "description": "Replace custom auth with OAuth 2.0 across all services",
371 "priority": "high",
372 "status": "identified",
373 "source_entities": ["OAuth", "Authentication Service"],
374 "metadata": {}
375 }
376 ```
377
378 ---
379
380 ## Processing and Metadata Models
381
382 ### ProcessingStats
383
384 Statistics about a processing run, including model usage tracking.
385
386 | Field | Type | Default | Description |
387 |---|---|---|---|
388 | `start_time` | `Optional[str]` | `None` | ISO format start time |
389 | `end_time` | `Optional[str]` | `None` | ISO format end time |
390 | `duration_seconds` | `Optional[float]` | `None` | Total processing time |
391 | `frames_extracted` | `int` | `0` | Number of frames extracted from video |
392 | `people_frames_filtered` | `int` | `0` | Frames filtered out (contained people/webcam) |
393 | `diagrams_detected` | `int` | `0` | Number of diagrams detected |
394 | `screen_captures` | `int` | `0` | Number of screen captures saved |
395 | `transcript_duration_seconds` | `Optional[float]` | `None` | Duration of transcribed audio |
396 | `models_used` | `Dict[str, str]` | `{}` | Map of task to model used (e.g., `{"vision": "gpt-4o"}`) |
397
398 ### VideoMetadata
399
400 Metadata about the source video file.
401
402 | Field | Type | Default | Description |
403 |---|---|---|---|
404 | `title` | `str` | *required* | Video title |
405 | `source_path` | `Optional[str]` | `None` | Original video file path |
406 | `duration_seconds` | `Optional[float]` | `None` | Video duration in seconds |
407 | `resolution` | `Optional[str]` | `None` | Video resolution (e.g., `"1920x1080"`) |
408 | `processed_at` | `str` | *auto* | ISO format processing timestamp |
409
410 ---
411
412 ## Manifest Models
413
414 ### VideoManifest
415
416 The single source of truth for a video processing run. Contains all output paths, inline structured data, and processing statistics.
417
418 | Field | Type | Default | Description |
419 |---|---|---|---|
420 | `version` | `str` | `"1.0"` | Manifest schema version |
421 | `video` | `VideoMetadata` | *required* | Source video metadata |
422 | `stats` | `ProcessingStats` | *default* | Processing statistics |
423 | `transcript_json` | `Optional[str]` | `None` | Relative path to transcript JSON |
424 | `transcript_txt` | `Optional[str]` | `None` | Relative path to transcript text |
425 | `transcript_srt` | `Optional[str]` | `None` | Relative path to SRT subtitles |
426 | `analysis_md` | `Optional[str]` | `None` | Relative path to analysis Markdown |
427 | `analysis_html` | `Optional[str]` | `None` | Relative path to analysis HTML |
428 | `analysis_pdf` | `Optional[str]` | `None` | Relative path to analysis PDF |
429 | `knowledge_graph_json` | `Optional[str]` | `None` | Relative path to knowledge graph JSON |
430 | `knowledge_graph_db` | `Optional[str]` | `None` | Relative path to knowledge graph DB |
431 | `key_points_json` | `Optional[str]` | `None` | Relative path to key points JSON |
432 | `action_items_json` | `Optional[str]` | `None` | Relative path to action items JSON |
433 | `key_points` | `List[KeyPoint]` | `[]` | Inline key points data |
434 | `action_items` | `List[ActionItem]` | `[]` | Inline action items data |
435 | `diagrams` | `List[DiagramResult]` | `[]` | Inline diagram results |
436 | `screen_captures` | `List[ScreenCapture]` | `[]` | Inline screen captures |
437 | `frame_paths` | `List[str]` | `[]` | Relative paths to extracted frames |
438
439 ```python
440 from video_processor.models import VideoManifest, VideoMetadata
441
442 manifest = VideoManifest(
443 video=VideoMetadata(title="Sprint Planning"),
444 key_points=[...],
445 action_items=[...],
446 diagrams=[...],
447 )
448
449 # Serialize to JSON
450 manifest.model_dump_json(indent=2)
451
452 # Load from file
453 loaded = VideoManifest.model_validate_json(Path("manifest.json").read_text())
454 ```
455
456 ### BatchVideoEntry
457
458 Summary of a single video within a batch processing run.
459
460 | Field | Type | Default | Description |
461 |---|---|---|---|
462 | `video_name` | `str` | *required* | Video file name |
463 | `manifest_path` | `str` | *required* | Relative path to the video's manifest file |
464 | `status` | `str` | `"pending"` | Processing status: `"pending"`, `"completed"`, `"failed"` |
465 | `error` | `Optional[str]` | `None` | Error message if processing failed |
466 | `diagrams_count` | `int` | `0` | Number of diagrams detected |
467 | `action_items_count` | `int` | `0` | Number of action items extracted |
468 | `key_points_count` | `int` | `0` | Number of key points extracted |
469 | `duration_seconds` | `Optional[float]` | `None` | Processing duration |
470
471 ### BatchManifest
472
473 Manifest for a batch processing run across multiple videos.
474
475 | Field | Type | Default | Description |
476 |---|---|---|---|
477 | `version` | `str` | `"1.0"` | Manifest schema version |
478 | `title` | `str` | `"Batch Processing Results"` | Batch title |
479 | `processed_at` | `str` | *auto* | ISO format timestamp |
480 | `stats` | `ProcessingStats` | *default* | Aggregated processing statistics |
481 | `videos` | `List[BatchVideoEntry]` | `[]` | Per-video summaries |
482 | `total_videos` | `int` | `0` | Total number of videos in batch |
483 | `completed_videos` | `int` | `0` | Successfully processed videos |
484 | `failed_videos` | `int` | `0` | Videos that failed processing |
485 | `total_diagrams` | `int` | `0` | Total diagrams across all videos |
486 | `total_action_items` | `int` | `0` | Total action items across all videos |
487 | `total_key_points` | `int` | `0` | Total key points across all videos |
488 | `batch_summary_md` | `Optional[str]` | `None` | Relative path to batch summary Markdown |
489 | `merged_knowledge_graph_json` | `Optional[str]` | `None` | Relative path to merged KG JSON |
490 | `merged_knowledge_graph_db` | `Optional[str]` | `None` | Relative path to merged KG database |
491
492 ```python
493 from video_processor.models import BatchManifest
494
495 batch = BatchManifest(
496 title="Weekly Recordings",
497 total_videos=5,
498 completed_videos=4,
499 failed_videos=1,
500 )
501 ```
502
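The batch counters (`total_videos`, `completed_videos`, `failed_videos`, `total_diagrams`, …) mirror the per-video entries. A minimal sketch of that bookkeeping, using plain dicts as stand-ins for `BatchVideoEntry` (illustrative only, not the library's implementation):

```python
# Illustrative only: plain dicts stand in for BatchVideoEntry fields.
entries = [
    {"video_name": "a.mp4", "status": "completed", "diagrams_count": 2},
    {"video_name": "b.mp4", "status": "completed", "diagrams_count": 1},
    {"video_name": "c.mp4", "status": "failed", "diagrams_count": 0},
]

def summarize(entries: list[dict]) -> dict:
    """Derive BatchManifest-style aggregate counters from per-video entries."""
    return {
        "total_videos": len(entries),
        "completed_videos": sum(e["status"] == "completed" for e in entries),
        "failed_videos": sum(e["status"] == "failed" for e in entries),
        "total_diagrams": sum(e["diagrams_count"] for e in entries),
    }

print(summarize(entries))
# {'total_videos': 3, 'completed_videos': 2, 'failed_videos': 1, 'total_diagrams': 3}
```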
--- docs/api/providers.md
+++ docs/api/providers.md
@@ -3,5 +3,501 @@
33
::: video_processor.providers.base
44
55
::: video_processor.providers.manager
66
77
::: video_processor.providers.discovery
8
+
9
+---
10
+
11
+## Overview
12
+
13
+The provider system abstracts LLM API calls behind a unified interface. It supports multiple providers (OpenAI, Anthropic, Gemini, Ollama, and OpenAI-compatible services), automatic model discovery, capability-based routing, and usage tracking.
14
+
15
+**Key components:**
16
+
17
+- **`BaseProvider`** -- abstract interface that all providers implement
18
+- **`ProviderRegistry`** -- global registry mapping provider names to classes
19
+- **`ProviderManager`** -- high-level router that picks the best provider for each task
20
+- **`discover_available_models()`** -- scans all configured providers for available models
21
+
22
+---
23
+
24
+## BaseProvider (ABC)
25
+
26
+```python
27
+from video_processor.providers.base import BaseProvider
28
+```
29
+
30
+Abstract base class that all provider implementations must subclass. Defines the four core capabilities: chat, vision, audio transcription, and model listing.
31
+
32
+**Class attribute:**
33
+
34
+| Attribute | Type | Description |
35
+|---|---|---|
36
+| `provider_name` | `str` | Identifier for this provider (e.g., `"openai"`, `"anthropic"`) |
37
+
38
+### chat()
39
+
40
+```python
41
+def chat(
42
+ self,
43
+ messages: list[dict],
44
+ max_tokens: int = 4096,
45
+ temperature: float = 0.7,
46
+ model: Optional[str] = None,
47
+) -> str
48
+```
49
+
50
+Send a chat completion request.
51
+
52
+**Parameters:**
53
+
54
+| Parameter | Type | Default | Description |
55
+|---|---|---|---|
56
+| `messages` | `list[dict]` | *required* | OpenAI-format message list (`role`, `content`) |
57
+| `max_tokens` | `int` | `4096` | Maximum tokens in the response |
58
+| `temperature` | `float` | `0.7` | Sampling temperature |
59
+| `model` | `Optional[str]` | `None` | Override model ID |
60
+
61
+**Returns:** `str` -- the assistant's text response.
62
+
63
+### analyze_image()
64
+
65
+```python
66
+def analyze_image(
67
+ self,
68
+ image_bytes: bytes,
69
+ prompt: str,
70
+ max_tokens: int = 4096,
71
+ model: Optional[str] = None,
72
+) -> str
73
+```
74
+
75
+Analyze an image with a text prompt using a vision-capable model.
76
+
77
+**Parameters:**
78
+
79
+| Parameter | Type | Default | Description |
80
+|---|---|---|---|
81
+| `image_bytes` | `bytes` | *required* | Raw image data (JPEG, PNG, etc.) |
82
+| `prompt` | `str` | *required* | Analysis instructions |
83
+| `max_tokens` | `int` | `4096` | Maximum tokens in the response |
84
+| `model` | `Optional[str]` | `None` | Override model ID |
85
+
86
+**Returns:** `str` -- the assistant's analysis text.
87
+
88
+### transcribe_audio()
89
+
90
+```python
91
+def transcribe_audio(
92
+ self,
93
+ audio_path: str | Path,
94
+ language: Optional[str] = None,
95
+ model: Optional[str] = None,
96
+) -> dict
97
+```
98
+
99
+Transcribe an audio file.
100
+
101
+**Parameters:**
102
+
103
+| Parameter | Type | Default | Description |
104
+|---|---|---|---|
105
+| `audio_path` | `str \| Path` | *required* | Path to the audio file |
106
+| `language` | `Optional[str]` | `None` | Language hint (ISO 639-1 code) |
107
+| `model` | `Optional[str]` | `None` | Override model ID |
108
+
109
+**Returns:** `dict` -- transcription result with keys `text`, `segments`, `duration`, etc.
110
+
111
+### list_models()
112
+
113
+```python
114
+def list_models(self) -> list[ModelInfo]
115
+```
116
+
117
+Discover available models from this provider's API.
118
+
119
+**Returns:** `list[ModelInfo]` -- available models with capability metadata.
120
+
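The four-method contract can be pictured as an abstract class. The sketch below mirrors the documented signatures in self-contained form (it is an illustration of the interface, not the library's source; `EchoProvider` is a made-up toy subclass):

```python
from abc import ABC, abstractmethod
from pathlib import Path
from typing import Optional

class ProviderSketch(ABC):
    """Illustrative mirror of the documented BaseProvider contract."""

    provider_name: str = ""

    @abstractmethod
    def chat(self, messages: list[dict], max_tokens: int = 4096,
             temperature: float = 0.7, model: Optional[str] = None) -> str: ...

    @abstractmethod
    def analyze_image(self, image_bytes: bytes, prompt: str,
                      max_tokens: int = 4096, model: Optional[str] = None) -> str: ...

    @abstractmethod
    def transcribe_audio(self, audio_path: "str | Path",
                         language: Optional[str] = None,
                         model: Optional[str] = None) -> dict: ...

    @abstractmethod
    def list_models(self) -> list: ...

class EchoProvider(ProviderSketch):
    """Toy subclass showing what a concrete provider must implement."""

    provider_name = "echo"

    def chat(self, messages, max_tokens=4096, temperature=0.7, model=None):
        return messages[-1]["content"]  # echo the last user message

    def analyze_image(self, image_bytes, prompt, max_tokens=4096, model=None):
        return f"{len(image_bytes)} bytes: {prompt}"

    def transcribe_audio(self, audio_path, language=None, model=None):
        return {"text": "", "segments": [], "duration": 0.0}

    def list_models(self):
        return []
```

A real provider would make API calls inside each method; the shape of the class is the same.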
121
+---
122
+
123
+## ModelInfo
124
+
125
+```python
126
+from video_processor.providers.base import ModelInfo
127
+```
128
+
129
+Pydantic model describing an available model from a provider.
130
+
131
+| Field | Type | Default | Description |
132
+|---|---|---|---|
133
+| `id` | `str` | *required* | Model identifier (e.g., `"gpt-4o"`, `"claude-haiku-4-5-20251001"`) |
134
+| `provider` | `str` | *required* | Provider name (e.g., `"openai"`, `"anthropic"`, `"gemini"`) |
135
+| `display_name` | `str` | `""` | Human-readable display name |
136
+| `capabilities` | `List[str]` | `[]` | Model capabilities: `"chat"`, `"vision"`, `"audio"`, `"embedding"` |
137
+
138
+```json
139
+{
140
+ "id": "gpt-4o",
141
+ "provider": "openai",
142
+ "display_name": "GPT-4o",
143
+ "capabilities": ["chat", "vision"]
144
+}
145
+```
146
+
147
+---
148
+
149
+## ProviderRegistry
150
+
151
+```python
152
+from video_processor.providers.base import ProviderRegistry
153
+```
154
+
155
+Class-level registry for provider classes. Providers register themselves with metadata on import. This registry is used internally by `ProviderManager` but can also be used directly for introspection.
156
+
157
+### register()
158
+
159
+```python
160
+@classmethod
161
+def register(
162
+ cls,
163
+ name: str,
164
+ provider_class: type,
165
+ env_var: str = "",
166
+ model_prefixes: Optional[List[str]] = None,
167
+ default_models: Optional[Dict[str, str]] = None,
168
+) -> None
169
+```
170
+
171
+Register a provider class with its metadata. Called by each provider module at import time.
172
+
173
+**Parameters:**
174
+
175
+| Parameter | Type | Default | Description |
176
+|---|---|---|---|
177
+| `name` | `str` | *required* | Provider name (e.g., `"openai"`) |
178
+| `provider_class` | `type` | *required* | The provider class |
179
+| `env_var` | `str` | `""` | Environment variable for API key |
180
+| `model_prefixes` | `Optional[List[str]]` | `None` | Model ID prefixes for auto-detection (e.g., `["gpt-", "o1-"]`) |
181
+| `default_models` | `Optional[Dict[str, str]]` | `None` | Default models per capability (e.g., `{"chat": "gpt-4o", "vision": "gpt-4o"}`) |
182
+
183
+### get()
184
+
185
+```python
186
+@classmethod
187
+def get(cls, name: str) -> type
188
+```
189
+
190
+Return the provider class for a given name. Raises `ValueError` if the provider is not registered.
191
+
192
+### get_by_model()
193
+
194
+```python
195
+@classmethod
196
+def get_by_model(cls, model_id: str) -> Optional[str]
197
+```
198
+
199
+Return the provider name for a model ID based on prefix matching. Returns `None` if no match is found.
200
+
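The prefix matching behind `get_by_model()` can be sketched in a few lines. This is an illustrative reimplementation with a made-up prefix table, not the registry's actual data:

```python
from typing import Optional

# Illustrative prefix table; the real prefixes are supplied via
# ProviderRegistry.register(model_prefixes=...).
MODEL_PREFIXES = {
    "openai": ["gpt-", "o1-"],
    "anthropic": ["claude-"],
    "gemini": ["gemini-"],
}

def provider_for_model(model_id: str) -> Optional[str]:
    """Resolve a provider name by model-ID prefix, as get_by_model() does."""
    for provider, prefixes in MODEL_PREFIXES.items():
        if any(model_id.startswith(p) for p in prefixes):
            return provider
    return None

print(provider_for_model("gpt-4o"))         # openai
print(provider_for_model("mystery-model"))  # None
```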
201
+### get_default_models()
202
+
203
+```python
204
+@classmethod
205
+def get_default_models(cls, name: str) -> Dict[str, str]
206
+```
207
+
208
+Return the default models dict for a provider, mapping capability names to model IDs.
209
+
210
+### available()
211
+
212
+```python
213
+@classmethod
214
+def available(cls) -> List[str]
215
+```
216
+
217
+Return names of providers whose required environment variable is set (or providers with no env var requirement, like Ollama).
218
+
219
+### all_registered()
220
+
221
+```python
222
+@classmethod
223
+def all_registered(cls) -> Dict[str, Dict]
224
+```
225
+
226
+Return all registered providers and their metadata dictionaries.
227
+
228
+---
229
+
230
+## OpenAICompatibleProvider
231
+
232
+```python
233
+from video_processor.providers.base import OpenAICompatibleProvider
234
+```
235
+
236
+Base class for providers using OpenAI-compatible APIs (Together, Fireworks, Cerebras, xAI, Azure). Implements `chat()`, `analyze_image()`, and `list_models()` using the OpenAI client library. `transcribe_audio()` raises `NotImplementedError` by default.
237
+
238
+**Constructor:**
239
+
240
+```python
241
+def __init__(self, api_key: Optional[str] = None, base_url: Optional[str] = None)
242
+```
243
+
244
+| Parameter | Type | Default | Description |
245
+|---|---|---|---|
246
+| `api_key` | `Optional[str]` | `None` | API key (falls back to `self.env_var` environment variable) |
247
+| `base_url` | `Optional[str]` | `None` | API base URL (falls back to `self.base_url` class attribute) |
248
+
249
+**Subclass attributes to override:**
250
+
251
+| Attribute | Description |
252
+|---|---|
253
+| `provider_name` | Provider identifier string |
254
+| `base_url` | Default API base URL |
255
+| `env_var` | Environment variable name for the API key |
256
+
257
+**Usage tracking:** After each `chat()` or `analyze_image()` call, the provider stores token counts in `self._last_usage` as `{"input_tokens": int, "output_tokens": int}`. This is consumed by `ProviderManager._track()`.
258
+
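The tracking mechanism amounts to folding each call's `_last_usage` counters into running totals. A minimal sketch of the idea (hypothetical shape; the real `UsageTracker` API may differ):

```python
# Running totals, in the spirit of ProviderManager._track().
totals = {"input_tokens": 0, "output_tokens": 0}

def track(last_usage: dict) -> None:
    """Fold one call's _last_usage counters into the running totals."""
    for key in totals:
        totals[key] += last_usage.get(key, 0)

track({"input_tokens": 120, "output_tokens": 45})
track({"input_tokens": 80, "output_tokens": 30})
print(totals)  # {'input_tokens': 200, 'output_tokens': 75}
```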
259
+---
260
+
261
+## ProviderManager
262
+
263
+```python
264
+from video_processor.providers.manager import ProviderManager
265
+```
266
+
267
+High-level router that selects the best available provider and model for each API call. Supports explicit model selection, forced provider, or automatic selection based on discovered capabilities.
268
+
269
+### Constructor
270
+
271
+```python
272
+def __init__(
273
+ self,
274
+ vision_model: Optional[str] = None,
275
+ chat_model: Optional[str] = None,
276
+ transcription_model: Optional[str] = None,
277
+ provider: Optional[str] = None,
278
+ auto: bool = True,
279
+)
280
+```
281
+
282
+| Parameter | Type | Default | Description |
283
+|---|---|---|---|
284
+| `vision_model` | `Optional[str]` | `None` | Override model for vision tasks (e.g., `"gpt-4o"`) |
285
+| `chat_model` | `Optional[str]` | `None` | Override model for chat/LLM tasks |
286
+| `transcription_model` | `Optional[str]` | `None` | Override model for transcription |
287
+| `provider` | `Optional[str]` | `None` | Force all tasks to a single provider |
288
+| `auto` | `bool` | `True` | If `True` and no model specified, pick the best available |
289
+
290
+**Attributes:**
291
+
292
+| Attribute | Type | Description |
293
+|---|---|---|
294
+| `usage` | `UsageTracker` | Tracks token counts and API costs across all calls |
295
+
296
+### Auto-selection preferences
297
+
298
+When `auto=True` and no explicit model is set, providers are tried in this order:
299
+
300
+**Vision:** Gemini (`gemini-2.5-flash`) > OpenAI (`gpt-4o-mini`) > Anthropic (`claude-haiku-4-5-20251001`)
301
+
302
+**Chat:** Anthropic (`claude-haiku-4-5-20251001`) > OpenAI (`gpt-4o-mini`) > Gemini (`gemini-2.5-flash`)
303
+
304
+**Transcription:** OpenAI (`whisper-1`) > Gemini (`gemini-2.5-flash`)
305
+
306
+If no API-key-based provider is available, Ollama is tried as a fallback.
307
+
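The selection logic above amounts to walking an ordered preference list and taking the first configured provider. A sketch of that idea, using the documented preference order (illustrative, not the manager's actual code):

```python
# Preference lists mirroring the documented auto-selection order.
PREFERENCES = {
    "vision": [("gemini", "gemini-2.5-flash"),
               ("openai", "gpt-4o-mini"),
               ("anthropic", "claude-haiku-4-5-20251001")],
    "chat": [("anthropic", "claude-haiku-4-5-20251001"),
             ("openai", "gpt-4o-mini"),
             ("gemini", "gemini-2.5-flash")],
}

def pick(capability: str, available: set[str]) -> tuple[str, str]:
    """Return the first preferred provider that is configured; Ollama last."""
    for provider, model in PREFERENCES[capability]:
        if provider in available:
            return provider, model
    return "ollama", ""  # fall back to a local Ollama server

print(pick("chat", {"openai", "gemini"}))  # ('openai', 'gpt-4o-mini')
print(pick("vision", set()))               # ('ollama', '')
```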
308
+### chat()
309
+
310
+```python
311
+def chat(
312
+ self,
313
+ messages: list[dict],
314
+ max_tokens: int = 4096,
315
+ temperature: float = 0.7,
316
+) -> str
317
+```
318
+
319
+Send a chat completion to the best available provider. Automatically resolves which provider and model to use.
320
+
321
+**Parameters:**
322
+
323
+| Parameter | Type | Default | Description |
324
+|---|---|---|---|
325
+| `messages` | `list[dict]` | *required* | OpenAI-format messages |
326
+| `max_tokens` | `int` | `4096` | Maximum response tokens |
327
+| `temperature` | `float` | `0.7` | Sampling temperature |
328
+
329
+**Returns:** `str` -- assistant response text.
330
+
331
+**Raises:** `RuntimeError` if no provider is available for the `chat` capability.
332
+
333
+### analyze_image()
334
+
335
+```python
336
+def analyze_image(
337
+ self,
338
+ image_bytes: bytes,
339
+ prompt: str,
340
+ max_tokens: int = 4096,
341
+) -> str
342
+```
343
+
344
+Analyze an image using the best available vision provider.
345
+
346
+**Returns:** `str` -- analysis text.
347
+
348
+**Raises:** `RuntimeError` if no provider is available for the `vision` capability.
349
+
350
+### transcribe_audio()
351
+
352
+```python
353
+def transcribe_audio(
354
+ self,
355
+ audio_path: str | Path,
356
+ language: Optional[str] = None,
357
+ speaker_hints: Optional[list[str]] = None,
358
+) -> dict
359
+```
360
+
361
+Transcribe audio. Prefers local Whisper (no file size limits, no API costs) when available, falling back to API-based transcription.
362
+
363
+**Parameters:**
364
+
365
+| Parameter | Type | Default | Description |
366
+|---|---|---|---|
367
+| `audio_path` | `str \| Path` | *required* | Path to the audio file |
368
+| `language` | `Optional[str]` | `None` | Language hint |
369
+| `speaker_hints` | `Optional[list[str]]` | `None` | Speaker names for better recognition |
370
+
371
+**Returns:** `dict` -- transcription result with `text`, `segments`, `duration`.
372
+
373
+**Local Whisper:** If `transcription_model` is unset or starts with `"whisper-local"`, the manager tries local Whisper first. Use `"whisper-local:large"` to specify a model size.
374
+
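The `"whisper-local[:size]"` convention can be parsed with a simple split. A sketch of that parsing, assuming a `"base"` default size when none is given (the actual default is not documented here):

```python
from typing import Optional

def whisper_size(transcription_model: Optional[str]) -> Optional[str]:
    """Illustrative parse of the "whisper-local[:size]" convention."""
    if transcription_model is None:
        return "base"  # assumed default size -- not documented here
    if transcription_model.startswith("whisper-local"):
        _, _, size = transcription_model.partition(":")
        return size or "base"
    return None  # not a local-Whisper model; use API transcription

print(whisper_size("whisper-local:large"))  # large
print(whisper_size("whisper-1"))            # None
```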
375
+### get_models_used()
376
+
377
+```python
378
+def get_models_used(self) -> dict[str, str]
379
+```
380
+
381
+Return a dict mapping capability to `"provider/model"` string for tracking purposes.
382
+
383
+```python
384
+pm = ProviderManager()
385
+print(pm.get_models_used())
386
+# {"vision": "gemini/gemini-2.5-flash", "chat": "anthropic/claude-haiku-4-5-20251001", ...}
387
+```
388
+
389
+### Usage examples
390
+
391
+```python
392
+from video_processor.providers.manager import ProviderManager
393
+
394
+# Auto-select best providers
395
+pm = ProviderManager()
396
+
397
+# Force everything through one provider
398
+pm = ProviderManager(provider="openai")
399
+
400
+# Explicit model selection
401
+pm = ProviderManager(
402
+ vision_model="gpt-4o",
403
+ chat_model="claude-haiku-4-5-20251001",
404
+ transcription_model="whisper-local:large",
405
+)
406
+
407
+# Chat completion
408
+response = pm.chat([
409
+ {"role": "user", "content": "Summarize this meeting transcript..."}
410
+])
411
+
412
+# Image analysis
413
+with open("diagram.png", "rb") as f:
414
+ analysis = pm.analyze_image(f.read(), "Describe this architecture diagram")
415
+
416
+# Transcription with speaker hints
417
+result = pm.transcribe_audio(
418
+ "meeting.mp3",
419
+ language="en",
420
+ speaker_hints=["Alice", "Bob", "Charlie"],
421
+)
422
+
423
+# Check usage
424
+print(pm.usage.summary())
425
+```
426
+
427
+---
428
+
429
+## discover_available_models()
430
+
431
+```python
432
+from video_processor.providers.discovery import discover_available_models
433
+```
434
+
435
+```python
436
+def discover_available_models(
437
+ api_keys: Optional[dict[str, str]] = None,
438
+ force_refresh: bool = False,
439
+) -> list[ModelInfo]
440
+```
441
+
442
+Discover available models from all configured providers. For each provider with a valid API key, calls `list_models()` and returns a unified, sorted list.
443
+
444
+**Parameters:**
445
+
446
+| Parameter | Type | Default | Description |
447
+|---|---|---|---|
448
+| `api_keys` | `Optional[dict[str, str]]` | `None` | Override API keys (defaults to environment variables) |
449
+| `force_refresh` | `bool` | `False` | Force re-discovery, ignoring the session cache |
450
+
451
+**Returns:** `list[ModelInfo]` -- all discovered models, sorted by provider then model ID.
452
+
453
+**Caching:** Results are cached for the session. Use `force_refresh=True` or `clear_discovery_cache()` to refresh.
454
+
455
+```python
456
+from video_processor.providers.discovery import (
457
+ discover_available_models,
458
+ clear_discovery_cache,
459
+)
460
+
461
+# Discover models using environment variables
462
+models = discover_available_models()
463
+for m in models:
464
+ print(f"{m.provider}/{m.id} - {m.capabilities}")
465
+
466
+# Force refresh
467
+models = discover_available_models(force_refresh=True)
468
+
469
+# Override API keys
470
+models = discover_available_models(api_keys={
471
+ "openai": "sk-...",
472
+ "anthropic": "sk-ant-...",
473
+})
474
+
475
+# Clear cache
476
+clear_discovery_cache()
477
+```
478
+
479
+### clear_discovery_cache()
480
+
481
+```python
482
+def clear_discovery_cache() -> None
483
+```
484
+
485
+Clear the cached model list, forcing the next `discover_available_models()` call to re-query providers.
486
+
487
+---
488
+
489
+## Built-in Providers
490
+
491
+The following providers are registered automatically when the provider system initializes:
492
+
493
+| Provider | Environment Variable | Capabilities | Default Chat Model |
494
+|---|---|---|---|
495
+| `openai` | `OPENAI_API_KEY` | chat, vision, audio | `gpt-4o-mini` |
496
+| `anthropic` | `ANTHROPIC_API_KEY` | chat, vision | `claude-haiku-4-5-20251001` |
497
+| `gemini` | `GEMINI_API_KEY` | chat, vision, audio | `gemini-2.5-flash` |
498
+| `ollama` | *(none -- checks server)* | chat, vision | *(depends on installed models)* |
499
+| `together` | `TOGETHER_API_KEY` | chat | *(varies)* |
500
+| `fireworks` | `FIREWORKS_API_KEY` | chat | *(varies)* |
501
+| `cerebras` | `CEREBRAS_API_KEY` | chat | *(varies)* |
502
+| `xai` | `XAI_API_KEY` | chat | *(varies)* |
503
+| `azure` | `AZURE_OPENAI_API_KEY` | chat, vision | *(varies)* |
8504
9505
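Availability follows from the table above: a provider is usable when its API-key variable is set (Ollama, which needs no key, is checked against its server instead). A quick way to see which key-based providers are configured — a sketch, not the registry's code:

```python
import os

# Env vars from the table above (key-based providers only; Ollama omitted).
PROVIDER_ENV_VARS = {
    "openai": "OPENAI_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
    "gemini": "GEMINI_API_KEY",
    "together": "TOGETHER_API_KEY",
}

def configured_providers(env=os.environ) -> list[str]:
    """Names of providers whose API-key variable is set and non-empty."""
    return [name for name, var in PROVIDER_ENV_VARS.items() if env.get(var)]

print(configured_providers({"OPENAI_API_KEY": "sk-test"}))  # ['openai']
```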
ADDED docs/api/sources.md
496 | `anthropic` | `ANTHROPIC_API_KEY` | chat, vision | `claude-haiku-4-5-20251001` |
497 | `gemini` | `GEMINI_API_KEY` | chat, vision, audio | `gemini-2.5-flash` |
498 | `ollama` | *(none -- checks server)* | chat, vision | *(depends on installed models)* |
499 | `together` | `TOGETHER_API_KEY` | chat | *(varies)* |
500 | `fireworks` | `FIREWORKS_API_KEY` | chat | *(varies)* |
501 | `cerebras` | `CEREBRAS_API_KEY` | chat | *(varies)* |
502 | `xai` | `XAI_API_KEY` | chat | *(varies)* |
503 | `azure` | `AZURE_OPENAI_API_KEY` | chat, vision | *(varies)* |
504
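As a quick way to see which of these providers will register in your environment, you can check the documented environment variables directly. This is a minimal sketch using only the table above; `ollama` is omitted because it probes a local server rather than reading an API key:

```python
import os

# Environment variables from the Built-in Providers table above.
PROVIDER_ENV_VARS = {
    "openai": "OPENAI_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
    "gemini": "GEMINI_API_KEY",
    "together": "TOGETHER_API_KEY",
    "fireworks": "FIREWORKS_API_KEY",
    "cerebras": "CEREBRAS_API_KEY",
    "xai": "XAI_API_KEY",
    "azure": "AZURE_OPENAI_API_KEY",
}


def configured_providers() -> list[str]:
    """Return the providers whose API key is set in the environment."""
    return [name for name, var in PROVIDER_ENV_VARS.items() if os.environ.get(var)]


print(configured_providers())
```

Providers missing from this list will simply be skipped during discovery; they do not cause errors.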
ADDED docs/api/sources.md
--- a/docs/api/sources.md
+++ b/docs/api/sources.md
@@ -0,0 +1,281 @@
# Sources API Reference

::: video_processor.sources.base

---

## Overview

The sources module provides a unified interface for fetching content from cloud services, local applications, and the web. All sources implement the `BaseSource` abstract class, providing consistent `authenticate()`, `list_videos()`, and `download()` methods.

Sources are lazy-loaded to avoid pulling in optional dependencies at import time. You can import any source directly from `video_processor.sources` and the correct module will be loaded on demand.

---

## BaseSource (ABC)

```python
from video_processor.sources import BaseSource
```

Abstract base class that all source integrations implement. Defines the standard three-step workflow: authenticate, list, download.

### authenticate()

```python
@abstractmethod
def authenticate(self) -> bool
```

Authenticate with the cloud provider or service. Uses the auth strategy defined for the source (OAuth, API key, local access, etc.).

**Returns:** `bool` -- `True` on successful authentication, `False` on failure.

### list_videos()

```python
@abstractmethod
def list_videos(
    self,
    folder_id: Optional[str] = None,
    folder_path: Optional[str] = None,
    patterns: Optional[List[str]] = None,
) -> List[SourceFile]
```

List available video files (or other content, depending on the source).

**Parameters:**

| Parameter | Type | Default | Description |
|---|---|---|---|
| `folder_id` | `Optional[str]` | `None` | Provider-specific folder/container identifier |
| `folder_path` | `Optional[str]` | `None` | Path within the source (e.g., folder name) |
| `patterns` | `Optional[List[str]]` | `None` | File name glob patterns to filter results |

**Returns:** `List[SourceFile]` -- available files matching the criteria.

### download()

```python
@abstractmethod
def download(
    self,
    file: SourceFile,
    destination: Path,
) -> Path
```

Download a single file to a local path.

**Parameters:**

| Parameter | Type | Description |
|---|---|---|
| `file` | `SourceFile` | File descriptor from `list_videos()` |
| `destination` | `Path` | Local destination path |

**Returns:** `Path` -- the local path where the file was saved.

### download_all()

```python
def download_all(
    self,
    files: List[SourceFile],
    destination_dir: Path,
) -> List[Path]
```

Download multiple files to a directory, preserving subfolder structure from `SourceFile.path`. This is a concrete method provided by the base class.

**Parameters:**

| Parameter | Type | Description |
|---|---|---|
| `files` | `List[SourceFile]` | Files to download |
| `destination_dir` | `Path` | Base directory for downloads (created if needed) |

**Returns:** `List[Path]` -- local paths of successfully downloaded files. Failed downloads are logged and skipped.

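The behavior described above -- subfolder preservation plus skip-on-failure -- can be sketched roughly as follows. This is a simplified illustration of the documented contract, not the library's actual implementation:

```python
import logging
from pathlib import Path

logger = logging.getLogger(__name__)


def download_all(self, files, destination_dir: Path) -> list[Path]:
    """Sketch of the concrete base-class method: download each file,
    recreating the subfolder structure from SourceFile.path."""
    downloaded: list[Path] = []
    for f in files:
        # Use the source-relative path when available, else just the name.
        relative = Path(f.path) if f.path else Path(f.name)
        target = destination_dir / relative
        target.parent.mkdir(parents=True, exist_ok=True)
        try:
            downloaded.append(self.download(f, target))
        except Exception as exc:  # failed downloads are logged and skipped
            logger.warning("Failed to download %s: %s", f.name, exc)
    return downloaded
```

Note that a failed `download()` does not abort the loop; the returned list contains only the files that succeeded.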
---

## SourceFile

```python
from video_processor.sources import SourceFile
```

Pydantic model describing a file available in a cloud source.

| Field | Type | Default | Description |
|---|---|---|---|
| `name` | `str` | *required* | File name |
| `id` | `str` | *required* | Provider-specific file identifier |
| `size_bytes` | `Optional[int]` | `None` | File size in bytes |
| `mime_type` | `Optional[str]` | `None` | MIME type (e.g., `"video/mp4"`) |
| `modified_at` | `Optional[str]` | `None` | Last modified timestamp |
| `path` | `Optional[str]` | `None` | Path within the source folder (used for subfolder structure in `download_all`) |

```json
{
  "name": "sprint-review-2026-03-01.mp4",
  "id": "abc123def456",
  "size_bytes": 524288000,
  "mime_type": "video/mp4",
  "modified_at": "2026-03-01T14:30:00Z",
  "path": "recordings/march/sprint-review-2026-03-01.mp4"
}
```

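In Python terms, the field table corresponds to a Pydantic model along these lines. This is an equivalent sketch for illustration (e.g. for building test fixtures); import the real class from `video_processor.sources`:

```python
from typing import Optional

from pydantic import BaseModel


class SourceFile(BaseModel):
    """Field-for-field sketch of the documented model."""
    name: str
    id: str
    size_bytes: Optional[int] = None
    mime_type: Optional[str] = None
    modified_at: Optional[str] = None
    path: Optional[str] = None


# Only the two required fields are needed; the rest default to None.
f = SourceFile(name="demo.mp4", id="abc123")
```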
---

## Lazy Loading Pattern

All sources are lazy-loaded via `__getattr__` in the package `__init__.py`. This means importing `video_processor.sources` does not pull in any external dependencies (e.g., `google-auth`, `msal`, `notion-client`). The actual module is loaded only when you access the class.

```python
# This import is instant -- no dependencies loaded
from video_processor.sources import ZoomSource

# The zoom_source module (and its dependencies) are loaded here
source = ZoomSource()
```

---

## Available Sources

### Cloud Recordings

Sources for fetching recorded meetings from video conferencing platforms.

| Source | Class | Auth Method | Description |
|---|---|---|---|
| Zoom | `ZoomSource` | OAuth / Server-to-Server | List and download Zoom cloud recordings |
| Google Meet | `MeetRecordingSource` | OAuth (Google) | List and download Google Meet recordings from Drive |
| Microsoft Teams | `TeamsRecordingSource` | OAuth (Microsoft) | List and download Teams meeting recordings |

### Cloud Storage and Workspace

Sources for accessing files stored in cloud platforms.

| Source | Class | Auth Method | Description |
|---|---|---|---|
| Google Drive | `GoogleDriveSource` | OAuth (Google) | Files from Google Drive |
| Google Workspace | `GWSSource` | OAuth (Google) | Google Docs, Sheets, Slides |
| Microsoft 365 | `M365Source` | OAuth (Microsoft) | OneDrive, SharePoint files |
| Notion | `NotionSource` | OAuth / API key | Notion pages and databases |
| GitHub | `GitHubSource` | OAuth / API token | Repository files, issues, discussions |
| Dropbox | `DropboxSource` | OAuth / access token | *(via auth config)* |

### Notes Applications

Sources for local and cloud-based note-taking apps.

| Source | Class | Auth Method | Description |
|---|---|---|---|
| Apple Notes | `AppleNotesSource` | Local (macOS) | Notes from Apple Notes.app |
| Obsidian | `ObsidianSource` | Local filesystem | Markdown files from Obsidian vaults |
| Logseq | `LogseqSource` | Local filesystem | Pages from Logseq graphs |
| OneNote | `OneNoteSource` | OAuth (Microsoft) | Microsoft OneNote notebooks |
| Google Keep | `GoogleKeepSource` | OAuth (Google) | Google Keep notes |

### Web and Content

Sources for fetching content from the web.

| Source | Class | Auth Method | Description |
|---|---|---|---|
| YouTube | `YouTubeSource` | API key / OAuth | YouTube video metadata and transcripts |
| Web | `WebSource` | None | General web page content extraction |
| RSS | `RSSSource` | None | RSS/Atom feed entries |
| Podcast | `PodcastSource` | None | Podcast episodes from RSS feeds |
| arXiv | `ArxivSource` | None | Academic papers from arXiv |
| Hacker News | `HackerNewsSource` | None | Hacker News posts and comments |
| Reddit | `RedditSource` | API credentials | Reddit posts and comments |
| Twitter/X | `TwitterSource` | API credentials | Tweets and threads |

---

## Auth Integration

Most sources use PlanOpticon's unified auth system (see [Auth API](auth.md)). The typical pattern within a source implementation:

```python
from video_processor.auth import get_auth_manager

class MySource(BaseSource):
    def __init__(self):
        self._token = None

    def authenticate(self) -> bool:
        manager = get_auth_manager("my_service")
        if manager:
            token = manager.get_token()
            if token:
                self._token = token
                return True
        return False

    def list_videos(self, **kwargs) -> list[SourceFile]:
        if not self._token:
            raise RuntimeError("Not authenticated. Call authenticate() first.")
        # Use self._token to call the API
        ...
```

---

## Usage Examples

### Listing and downloading Zoom recordings

```python
from pathlib import Path
from video_processor.sources import ZoomSource

source = ZoomSource()
if source.authenticate():
    recordings = source.list_videos()
    for rec in recordings:
        print(f"{rec.name} ({rec.size_bytes} bytes)")

    # Download all to a local directory
    paths = source.download_all(recordings, Path("./downloads"))
```

### Fetching from multiple sources

```python
from pathlib import Path
from video_processor.sources import GoogleDriveSource, NotionSource

# Google Drive
gdrive = GoogleDriveSource()
if gdrive.authenticate():
    files = gdrive.list_videos(
        folder_path="Meeting Recordings",
        patterns=["*.mp4", "*.webm"],
    )
    gdrive.download_all(files, Path("./drive-downloads"))

# Notion
notion = NotionSource()
if notion.authenticate():
    pages = notion.list_videos()  # Lists Notion pages
    for page in pages:
        print(f"Page: {page.name}")
```

### YouTube content

```python
from video_processor.sources import YouTubeSource

yt = YouTubeSource()
if yt.authenticate():
    videos = yt.list_videos(folder_path="https://youtube.com/playlist?list=...")
    for v in videos:
        print(f"{v.name} - {v.id}")
```
--- docs/architecture/pipeline.md
+++ docs/architecture/pipeline.md
@@ -1,8 +1,14 @@
11
# Processing Pipeline
2
+
3
+PlanOpticon has four main pipelines: **video analysis**, **document ingestion**, **source connector**, and **export**. Each pipeline can operate independently, and they connect through the shared knowledge graph.
4
+
5
+---
26
37
## Single video pipeline
8
+
9
+The core video analysis pipeline processes a single video file through eight sequential steps with checkpoint/resume support.
410
511
```mermaid
612
sequenceDiagram
713
participant CLI
814
participant Pipeline
@@ -9,49 +15,321 @@
915
participant FrameExtractor
1016
participant AudioExtractor
1117
participant Provider
1218
participant DiagramAnalyzer
1319
participant KnowledgeGraph
20
+ participant Exporter
1421
1522
CLI->>Pipeline: process_single_video()
23
+
24
+ Note over Pipeline: Step 1: Extract frames
1625
Pipeline->>FrameExtractor: extract_frames()
1726
Note over FrameExtractor: Change detection + periodic capture (every 30s)
27
+ FrameExtractor-->>Pipeline: frame_paths[]
28
+
29
+ Note over Pipeline: Step 2: Filter people frames
1830
Pipeline->>Pipeline: filter_people_frames()
1931
Note over Pipeline: OpenCV face detection removes webcam/people frames
32
+
33
+ Note over Pipeline: Step 3: Extract + transcribe audio
2034
Pipeline->>AudioExtractor: extract_audio()
2135
Pipeline->>Provider: transcribe_audio()
36
+ Note over Provider: Supports speaker hints via --speakers flag
37
+
38
+ Note over Pipeline: Step 4: Analyze visuals
2239
Pipeline->>DiagramAnalyzer: process_frames()
23
-
24
- loop Each frame
40
+ loop Each frame (up to 10 standard / 20 comprehensive)
2541
DiagramAnalyzer->>Provider: classify (vision)
2642
alt High confidence diagram
2743
DiagramAnalyzer->>Provider: full analysis
44
+ Note over Provider: Extract description, text, mermaid, chart data
2845
else Medium confidence
2946
DiagramAnalyzer-->>Pipeline: screengrab fallback
3047
end
3148
end
3249
50
+ Note over Pipeline: Step 5: Build knowledge graph
51
+ Pipeline->>KnowledgeGraph: register_source()
3352
Pipeline->>KnowledgeGraph: process_transcript()
3453
Pipeline->>KnowledgeGraph: process_diagrams()
54
+ Note over KnowledgeGraph: Writes knowledge_graph.db (SQLite) + .json
55
+
56
+ Note over Pipeline: Step 6: Extract key points + action items
3557
Pipeline->>Provider: extract key points
3658
Pipeline->>Provider: extract action items
37
- Pipeline->>Pipeline: generate reports
38
- Pipeline->>Pipeline: export formats
59
+
60
+ Note over Pipeline: Step 7: Generate report
61
+ Pipeline->>Pipeline: generate markdown report
62
+ Note over Pipeline: Includes mermaid diagrams, tables, cross-references
63
+
64
+ Note over Pipeline: Step 8: Export formats
65
+ Pipeline->>Exporter: export_all_formats()
66
+ Note over Exporter: HTML report, PDF, SVG/PNG renderings, chart reproductions
67
+
3968
Pipeline-->>CLI: VideoManifest
4069
```
70
+
71
+### Pipeline steps in detail
72
+
73
+| Step | Name | Checkpointable | Description |
74
+|------|------|----------------|-------------|
75
+| 1 | Extract frames | Yes | Change detection + periodic capture. Skipped if `frames/frame_*.jpg` exist on disk. |
76
+| 2 | Filter people frames | No | Inline with step 1. OpenCV face detection removes webcam frames. |
77
+| 3 | Extract + transcribe audio | Yes | Skipped if `transcript/transcript.json` exists. Speaker hints passed if `--speakers` provided. |
78
+| 4 | Analyze visuals | Yes | Skipped if `diagrams/` is populated. Evenly samples frames (not just first N). |
79
+| 5 | Build knowledge graph | Yes | Skipped if `results/knowledge_graph.db` exists. Registers source, processes transcript and diagrams. |
80
+| 6 | Extract key points + actions | Yes | Skipped if `results/key_points.json` and `results/action_items.json` exist. |
81
+| 7 | Generate report | Yes | Skipped if `results/analysis.md` exists. |
82
+| 8 | Export formats | No | Always runs. Renders mermaid to SVG/PNG, reproduces charts, generates HTML/PDF. |
83
+
84
+---
4185
4286
## Batch pipeline
4387
44
-The batch command wraps the single-video pipeline:
88
+The batch pipeline wraps the single-video pipeline and adds cross-video knowledge graph merging.
89
+
90
+```mermaid
91
+flowchart TD
92
+ A[Scan input directory] --> B[Match video files by pattern]
93
+ B --> C{For each video}
94
+ C --> D[process_single_video]
95
+ D --> E{Success?}
96
+ E -->|Yes| F[Collect manifest + KG]
97
+ E -->|No| G[Log error, continue]
98
+ F --> H[Next video]
99
+ G --> H
100
+ H --> C
101
+ C -->|All done| I[Merge knowledge graphs]
102
+ I --> J[Fuzzy matching + conflict resolution]
103
+ J --> K[Generate batch summary]
104
+ K --> L[Write batch manifest]
105
+ L --> M[batch_manifest.json + batch_summary.md + merged KG]
106
+```
107
+
108
+### Knowledge graph merge strategy
109
+
110
+During batch merging, `KnowledgeGraph.merge()` applies:
111
+
112
+1. **Case-insensitive exact matching** for entity names
113
+2. **Fuzzy matching** via `SequenceMatcher` (threshold >= 0.85) for near-duplicates
114
+3. **Type conflict resolution** using a specificity ranking (e.g., `technology` > `concept`)
115
+4. **Description union** across all sources
116
+5. **Relationship deduplication** by (source, target, type) tuple
117
+
118
+---
119
+
120
+## Document ingestion pipeline
121
+
122
+The document ingestion pipeline processes files (Markdown, plaintext, PDF) into knowledge graphs without video analysis.
123
+
124
+```mermaid
125
+flowchart TD
126
+ A[Input: file or directory] --> B{File or directory?}
127
+ B -->|File| C[get_processor by extension]
128
+ B -->|Directory| D[Glob for supported extensions]
129
+ D --> E{Recursive?}
130
+ E -->|Yes| F[rglob all files]
131
+ E -->|No| G[glob top-level only]
132
+ F --> H[For each file]
133
+ G --> H
134
+ H --> C
135
+ C --> I[DocumentProcessor.process]
136
+ I --> J[DocumentChunk list]
137
+ J --> K[Register source in KG]
138
+ K --> L[Add chunks as content]
139
+ L --> M[KG extracts entities + relationships]
140
+ M --> N[knowledge_graph.db]
141
+```
142
+
143
+### Supported document types
144
+
145
+| Extension | Processor | Notes |
146
+|-----------|-----------|-------|
147
+| `.md` | `MarkdownProcessor` | Splits by headings into sections |
148
+| `.txt` | `PlaintextProcessor` | Splits into fixed-size chunks |
149
+| `.pdf` | `PdfProcessor` | Requires `pymupdf` or `pdfplumber`. Falls back gracefully between libraries. |
150
+
151
+### Adding documents to an existing graph
152
+
153
+The `--db-path` flag lets you ingest documents into an existing knowledge graph:
154
+
155
+```bash
156
+planopticon ingest spec.md --db-path existing.db
157
+planopticon ingest ./docs/ -o ./output --recursive
158
+```
159
+
160
+---
161
+
162
+## Source connector pipeline
163
+
164
+Source connectors fetch content from cloud services, note-taking apps, and web sources. Each source implements the `BaseSource` ABC with three methods: `authenticate()`, `list_videos()`, and `download()`.
165
+
166
+```mermaid
167
+flowchart TD
168
+ A[Source command] --> B[Authenticate with provider]
169
+ B --> C{Auth success?}
170
+ C -->|No| D[Error: check credentials]
171
+ C -->|Yes| E[List files in folder]
172
+ E --> F[Filter by pattern / type]
173
+ F --> G[Download to local path]
174
+ G --> H{Analyze or ingest?}
175
+ H -->|Video| I[process_single_video / batch]
176
+ H -->|Document| J[ingest_file / ingest_directory]
177
+ I --> K[Knowledge graph]
178
+ J --> K
179
+```
180
+
181
+### Available sources
182
+
183
+PlanOpticon includes connectors for:
184
+
185
+| Category | Sources |
186
+|----------|---------|
187
+| Cloud storage | Google Drive, S3, Dropbox |
188
+| Meeting recordings | Zoom, Google Meet, Microsoft Teams |
189
+| Productivity suites | Google Workspace (Docs/Sheets/Slides), Microsoft 365 (SharePoint/OneDrive/OneNote) |
190
+| Note-taking apps | Obsidian, Logseq, Apple Notes, Google Keep, Notion |
191
+| Web sources | YouTube, Web (URL), RSS, Podcasts |
192
+| Developer platforms | GitHub, arXiv |
193
+| Social media | Reddit, Twitter/X, Hacker News |
194
+
195
+Each source authenticates via environment variables (API keys, OAuth tokens) specific to the provider.
196
+
197
+---
198
+
199
+## Planning agent pipeline
200
+
201
+The planning agent consumes a knowledge graph and uses registered skills to generate planning artifacts.
202
+
203
+```mermaid
204
+flowchart TD
205
+ A[Knowledge graph] --> B[Load into AgentContext]
206
+ B --> C[GraphQueryEngine]
207
+ C --> D[Taxonomy classification]
208
+ D --> E[Agent orchestrator]
209
+ E --> F{Select skill}
210
+ F --> G[ProjectPlan skill]
211
+ F --> H[PRD skill]
212
+ F --> I[Roadmap skill]
213
+ F --> J[TaskBreakdown skill]
214
+ F --> K[DocGenerator skill]
215
+ F --> L[WikiGenerator skill]
216
+ F --> M[NotesExport skill]
217
+ F --> N[ArtifactExport skill]
218
+ F --> O[GitHubIntegration skill]
219
+ F --> P[RequirementsChat skill]
220
+ G --> Q[Artifact output]
221
+ H --> Q
222
+ I --> Q
223
+ J --> Q
224
+ K --> Q
225
+ L --> Q
226
+ M --> Q
227
+ N --> Q
228
+ O --> Q
229
+ P --> Q
230
+ Q --> R[Write to disk / push to service]
231
+```
232
+
233
+### Skill execution flow
234
+
235
+1. The `AgentContext` is populated with the knowledge graph, query engine, provider manager, and any planning entities from taxonomy classification
236
+2. Each `Skill` checks `can_execute()` against the context (requires at minimum a knowledge graph and provider manager)
237
+3. The skill's `execute()` method generates an `Artifact` with a name, content, type, and format
238
+4. Artifacts are collected and can be exported to disk or pushed to external services (GitHub issues, wiki pages, etc.)
239
+
+---
+
+## Export pipeline
+
+The export pipeline converts knowledge graphs and analysis artifacts into various output formats.
+
+```mermaid
+flowchart TD
+    A[knowledge_graph.db] --> B{Export command}
+    B --> C[export markdown]
+    B --> D[export obsidian]
+    B --> E[export notion]
+    B --> F[export exchange]
+    B --> G[wiki generate]
+    B --> H[kg convert]
+    C --> I[7 document types + entity briefs + CSV]
+    D --> J[Obsidian vault with frontmatter + wiki-links]
+    E --> K[Notion-compatible markdown + CSV database]
+    F --> L[PlanOpticonExchange JSON payload]
+    G --> M[GitHub wiki pages + sidebar + home]
+    H --> N[Convert between .db / .json / .graphml / .csv]
+```
+
+All export commands accept a `knowledge_graph.db` (or `.json`) path as input. No API key is required for template-based exports (markdown, obsidian, notion, wiki, exchange, convert). Only the planning agent skills that generate new content require a provider.
+
+---
+
+## How pipelines connect
+
+```mermaid
+flowchart LR
+    V[Video files] --> VP[Video Pipeline]
+    D[Documents] --> DI[Document Ingestion]
+    S[Cloud Sources] --> SC[Source Connectors]
+    SC --> V
+    SC --> D
+    VP --> KG[(knowledge_graph.db)]
+    DI --> KG
+    KG --> QE[Query Engine]
+    KG --> EP[Export Pipeline]
+    KG --> PA[Planning Agent]
+    PA --> AR[Artifacts]
+    AR --> EP
+```
+
+All pipelines converge on the knowledge graph as the central data store. The knowledge graph is the shared interface between ingestion (video or document), querying, exporting, and planning.

-1. Scan input directory for matching video files
-2. For each video: `process_single_video()` with error handling
-3. Merge knowledge graphs across all completed videos
-4. Generate batch summary with aggregated stats
-5. Write batch manifest
+---

## Error handling

-- Individual video failures don't stop the batch
-- Failed videos are logged with error details in the manifest
-- Diagram analysis failures fall back to screengrabs
-- LLM extraction failures return empty results gracefully
+Error handling follows consistent patterns across all pipelines:
+
+| Scenario | Behavior |
+|----------|----------|
+| Video fails in batch | Batch continues. Failed video recorded in manifest with error details. |
+| Diagram analysis fails | Falls back to screengrab (captioned screenshot). |
+| LLM extraction fails | Returns empty results gracefully. Key points and action items will be empty arrays. |
+| Document processor not found | Raises `ValueError` with list of supported extensions. |
+| Source authentication fails | Returns `False` from `authenticate()`. CLI prints error message. |
+| Checkpoint file found | Step is skipped entirely and results are loaded from disk. |
+| Progress callback fails | Warning logged. Pipeline continues without progress updates. |
+
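
The "LLM extraction fails" row boils down to a parse-or-empty pattern. A minimal sketch of that behavior (illustrative only; the actual extraction code may differ):

```python
import json
import logging

logger = logging.getLogger(__name__)

def parse_llm_list(raw: str) -> list:
    """Parse an LLM response expected to be a JSON array.

    Returns an empty list instead of raising, so a failed extraction
    yields empty key points / action items rather than aborting the run.
    """
    try:
        result = json.loads(raw)
        return result if isinstance(result, list) else []
    except (json.JSONDecodeError, TypeError):
        logger.warning("LLM extraction returned unparseable output")
        return []
```

Malformed JSON and well-formed-but-wrong-shape responses both degrade to `[]`, matching the table's "empty arrays" behavior.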
+---
+
+## Progress callback system
+
+The pipeline supports a `ProgressCallback` protocol for real-time progress tracking. This is used by the CLI's progress bars and can be implemented by external integrations (web UIs, CI systems, etc.).
+
+```python
+from video_processor.models import ProgressCallback
+
+class MyCallback:
+    def on_step_start(self, step: str, index: int, total: int) -> None:
+        print(f"Starting step {index}/{total}: {step}")
+
+    def on_step_complete(self, step: str, index: int, total: int) -> None:
+        print(f"Completed step {index}/{total}: {step}")
+
+    def on_progress(self, step: str, percent: float, message: str = "") -> None:
+        print(f"  {step}: {percent:.0%} {message}")
+```
+
+Pass the callback to `process_single_video()`:
+
+```python
+from video_processor.pipeline import process_single_video
+
+manifest = process_single_video(
+    input_path="recording.mp4",
+    output_dir="./output",
+    progress_callback=MyCallback(),
+)
+```
+
+The callback methods are called within a try/except wrapper, so a failing callback never interrupts the pipeline. If a callback method raises an exception, a warning is logged and processing continues.

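
That try/except wrapper can be sketched as a small dispatch helper (an assumed shape, not the project's actual implementation):

```python
import logging

logger = logging.getLogger(__name__)

def safe_notify(callback, method: str, *args, **kwargs) -> None:
    """Invoke a callback method, logging (not raising) on failure."""
    if callback is None:
        return
    try:
        getattr(callback, method)(*args, **kwargs)
    except Exception as exc:  # callbacks must never kill the run
        logger.warning("Progress callback %s failed: %s", method, exc)

class BadCallback:
    def on_step_start(self, step, index, total):
        raise RuntimeError("boom")

# Logs a warning instead of propagating the RuntimeError
safe_notify(BadCallback(), "on_step_start", "frames", 1, 8)
```

Swallowing callback errors is a deliberate trade-off: progress reporting is best-effort, while the pipeline's own work must complete.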
--- docs/contributing.md
+++ docs/contributing.md
@@ -10,54 +10,485 @@
pip install -e ".[dev]"
```

## Running tests

+PlanOpticon has 822+ tests covering providers, pipeline stages, document processors, knowledge graph operations, exporters, skills, and CLI commands.
+
```bash
# Run all tests
pytest tests/ -v

# Run with coverage
pytest tests/ --cov=video_processor --cov-report=html

# Run a specific test file
pytest tests/test_models.py -v
+
+# Run tests matching a keyword
+pytest tests/ -k "test_knowledge_graph" -v
+
+# Run only fast tests (skip slow integration tests)
+pytest tests/ -m "not slow" -v
+```
+
+### Test conventions
+
+- All tests live in the `tests/` directory, mirroring the `video_processor/` package structure
+- Test files are named `test_<module>.py`
+- Use `pytest` as the test runner -- do not use `unittest.TestCase` unless necessary for specific setup/teardown patterns
+- Mock external API calls. Never make real API calls in tests. Use `unittest.mock.patch` or `pytest-mock` fixtures to mock provider responses.
+- Use `tmp_path` (pytest fixture) for any tests that write files to disk
+- Fixtures shared across test files go in `conftest.py`
+- For testing CLI commands, use `click.testing.CliRunner`
+- For testing provider implementations, mock at the HTTP client level (e.g., patch `requests.post` or the provider's SDK client)
+
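
The `tmp_path` convention in the list above keeps file-writing tests out of the repository. A hypothetical test showing the pattern (pytest injects a fresh `pathlib.Path` per test):

```python
from pathlib import Path

def test_writes_manifest(tmp_path: Path) -> None:
    # tmp_path is a unique, empty directory created by pytest for this test
    manifest = tmp_path / "manifest.json"
    manifest.write_text('{"status": "ok"}')
    assert manifest.read_text() == '{"status": "ok"}'
```

Because the directory is unique per test, parallel runs and reruns never collide, and pytest cleans old directories up automatically.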
+### Mocking patterns
+
+```python
+# Mocking a provider's chat method
+from unittest.mock import MagicMock, patch
+
+def test_key_point_extraction():
+    pm = MagicMock()
+    pm.chat.return_value = '["Point 1", "Point 2"]'
+    result = extract_key_points(pm, "transcript text")
+    assert len(result) == 2
+
+# Mocking an external API at the HTTP level
+@patch("requests.post")
+def test_provider_chat(mock_post):
+    mock_post.return_value.json.return_value = {
+        "choices": [{"message": {"content": "response"}}]
+    }
+    provider = OpenAIProvider(api_key="test")
+    result = provider.chat([{"role": "user", "content": "hello"}])
+    assert result == "response"
```

## Code style

We use:

-- **Ruff** for linting
-- **Black** for formatting (100 char line length)
-- **isort** for import sorting
+- **Ruff** for both linting and formatting (100 char line length)
- **mypy** for type checking
+
+Ruff handles all linting (error, warning, pyflakes, and import sorting rules) and formatting in a single tool. There is no need to run Black or isort separately.

```bash
+# Lint
ruff check video_processor/
-black video_processor/
-isort video_processor/
+
+# Format
+ruff format video_processor/
+
+# Auto-fix lint issues
+ruff check video_processor/ --fix
+
+# Type check
mypy video_processor/ --ignore-missing-imports
```

+### Ruff configuration
+
+The project's `pyproject.toml` configures ruff as follows:
+
+```toml
+[tool.ruff]
+line-length = 100
+target-version = "py310"
+
+[tool.ruff.lint]
+select = ["E", "F", "W", "I"]
+```
+
+The `I` rule set covers import sorting (equivalent to isort), so imports are automatically organized by ruff.
+
## Project structure

-See [Architecture Overview](architecture/overview.md) for the module structure.
+```
+PlanOpticon/
+├── video_processor/
+│   ├── cli/                     # Click CLI commands
+│   │   └── commands.py
+│   ├── providers/               # LLM/API provider implementations
+│   │   ├── base.py              # BaseProvider, ProviderRegistry
+│   │   ├── manager.py           # ProviderManager
+│   │   ├── discovery.py         # Auto-discovery of available providers
+│   │   ├── openai_provider.py
+│   │   ├── anthropic_provider.py
+│   │   ├── gemini_provider.py
+│   │   └── ...                  # 15+ provider implementations
+│   ├── sources/                 # Cloud and web source connectors
+│   │   ├── base.py              # BaseSource, SourceFile
+│   │   ├── google_drive.py
+│   │   ├── zoom_source.py
+│   │   └── ...                  # 20+ source implementations
+│   ├── processors/              # Document processors
+│   │   ├── base.py              # DocumentProcessor, registry
+│   │   ├── ingest.py            # File/directory ingestion
+│   │   ├── markdown_processor.py
+│   │   ├── pdf_processor.py
+│   │   └── __init__.py          # Auto-registration of built-in processors
+│   ├── integrators/             # Knowledge graph and analysis
+│   │   ├── knowledge_graph.py   # KnowledgeGraph class
+│   │   ├── graph_store.py       # SQLite graph storage
+│   │   ├── graph_query.py       # GraphQueryEngine
+│   │   ├── graph_discovery.py   # Auto-find knowledge_graph.db
+│   │   └── taxonomy.py          # Planning taxonomy classifier
+│   ├── agent/                   # Planning agent
+│   │   ├── orchestrator.py      # Agent orchestration
+│   │   └── skills/              # Skill implementations
+│   │       ├── base.py          # Skill ABC, registry, Artifact
+│   │       ├── project_plan.py
+│   │       ├── prd.py
+│   │       ├── roadmap.py
+│   │       ├── task_breakdown.py
+│   │       ├── doc_generator.py
+│   │       ├── wiki_generator.py
+│   │       ├── notes_export.py
+│   │       ├── artifact_export.py
+│   │       ├── github_integration.py
+│   │       ├── requirements_chat.py
+│   │       ├── cli_adapter.py
+│   │       └── __init__.py      # Auto-registration of skills
+│   ├── exporters/               # Output format exporters
+│   │   ├── __init__.py
+│   │   └── markdown.py          # Template-based markdown generation
+│   ├── utils/                   # Shared utilities
+│   │   ├── export.py            # Multi-format export orchestration
+│   │   ├── rendering.py         # Mermaid/chart rendering
+│   │   ├── prompt_templates.py
+│   │   ├── callbacks.py         # Progress callback helpers
+│   │   └── ...
+│   ├── exchange.py              # PlanOpticonExchange format
+│   ├── pipeline.py              # Main video processing pipeline
+│   ├── models.py                # Pydantic data models
+│   └── output_structure.py      # Output directory helpers
+├── tests/                       # 822+ tests
+├── knowledge-base/              # Local-first graph tools
+│   ├── viewer.html              # Self-contained D3.js graph viewer
+│   └── query.py                 # Python query script (NetworkX)
+├── docs/                        # MkDocs documentation
+└── pyproject.toml               # Project configuration
+```
+
+See [Architecture Overview](architecture/overview.md) for a more detailed breakdown of module responsibilities.

## Adding a new provider

+Providers self-register via `ProviderRegistry.register()` at module level. When the provider module is imported, it registers itself automatically.
+
1. Create `video_processor/providers/your_provider.py`
2. Extend `BaseProvider` from `video_processor/providers/base.py`
-3. Implement `chat()`, `analyze_image()`, `transcribe_audio()`, `list_models()`
-4. Register in `video_processor/providers/discovery.py`
-5. Add tests in `tests/test_providers.py`
+3. Implement the four required methods: `chat()`, `analyze_image()`, `transcribe_audio()`, `list_models()`
+4. Call `ProviderRegistry.register()` at module level
+5. Add the import to `video_processor/providers/manager.py` in the lazy-import block
+6. Add tests in `tests/test_providers.py`
+
+### Example provider skeleton
+
+```python
+"""Your provider implementation."""
+
+from video_processor.providers.base import BaseProvider, ModelInfo, ProviderRegistry
+
+
+class YourProvider(BaseProvider):
+    provider_name = "yourprovider"
+
+    def __init__(self, api_key: str | None = None):
+        import os
+        self.api_key = api_key or os.environ.get("YOUR_API_KEY", "")
+
+    def chat(self, messages, max_tokens=4096, temperature=0.7, model=None):
+        # Implement chat completion
+        ...
+
+    def analyze_image(self, image_bytes, prompt, max_tokens=4096, model=None):
+        # Implement image analysis
+        ...
+
+    def transcribe_audio(self, audio_path, language=None, model=None):
+        # Implement audio transcription (or raise NotImplementedError)
+        ...
+
+    def list_models(self):
+        return [ModelInfo(id="your-model", provider="yourprovider", capabilities=["chat"])]
+
+
+# Self-registration at import time
+ProviderRegistry.register(
+    "yourprovider",
+    YourProvider,
+    env_var="YOUR_API_KEY",
+    model_prefixes=["your-"],
+    default_models={"chat": "your-model"},
+)
+```
+
+### OpenAI-compatible providers
+
+For providers that use the OpenAI API format, extend `OpenAICompatibleProvider` instead of `BaseProvider`. This provides default implementations of `chat()`, `analyze_image()`, and `list_models()` -- you only need to configure the base URL and model mappings.
+
+```python
+from video_processor.providers.base import OpenAICompatibleProvider, ProviderRegistry
+
+class YourProvider(OpenAICompatibleProvider):
+    provider_name = "yourprovider"
+    base_url = "https://api.yourprovider.com/v1"
+    env_var = "YOUR_API_KEY"
+
+ProviderRegistry.register("yourprovider", YourProvider, env_var="YOUR_API_KEY")
+```

## Adding a new cloud source

+Source connectors implement the `BaseSource` ABC from `video_processor/sources/base.py`. Authentication is handled per-source, typically via environment variables.
+
1. Create `video_processor/sources/your_source.py`
-2. Implement auth flow and file listing/downloading
-3. Add CLI integration in `video_processor/cli/commands.py`
-4. Add tests and docs
+2. Extend `BaseSource`
+3. Implement `authenticate()`, `list_videos()`, and `download()`
+4. Add the class to the lazy-import map in `video_processor/sources/__init__.py`
+5. Add CLI commands in `video_processor/cli/commands.py` if needed
+6. Add tests and documentation
+
+### Example source skeleton
+
+```python
+"""Your source integration."""
+
+import logging
+import os
+from pathlib import Path
+from typing import List, Optional
+
+from video_processor.sources.base import BaseSource, SourceFile
+
+logger = logging.getLogger(__name__)
+
+
+class YourSource(BaseSource):
+    def __init__(self, api_key: Optional[str] = None):
+        self.api_key = api_key or os.environ.get("YOUR_SOURCE_KEY", "")
+
+    def authenticate(self) -> bool:
+        """Validate credentials. Return True on success."""
+        if not self.api_key:
+            logger.error("API key not set. Set YOUR_SOURCE_KEY env var.")
+            return False
+        # Make a test API call to verify credentials
+        ...
+        return True
+
+    def list_videos(
+        self,
+        folder_id: Optional[str] = None,
+        folder_path: Optional[str] = None,
+        patterns: Optional[List[str]] = None,
+    ) -> List[SourceFile]:
+        """List available video files."""
+        ...
+
+    def download(self, file: SourceFile, destination: Path) -> Path:
+        """Download a single file. Return the local path."""
+        destination.parent.mkdir(parents=True, exist_ok=True)
+        # Download file content to destination
+        ...
+        return destination
+```
+
+### Registering in `__init__.py`
+
+Add your source to the `__all__` list and the `_lazy_map` dictionary in `video_processor/sources/__init__.py`:
+
+```python
+__all__ = [
+    ...
+    "YourSource",
+]
+
+_lazy_map = {
+    ...
+    "YourSource": "video_processor.sources.your_source",
+}
+```
+
+## Adding a new skill
+
+Agent skills extend the `Skill` ABC from `video_processor/agent/skills/base.py` and self-register via `register_skill()`.
+
+1. Create `video_processor/agent/skills/your_skill.py`
+2. Extend `Skill` and set `name` and `description` class attributes
+3. Implement `execute()` to return an `Artifact`
+4. Optionally override `can_execute()` for custom precondition checks
+5. Call `register_skill()` at module level
+6. Add the import to `video_processor/agent/skills/__init__.py`
+7. Add tests
+
+### Example skill skeleton
+
+```python
+"""Your custom skill."""
+
+from video_processor.agent.skills.base import AgentContext, Artifact, Skill, register_skill
+
+
+class YourSkill(Skill):
+    name = "your_skill"
+    description = "Generates a custom artifact from the knowledge graph."
+
+    def execute(self, context: AgentContext, **kwargs) -> Artifact:
+        """Generate the artifact."""
+        kg_data = context.knowledge_graph.to_dict()
+        # Build content from knowledge graph data
+        content = f"# Your Artifact\n\n{len(kg_data.get('entities', []))} entities found."
+        return Artifact(
+            name="your_artifact",
+            content=content,
+            artifact_type="document",
+            format="markdown",
+        )
+
+    def can_execute(self, context: AgentContext) -> bool:
+        """Check prerequisites (default requires KG + provider)."""
+        return context.knowledge_graph is not None
+
+
+# Self-registration at import time
+register_skill(YourSkill())
+```
+
361
+### Registering in `__init__.py`
362
+
363
+Add the import to `video_processor/agent/skills/__init__.py` so the skill is loaded (and self-registered) when the skills package is imported:
364
+
365
+```python
366
+from video_processor.agent.skills import (
367
+ ...
368
+ your_skill, # noqa: F401
369
+)
370
+```
371
+
372
+## Adding a new document processor
373
+
374
+Document processors extend the `DocumentProcessor` ABC from `video_processor/processors/base.py` and are registered via `register_processor()`.
375
+
376
+1. Create `video_processor/processors/your_processor.py`
377
+2. Extend `DocumentProcessor`
378
+3. Set `supported_extensions` class attribute
379
+4. Implement `process()` (returns `List[DocumentChunk]`) and `can_process()`
380
+5. Call `register_processor()` at module level
381
+6. Add the import to `video_processor/processors/__init__.py`
382
+7. Add tests
383
+
384
+### Example processor skeleton
385
+
386
+```python
387
+"""Your document processor."""
388
+
389
+from pathlib import Path
390
+from typing import List
391
+
392
+from video_processor.processors.base import (
393
+ DocumentChunk,
394
+ DocumentProcessor,
395
+ register_processor,
396
+)
397
+
398
+
399
+class YourProcessor(DocumentProcessor):
400
+ supported_extensions = [".xyz", ".abc"]
401
+
402
+ def can_process(self, path: Path) -> bool:
403
+ return path.suffix.lower() in self.supported_extensions
404
+
405
+ def process(self, path: Path) -> List[DocumentChunk]:
406
+ text = path.read_text()
407
+ # Split into chunks as appropriate for your format
408
+ return [
409
+ DocumentChunk(
410
+ text=text,
411
+ source_file=str(path),
412
+ chunk_index=0,
413
+ metadata={"format": "xyz"},
414
+ )
415
+ ]
416
+
417
+
418
+# Self-registration at import time
419
+register_processor([".xyz", ".abc"], YourProcessor)
420
+```
421
+
422
+### Registering in `__init__.py`
423
+
424
+Add the import to `video_processor/processors/__init__.py`:
425
+
426
+```python
427
+from video_processor.processors import (
428
+ markdown_processor, # noqa: F401, E402
429
+ pdf_processor, # noqa: F401, E402
430
+ your_processor, # noqa: F401, E402
431
+)
432
+```
433
+
434
+## Adding a new exporter
435
+
436
+Exporters live in `video_processor/exporters/` and are typically called from CLI commands. There is no strict ABC for exporters -- they are plain functions that accept knowledge graph data and an output directory.
437
+
438
+1. Create `video_processor/exporters/your_exporter.py`
439
+2. Implement one or more export functions that accept KG data (as a dict) and an output path
440
+3. Add CLI integration in `video_processor/cli/commands.py` under the `export` group
441
+4. Add tests
442
+
443
+### Example exporter skeleton
444
+
445
+```python
446
+"""Your exporter."""
447
+
448
+import json
449
+from pathlib import Path
450
+from typing import List
451
+
452
+
453
+def export_your_format(kg_data: dict, output_dir: Path) -> List[Path]:
454
+ """Export knowledge graph data in your format.
455
+
456
+ Args:
457
+ kg_data: Knowledge graph as a dict (from KnowledgeGraph.to_dict()).
458
+ output_dir: Directory to write output files.
459
+
460
+ Returns:
461
+ List of created file paths.
462
+ """
463
+ output_dir.mkdir(parents=True, exist_ok=True)
464
+ created = []
465
+
466
+ output_file = output_dir / "export.xyz"
467
+ output_file.write_text(json.dumps(kg_data, indent=2))
468
+ created.append(output_file)
469
+
470
+ return created
471
+```
472
+
473
+### Adding the CLI command
474
+
475
+Add a subcommand under the `export` group in `video_processor/cli/commands.py`:
476
+
477
+```python
478
+@export.command("your-format")
479
+@click.argument("db_path", type=click.Path(exists=True))
480
+@click.option("-o", "--output", type=click.Path(), default=None)
481
+def export_your_format_cmd(db_path, output):
482
+ """Export knowledge graph in your format."""
483
+ from video_processor.exporters.your_exporter import export_your_format
484
+ from video_processor.integrators.knowledge_graph import KnowledgeGraph
485
+
486
+ kg = KnowledgeGraph(db_path=Path(db_path))
487
+ out_dir = Path(output) if output else Path.cwd() / "your-export"
488
+ created = export_your_format(kg.to_dict(), out_dir)
489
+ click.echo(f"Exported {len(created)} files to {out_dir}/")
490
+```
60491
61492
## License
62493
63
-MIT License — Copyright (c) 2025 CONFLICT LLC. All rights reserved.
494
+MIT License -- Copyright (c) 2026 CONFLICT LLC. All rights reserved.
64495
65496
ADDED docs/faq.md
--- a/docs/faq.md
+++ b/docs/faq.md
@@ -0,0 +1,301 @@
+# FAQ & Troubleshooting
+
+## Frequently Asked Questions
+
+### Do I need an API key?
+
+You need at least one of:
+
+- **Cloud API key**: `OPENAI_API_KEY`, `ANTHROPIC_API_KEY`, or `GEMINI_API_KEY`
+- **Local Ollama**: Install [Ollama](https://ollama.com), pull a model, and run `ollama serve`
+
+Some features work without any AI provider:
+
+- `planopticon query stats` — direct knowledge graph queries
+- `planopticon query "entities --type person"` — structured entity lookups
+- `planopticon export markdown` — document generation from existing KG (7 document types, no LLM)
+- `planopticon kg inspect` — knowledge graph statistics
+- `planopticon kg convert` — format conversion
+
+### How much does it cost?
+
+PlanOpticon defaults to cheap models to minimize costs:
+
+| Task | Default model | Approximate cost |
+|------|--------------|-----------------|
+| Chat/analysis | Claude Haiku / GPT-4o-mini | ~$0.25-0.50 per 1M tokens |
+| Vision (diagrams) | Gemini Flash / GPT-4o-mini | ~$0.10-0.50 per 1M tokens |
+| Transcription | Local Whisper (free) / Whisper-1 | $0.006/minute |
+
+A typical 1-hour meeting costs roughly $0.05-0.15 to process with default models. Use `--provider ollama` for zero cost.
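A rough back-of-envelope check shows how that band arises; every figure below is an illustrative assumption, not a measured PlanOpticon number:

```python
# Assumed figures: ~9,000 spoken words per hour, ~1.33 tokens per word,
# three analysis passes over the transcript, and Haiku-class pricing
# of $0.50 per 1M input tokens.
words_per_hour = 9_000
tokens = int(words_per_hour * 4 / 3)          # ~12,000 transcript tokens
passes = 3                                    # e.g. summary, entities, actions
price_per_million = 0.50                      # dollars per 1M input tokens

input_cost = tokens * passes * price_per_million / 1_000_000
print(f"~${input_cost:.3f} for input tokens")  # ~$0.018
```

Output tokens, vision calls on extracted frames, and (if used) API transcription add the remainder, which is how a full run lands in the quoted $0.05-0.15 range.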
+
+### Can I run fully offline?
+
+Yes. Install Ollama and local Whisper:
+
+```bash
+ollama pull llama3.2
+ollama pull llava
+pip install planopticon[gpu]
+planopticon analyze -i video.mp4 -o ./output --provider ollama
+```
+
+No data leaves your machine.
+
+### What video formats are supported?
+
+Any format FFmpeg can decode:
+
+- MP4, MKV, AVI, MOV, WebM, FLV, WMV, M4V
+- Container formats with common codecs (H.264, H.265, VP8, VP9, AV1)
+
+### What document formats can I ingest?
+
+- **PDF** — text extraction via pymupdf or pdfplumber
+- **Markdown** — parsed with heading-based chunking
+- **Plain text** — paragraph-based chunking with overlap
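Paragraph-based chunking with overlap can be sketched as greedy packing; the size limit and one-paragraph overlap below are illustrative assumptions, not PlanOpticon's actual parameters:

```python
def chunk_paragraphs(text: str, max_chars: int = 400, overlap: int = 1) -> list[str]:
    """Pack paragraphs into chunks up to max_chars; each new chunk
    re-includes the last `overlap` paragraphs for context."""
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    chunks: list[str] = []
    current: list[str] = []
    for para in paragraphs:
        # Flush when adding this paragraph would exceed the budget
        if current and len("\n\n".join(current + [para])) > max_chars:
            chunks.append("\n\n".join(current))
            current = current[-overlap:]  # carry trailing context forward
        current.append(para)
    if current:
        chunks.append("\n\n".join(current))
    return chunks
```

The overlap means adjacent chunks share a paragraph, so entity mentions that straddle a boundary still appear intact in at least one chunk.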
+
+### How does the knowledge graph work?
+
+PlanOpticon extracts entities (people, technologies, concepts, decisions) and relationships from your content. These are stored in a SQLite database (`knowledge_graph.db`) with zero external dependencies. Entities are automatically classified using a planning taxonomy (goals, requirements, risks, tasks, milestones).
+
+When you process multiple sources, entities are merged using fuzzy name matching (0.85 threshold) with type conflict resolution and provenance tracking.
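The stated 0.85 `SequenceMatcher` threshold can be reproduced with the standard library, which is handy for predicting whether two entity names will merge (a sketch; any name normalization PlanOpticon applies beyond lowercasing is not shown):

```python
from difflib import SequenceMatcher

def names_match(a: str, b: str, threshold: float = 0.85) -> bool:
    """True when two entity names are similar enough to merge."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold

print(names_match("PostgreSQL", "Postgres"))  # True  (ratio ~0.89: merged)
print(names_match("PostgreSQL", "MySQL"))     # False (ratio 0.40: kept separate)
```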
+
+### Can I use PlanOpticon with my existing Obsidian vault?
+
+Yes, in both directions:
+
+```bash
+# Ingest an Obsidian vault into PlanOpticon
+planopticon ingest ~/Obsidian/MyVault --output ./kb --recursive
+
+# Export PlanOpticon knowledge to an Obsidian vault
+planopticon export obsidian --input ./kb --output ~/Obsidian/PlanOpticon
+```
+
+The Obsidian export produces proper YAML frontmatter, wiki-links (`[[Entity Name]]`), and tag pages.
+
+### How do I add my own AI provider?
+
+Create a provider module, extend `BaseProvider`, and register it:
+
+```python
+from video_processor.providers.base import BaseProvider, ProviderRegistry
+
+class MyProvider(BaseProvider):
+    provider_name = "myprovider"
+
+    def chat(self, messages, max_tokens=4096, temperature=0.7, model=None):
+        # Your implementation
+        ...
+
+ProviderRegistry.register(
+    name="myprovider",
+    provider_class=MyProvider,
+    env_var="MY_PROVIDER_API_KEY",
+    model_prefixes=["my-"],
+    default_models={"chat": "my-model-v1", "vision": "", "audio": ""},
+)
+```
+
+See the [Contributing guide](contributing.md) for details.
+
+---
+
+## Troubleshooting
+
+### Authentication errors
+
+#### "No auth method available for zoom"
+
+You need to set credentials before authenticating:
+
+```bash
+export ZOOM_CLIENT_ID="your-client-id"
+export ZOOM_CLIENT_SECRET="your-client-secret"
+planopticon auth zoom
+```
+
+The error message tells you which environment variables to set. Each service requires different credentials — see the [Authentication guide](guide/authentication.md).
+
+#### "Token expired" or "401 Unauthorized"
+
+Your saved token has expired and auto-refresh failed. Re-authenticate:
+
+```bash
+planopticon auth google  # or whatever service
+```
+
+To clear a stale token:
+
+```bash
+planopticon auth google --logout
+planopticon auth google
+```
+
+Tokens are stored in `~/.planopticon/{service}_token.json`.
+
+#### OAuth redirect errors
+
+If the browser-based OAuth flow fails, check:
+
+1. Your client ID and secret are correct
+2. The redirect URI in your OAuth app matches PlanOpticon's default (`urn:ietf:wg:oauth:2.0:oob`)
+3. The OAuth app has the required scopes enabled
+
+### Provider errors
+
+#### "ANTHROPIC_API_KEY not set"
+
+Set at least one provider's API key:
+
+```bash
+export OPENAI_API_KEY="sk-..."
+# or
+export ANTHROPIC_API_KEY="sk-ant-..."
+# or
+export GEMINI_API_KEY="AI..."
+```
+
+Or use a `.env` file in your project directory.
+
+#### "Unexpected role system" (Anthropic)
+
+This was a bug in older versions where system messages were passed in the messages array instead of as a top-level parameter. Update to v0.4.0 or later.
+
+#### "Model not found" or "Invalid model"
+
+Check available models:
+
+```bash
+planopticon list-models
+```
+
+Common model name issues:
+
+- Anthropic: use `claude-haiku-4-5-20251001`, not `claude-haiku`
+- OpenAI: use `gpt-4o-mini`, not `gpt4o-mini`
+
+#### Rate limiting / 429 errors
+
+PlanOpticon doesn't currently implement automatic retry. If you hit rate limits:
+
+1. Use a different provider: `--provider gemini`
+2. Use cheaper/faster models: `--chat-model gpt-4o-mini`
+3. Reduce processing depth: `--depth basic`
+4. Use Ollama for zero rate limits: `--provider ollama`
+
+### Processing errors
+
+#### "FFmpeg not found"
+
+Install FFmpeg:
+
+```bash
+# macOS
+brew install ffmpeg
+
+# Ubuntu/Debian
+sudo apt-get install ffmpeg libsndfile1
+
+# Windows
+# Download from https://ffmpeg.org/download.html and add to PATH
+```
+
+#### "Audio extraction failed: no audio track found"
+
+The video file has no audio track. PlanOpticon will skip transcription and continue with frame analysis only.
+
+#### "Frame extraction memory error"
+
+For very long videos, frame extraction can use significant memory. Use the `--max-memory-mb` safety valve:
+
+```bash
+planopticon analyze -i long-video.mp4 -o ./output --max-memory-mb 2048
+```
+
+Or reduce the sampling rate:
+
+```bash
+planopticon analyze -i long-video.mp4 -o ./output --sampling-rate 0.25
+```
+
+#### Batch processing — one video fails
+
+Individual video failures don't stop the batch. Failed videos are logged in the batch manifest with error details. Check `batch_manifest.json` for the specific error.
+
+### Knowledge graph issues
+
+#### "No knowledge graph loaded" in companion
+
+The companion auto-discovers knowledge graphs by looking for `knowledge_graph.db` or `knowledge_graph.json` in the current directory and parent directories. Either:
+
+1. `cd` to the directory containing your knowledge graph
+2. Specify the path explicitly: `planopticon companion --kb ./path/to/kb`
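The auto-discovery walk amounts to an upward search from the working directory; a simplified sketch of the behavior described above (the companion's actual search order may differ):

```python
from pathlib import Path
from typing import Optional

def find_knowledge_graph(start: Path) -> Optional[Path]:
    """Return the first knowledge_graph.db/.json found walking upward."""
    for directory in (start, *start.parents):
        for name in ("knowledge_graph.db", "knowledge_graph.json"):
            candidate = directory / name
            if candidate.is_file():
                return candidate  # nearest match wins
    return None
```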
234
+
#### Empty or sparse knowledge graph

Common causes:

1. **Too few entities extracted**: Try `--depth comprehensive` for deeper analysis
2. **Short or low-quality transcript**: Check `transcript/transcript.txt` — poor audio produces poor transcription
3. **Wrong provider**: Some models extract entities better than others. Try `--provider openai --chat-model gpt-4o` for higher quality

#### Duplicate entities after merge

The fuzzy matching threshold is 0.85 (SequenceMatcher ratio). If you're getting duplicates, the names are too different for automatic matching. You can manually inspect and merge:

```bash
planopticon kg inspect ./knowledge_graph.db
planopticon query "entities --name python"
```

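You can check how close two names score under the same metric with the stdlib `SequenceMatcher` (PlanOpticon's merge may also normalize names first; this sketch only lowercases before comparing):

```python
from difflib import SequenceMatcher

def name_similarity(a, b):
    """Ratio in [0, 1]; entities merge when the score reaches the 0.85 threshold."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()
```

For example, "Kubernetes" vs "kubernetes" scores 1.0 after lowercasing, while an abbreviation like "K8s" vs "Kubernetes" falls well below 0.85, so those remain separate entities.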
### Companion / REPL issues

#### Chat gives generic advice instead of project-specific answers

The companion needs both a knowledge graph and an LLM provider. Check:

```
planopticon> /status
```

If it says "KG: not loaded" or "Provider: none", fix those first:

```
planopticon> /provider openai
planopticon> /model gpt-4o-mini
```

#### Companion is slow

The companion makes LLM API calls for chat messages. To speed things up:

1. Use a faster model: `/model gpt-4o-mini` or `/model claude-haiku-4-5-20251001`
2. Use direct queries instead of chat: `/entities`, `/search`, `/neighbors` don't need an LLM
3. Use Ollama locally for lower latency: `/provider ollama`

### Export issues

#### Obsidian export has broken links

Make sure your Obsidian vault has wiki-links enabled (Settings > Files & Links > Use [[Wikilinks]]). PlanOpticon exports use wiki-link syntax by default.

#### PDF export fails

PDF export requires the `pdf` extra:

```bash
pip install planopticon[pdf]
```

This installs WeasyPrint, which has system dependencies. On macOS:

```bash
brew install pango
```

On Ubuntu:

```bash
sudo apt-get install libpango1.0-dev
```
--- docs/getting-started/configuration.md
+++ docs/getting-started/configuration.md
@@ -1,45 +1,150 @@
# Configuration

-## Environment variables
+## Example `.env` file
+
+Create a `.env` file in your project directory. PlanOpticon loads it automatically.
+
+```bash
+# =============================================================================
+# PlanOpticon Configuration
+# =============================================================================
+# Copy this file to .env and fill in the values you need.
+# You only need ONE AI provider — PlanOpticon auto-detects which are available.
+
+# --- AI Providers (set at least one) ----------------------------------------
+
+# OpenAI — get your key at https://platform.openai.com/api-keys
+OPENAI_API_KEY=sk-...
+
+# Anthropic — get your key at https://console.anthropic.com/settings/keys
+ANTHROPIC_API_KEY=sk-ant-...
+
+# Google Gemini — get your key at https://aistudio.google.com/apikey
+GEMINI_API_KEY=AI...
+
+# Azure OpenAI — from your Azure portal deployment
+# AZURE_OPENAI_API_KEY=...
+# AZURE_OPENAI_ENDPOINT=https://your-resource.openai.azure.com/
+
+# Together AI — https://api.together.xyz/settings/api-keys
+# TOGETHER_API_KEY=...
+
+# Fireworks AI — https://fireworks.ai/account/api-keys
+# FIREWORKS_API_KEY=...
+
+# Cerebras — https://cloud.cerebras.ai/
+# CEREBRAS_API_KEY=...
+
+# xAI (Grok) — https://console.x.ai/
+# XAI_API_KEY=...
+
+# Ollama (local, no key needed) — just run: ollama serve
+# OLLAMA_HOST=http://localhost:11434
+
+# --- Google (Drive, Docs, Sheets, Meet, YouTube) ----------------------------
+# Option A: OAuth (interactive, recommended for personal use)
+# Create credentials at https://console.cloud.google.com/apis/credentials
+# 1. Create an OAuth 2.0 Client ID (Desktop application)
+# 2. Enable these APIs: Google Drive API, Google Docs API
+GOOGLE_CLIENT_ID=123456789-abc.apps.googleusercontent.com
+GOOGLE_CLIENT_SECRET=GOCSPX-...
+
+# Option B: Service Account (automated/server-side)
+# GOOGLE_APPLICATION_CREDENTIALS=/path/to/service-account.json
+
+# --- Zoom (recordings) ------------------------------------------------------
+# Create an OAuth app at https://marketplace.zoom.us/develop/create
+# App type: "General App" with OAuth
+# Scopes: cloud_recording:read:list_user_recordings, cloud_recording:read:recording
+ZOOM_CLIENT_ID=...
+ZOOM_CLIENT_SECRET=...
+# For Server-to-Server (no browser needed):
+# ZOOM_ACCOUNT_ID=...
+
+# --- Microsoft 365 (OneDrive, SharePoint, Teams) ----------------------------
+# Register an app at https://portal.azure.com/#view/Microsoft_AAD_RegisteredApps
+# API permissions: OnlineMeetings.Read, Files.Read (delegated)
+MICROSOFT_CLIENT_ID=...
+MICROSOFT_CLIENT_SECRET=...
+
+# --- Notion ------------------------------------------------------------------
+# Option A: OAuth (create integration at https://www.notion.so/my-integrations)
+# NOTION_CLIENT_ID=...
+# NOTION_CLIENT_SECRET=...
+
+# Option B: API key (simpler, from the same integrations page)
+NOTION_API_KEY=secret_...
+
+# --- GitHub ------------------------------------------------------------------
+# Option A: Personal Access Token (simplest)
+# Create at https://github.com/settings/tokens — needs 'repo' scope
+GITHUB_TOKEN=ghp_...
+
+# Option B: OAuth App (for CI/automation)
+# GITHUB_CLIENT_ID=...
+# GITHUB_CLIENT_SECRET=...
+
+# --- Dropbox -----------------------------------------------------------------
+# Create an app at https://www.dropbox.com/developers/apps
+# DROPBOX_APP_KEY=...
+# DROPBOX_APP_SECRET=...
+# Or use a long-lived access token:
+# DROPBOX_ACCESS_TOKEN=...
+
+# --- General -----------------------------------------------------------------
+# CACHE_DIR=~/.cache/planopticon
+```
+
+## Environment variables reference

### AI providers

-| Variable | Description |
-|----------|-------------|
-| `OPENAI_API_KEY` | OpenAI API key |
-| `ANTHROPIC_API_KEY` | Anthropic API key |
-| `GEMINI_API_KEY` | Google Gemini API key |
-| `AZURE_OPENAI_API_KEY` | Azure OpenAI API key |
-| `AZURE_OPENAI_ENDPOINT` | Azure OpenAI endpoint URL |
-| `TOGETHER_API_KEY` | Together AI API key |
-| `FIREWORKS_API_KEY` | Fireworks AI API key |
-| `CEREBRAS_API_KEY` | Cerebras API key |
-| `XAI_API_KEY` | xAI (Grok) API key |
-| `OLLAMA_HOST` | Ollama server URL (default: `http://localhost:11434`) |
+| Variable | Required | Where to get it |
+|----------|----------|----------------|
+| `OPENAI_API_KEY` | At least one provider | [platform.openai.com/api-keys](https://platform.openai.com/api-keys) |
+| `ANTHROPIC_API_KEY` | At least one provider | [console.anthropic.com](https://console.anthropic.com/settings/keys) |
+| `GEMINI_API_KEY` | At least one provider | [aistudio.google.com/apikey](https://aistudio.google.com/apikey) |
+| `AZURE_OPENAI_API_KEY` | Optional | Azure portal > your OpenAI resource |
+| `AZURE_OPENAI_ENDPOINT` | With Azure | Azure portal > your OpenAI resource |
+| `TOGETHER_API_KEY` | Optional | [api.together.xyz](https://api.together.xyz/settings/api-keys) |
+| `FIREWORKS_API_KEY` | Optional | [fireworks.ai](https://fireworks.ai/account/api-keys) |
+| `CEREBRAS_API_KEY` | Optional | [cloud.cerebras.ai](https://cloud.cerebras.ai/) |
+| `XAI_API_KEY` | Optional | [console.x.ai](https://console.x.ai/) |
+| `OLLAMA_HOST` | Optional | Default: `http://localhost:11434` |

### Cloud services

-| Variable | Description |
-|----------|-------------|
-| `GOOGLE_APPLICATION_CREDENTIALS` | Path to Google service account JSON (for server-side Drive access) |
-| `ZOOM_CLIENT_ID` | Zoom OAuth app client ID |
-| `ZOOM_CLIENT_SECRET` | Zoom OAuth app client secret |
-| `NOTION_API_KEY` | Notion integration token |
-| `GITHUB_TOKEN` | GitHub personal access token |
-| `MICROSOFT_CLIENT_ID` | Azure AD app client ID (for Microsoft 365) |
-| `MICROSOFT_CLIENT_SECRET` | Azure AD app client secret |
+| Variable | Service | Auth method |
+|----------|---------|-------------|
+| `GOOGLE_CLIENT_ID` | Google (Drive, Docs, Meet) | OAuth |
+| `GOOGLE_CLIENT_SECRET` | Google | OAuth |
+| `GOOGLE_APPLICATION_CREDENTIALS` | Google | Service account |
+| `ZOOM_CLIENT_ID` | Zoom | OAuth |
+| `ZOOM_CLIENT_SECRET` | Zoom | OAuth |
+| `ZOOM_ACCOUNT_ID` | Zoom | Server-to-Server |
+| `MICROSOFT_CLIENT_ID` | Microsoft 365 | OAuth |
+| `MICROSOFT_CLIENT_SECRET` | Microsoft 365 | OAuth |
+| `NOTION_CLIENT_ID` | Notion | OAuth |
+| `NOTION_CLIENT_SECRET` | Notion | OAuth |
+| `NOTION_API_KEY` | Notion | API key |
+| `GITHUB_CLIENT_ID` | GitHub | OAuth |
+| `GITHUB_CLIENT_SECRET` | GitHub | OAuth |
+| `GITHUB_TOKEN` | GitHub | API key |
+| `DROPBOX_APP_KEY` | Dropbox | OAuth |
+| `DROPBOX_APP_SECRET` | Dropbox | OAuth |
+| `DROPBOX_ACCESS_TOKEN` | Dropbox | API key |

### General

| Variable | Description |
|----------|-------------|
| `CACHE_DIR` | Directory for API response caching |

## Authentication

-Most cloud services use OAuth via the `planopticon auth` command. Run it once per service to store credentials locally:
+PlanOpticon uses OAuth for cloud services. Run `planopticon auth` once per service — tokens are saved locally and refreshed automatically.

```bash
planopticon auth google     # Google Drive, Docs, Meet, YouTube
planopticon auth dropbox    # Dropbox
planopticon auth zoom       # Zoom recordings
@@ -46,13 +151,24 @@
planopticon auth notion     # Notion pages
planopticon auth github     # GitHub repos and wikis
planopticon auth microsoft  # OneDrive, SharePoint, Teams
```

-Credentials are stored in `~/.config/planopticon/`. Use `planopticon auth SERVICE --logout` to remove them.
+Credentials are stored in `~/.planopticon/`. Use `planopticon auth SERVICE --logout` to remove them.
+
+### What each service needs
+
+| Service | Minimum setup | Full OAuth setup |
+|---------|--------------|-----------------|
+| Google | `GOOGLE_CLIENT_ID` + `GOOGLE_CLIENT_SECRET` | Create OAuth credentials in [Google Cloud Console](https://console.cloud.google.com/apis/credentials) |
+| Zoom | `ZOOM_CLIENT_ID` + `ZOOM_CLIENT_SECRET` | Create a General App at [marketplace.zoom.us](https://marketplace.zoom.us/develop/create) |
+| Microsoft | `MICROSOFT_CLIENT_ID` + `MICROSOFT_CLIENT_SECRET` | Register app in [Azure AD](https://portal.azure.com/#view/Microsoft_AAD_RegisteredApps) |
+| Notion | `NOTION_API_KEY` (simplest) | Create integration at [notion.so/my-integrations](https://www.notion.so/my-integrations) |
+| GitHub | `GITHUB_TOKEN` (simplest) | Create token at [github.com/settings/tokens](https://github.com/settings/tokens) |
+| Dropbox | `DROPBOX_APP_KEY` + `DROPBOX_APP_SECRET` | Create app at [dropbox.com/developers](https://www.dropbox.com/developers/apps) |

-For Zoom and Microsoft 365, you also need to set the client ID and secret environment variables before running `planopticon auth`.
+For detailed OAuth app creation walkthroughs, see the [Authentication guide](../guide/authentication.md).

## Provider routing

PlanOpticon auto-discovers available models and routes each task to the cheapest capable option:
ADDED docs/guide/authentication.md
--- a/docs/guide/authentication.md
+++ b/docs/guide/authentication.md
@@ -0,0 +1,525 @@
+# Authentication
+
+PlanOpticon uses a unified authentication system to connect with cloud services for fetching recordings, documents, and other content. The system is **OAuth-first**: it prefers OAuth 2.0 flows for security and token management, but falls back to API keys when OAuth is not configured.
+
+## Auth strategy overview
+
+PlanOpticon supports six cloud services out of the box: Google, Dropbox, Zoom, Notion, GitHub, and Microsoft. Each service uses the same authentication chain, implemented through the `OAuthManager` class. You configure credentials once (via environment variables or directly), and PlanOpticon handles token acquisition, storage, refresh, and fallback automatically.
+
+All authentication state is managed through the `planopticon auth` CLI command, the `/auth` companion REPL command, or programmatically via the Python API.
+
+## The auth chain
+
+When you authenticate with a service, PlanOpticon tries the following methods in order. It stops at the first one that succeeds:
+
+1. **Saved token** -- Checks `~/.planopticon/{service}_token.json` for a previously saved token. If the token has not expired, it is used immediately. If it has expired but a refresh token is available, PlanOpticon attempts an automatic token refresh.
+
+2. **Client Credentials grant** (Server-to-Server) -- If an `account_id` is configured (e.g., `ZOOM_ACCOUNT_ID`), PlanOpticon attempts a client credentials grant. This is a non-interactive flow suitable for automated pipelines and server-side integrations. No browser is required.
+
+3. **OAuth 2.0 Authorization Code with PKCE** (interactive) -- If a client ID is configured and OAuth endpoints are available, PlanOpticon initiates an interactive OAuth PKCE flow. It opens a browser to the service's authorization page, waits for you to paste the authorization code, and exchanges it for tokens. The tokens are saved for future use.
+
+4. **API key fallback** -- If no OAuth method succeeds, PlanOpticon checks for a service-specific API key environment variable (e.g., `GITHUB_TOKEN`, `NOTION_API_KEY`). This is the simplest setup but may have reduced capabilities compared to OAuth.
+
+If none of the four methods succeed, PlanOpticon returns an error with hints about which environment variables to set.
+
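The fallback order above can be sketched as a simple loop. This is an illustrative sketch only; the helper names here are hypothetical stand-ins, not PlanOpticon's internal API:

```python
# Illustrative sketch of the four-step fallback chain described above.
# Helper names are hypothetical; the real logic lives in OAuthManager.
import os
from typing import Callable, Optional


def try_saved_token(service: str) -> Optional[str]:
    return None  # stub: would read ~/.planopticon/{service}_token.json


def try_client_credentials(service: str) -> Optional[str]:
    return None  # stub: would run a client credentials grant if account_id is set


def try_oauth_pkce(service: str) -> Optional[str]:
    return None  # stub: would open a browser and run the PKCE flow


def try_api_key(service: str) -> Optional[str]:
    # e.g. GITHUB_TOKEN for service "github" (naming varies per service)
    return os.environ.get(f"{service.upper()}_TOKEN")


def authenticate(service: str) -> str:
    chain = [try_saved_token, try_client_credentials, try_oauth_pkce, try_api_key]
    for method in chain:
        token = method(service)
        if token:  # first success wins
            return token
    raise RuntimeError(f"No auth method available for {service}")
```

The point of the ordering is cost: cached tokens are free, non-interactive grants need no human, and the interactive browser flow runs only when nothing cheaper works.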
+## Token storage
+
+Tokens are persisted as JSON files in `~/.planopticon/`:
+
+```
+~/.planopticon/
+    google_token.json
+    dropbox_token.json
+    zoom_token.json
+    notion_token.json
+    github_token.json
+    microsoft_token.json
+```
+
+Each token file contains:
+
+| Field | Description |
+|-------|-------------|
+| `access_token` | The current access token |
+| `refresh_token` | Refresh token for automatic renewal (if provided by the service) |
+| `expires_at` | Unix timestamp when the token expires (with a 60-second safety margin) |
+| `client_id` | The client ID used for this token (for refresh) |
+| `client_secret` | The client secret used (for refresh) |
+
+The `~/.planopticon/` directory is created automatically on first use. Token files are overwritten on each successful authentication or refresh.
+
+To remove a saved token, use `planopticon auth <service> --logout` or delete the file directly.
+
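The expiry check implied by the `expires_at` field can be sketched as follows. This is a minimal sketch assuming only the JSON layout in the table above, not PlanOpticon's actual loader:

```python
# Minimal sketch of validating a saved token file, assuming the JSON layout
# described above and a 60-second safety margin on expires_at.
import json
import time
from pathlib import Path
from typing import Optional

SAFETY_MARGIN = 60  # seconds


def load_valid_token(path: Path) -> Optional[str]:
    """Return the saved access token if it is still valid, else None."""
    if not path.exists():
        return None
    data = json.loads(path.read_text())
    # Treat the token as expired SAFETY_MARGIN seconds early so a request
    # issued right at the boundary does not fail with a 401.
    if data.get("expires_at", 0) - SAFETY_MARGIN > time.time():
        return data["access_token"]
    return None  # expired or near expiry: caller should refresh or re-auth
```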
+## Supported services
+
+### Google
+
+Google authentication provides access to Google Drive and Google Docs for fetching documents, recordings, and other content.
+
+**Scopes requested:**
+
+- `https://www.googleapis.com/auth/drive.readonly`
+- `https://www.googleapis.com/auth/documents.readonly`
+
+**Environment variables:**
+
+| Variable | Required | Description |
+|----------|----------|-------------|
+| `GOOGLE_CLIENT_ID` | For OAuth | OAuth 2.0 Client ID from Google Cloud Console |
+| `GOOGLE_CLIENT_SECRET` | For OAuth | OAuth 2.0 Client Secret |
+| `GOOGLE_API_KEY` | Fallback | API key (limited access, no user-specific data) |
+
+**OAuth app setup:**
+
+1. Go to the [Google Cloud Console](https://console.cloud.google.com/).
+2. Create a project (or select an existing one).
+3. Navigate to **APIs & Services > Credentials**.
+4. Click **Create Credentials > OAuth client ID**.
+5. Choose **Desktop app** as the application type.
+6. Copy the Client ID and Client Secret.
+7. Under **APIs & Services > Library**, enable the **Google Drive API** and **Google Docs API**.
+8. Set the environment variables:
+
+```bash
+export GOOGLE_CLIENT_ID="your-client-id.apps.googleusercontent.com"
+export GOOGLE_CLIENT_SECRET="your-client-secret"
+```
+
+**Service account fallback:** For automated pipelines, you can use a Google service account instead of OAuth. Generate a service account key JSON file from the Google Cloud Console and set `GOOGLE_APPLICATION_CREDENTIALS` to point to it. The PlanOpticon Google Workspace connector (`planopticon gws`) uses the `gws` CLI, which has its own auth flow via `gws auth login`.
+
+### Dropbox
+
+Dropbox authentication provides access to files stored in Dropbox.
+
+**Environment variables:**
+
+| Variable | Required | Description |
+|----------|----------|-------------|
+| `DROPBOX_APP_KEY` | For OAuth | App key from the Dropbox App Console |
+| `DROPBOX_APP_SECRET` | For OAuth | App secret |
+| `DROPBOX_ACCESS_TOKEN` | Fallback | Long-lived access token (for quick setup) |
+
+**OAuth app setup:**
+
+1. Go to the [Dropbox App Console](https://www.dropbox.com/developers/apps).
+2. Click **Create App**.
+3. Choose **Scoped access** and **Full Dropbox** (or **App folder** for restricted access).
+4. Copy the App key and App secret from the **Settings** tab.
+5. Set the environment variables:
+
+```bash
+export DROPBOX_APP_KEY="your-app-key"
+export DROPBOX_APP_SECRET="your-app-secret"
+```
+
+**Access token shortcut:** For quick testing, you can generate an access token directly from the app's Settings page in the Dropbox App Console and set it as `DROPBOX_ACCESS_TOKEN`. This bypasses OAuth entirely but the token may have a limited lifetime.
+
+### Zoom
+
+Zoom authentication provides access to cloud recordings, meeting metadata, and transcripts.
+
+**Environment variables:**
+
+| Variable | Required | Description |
+|----------|----------|-------------|
+| `ZOOM_CLIENT_ID` | For OAuth | OAuth client ID from the Zoom Marketplace |
+| `ZOOM_CLIENT_SECRET` | For OAuth | OAuth client secret |
+| `ZOOM_ACCOUNT_ID` | For S2S | Account ID for Server-to-Server OAuth |
+
+**Server-to-Server (recommended for automation):**
+
+When `ZOOM_ACCOUNT_ID` is set alongside `ZOOM_CLIENT_ID` and `ZOOM_CLIENT_SECRET`, PlanOpticon uses the client credentials grant (Server-to-Server OAuth). This is non-interactive and ideal for CI/CD pipelines and scheduled jobs.
+
+1. Go to the [Zoom Marketplace](https://marketplace.zoom.us/).
+2. Click **Develop > Build App**.
+3. Choose **Server-to-Server OAuth**.
+4. Copy the Account ID, Client ID, and Client Secret.
+5. Add the required scopes: `recording:read:admin` (or `recording:read`).
+6. Set the environment variables:
+
+```bash
+export ZOOM_CLIENT_ID="your-client-id"
+export ZOOM_CLIENT_SECRET="your-client-secret"
+export ZOOM_ACCOUNT_ID="your-account-id"
+```
+
+**User-level OAuth PKCE:**
+
+If `ZOOM_ACCOUNT_ID` is not set, PlanOpticon falls back to the interactive OAuth PKCE flow. This opens a browser window for the user to authorize access.
+
+1. In the Zoom Marketplace, create a **General App** (or **OAuth** app).
+2. Set the redirect URI to `urn:ietf:wg:oauth:2.0:oob` (out-of-band).
+3. Copy the Client ID and Client Secret.
+
+### Notion
+
+Notion authentication provides access to pages, databases, and content in your Notion workspace.
+
+**Environment variables:**
+
+| Variable | Required | Description |
+|----------|----------|-------------|
+| `NOTION_CLIENT_ID` | For OAuth | OAuth client ID from the Notion Integrations page |
+| `NOTION_CLIENT_SECRET` | For OAuth | OAuth client secret |
+| `NOTION_API_KEY` | Fallback | Internal integration token |
+
+**OAuth app setup:**
+
+1. Go to [My Integrations](https://www.notion.so/my-integrations) in Notion.
+2. Click **New integration**.
+3. Select **Public integration** (required for OAuth).
+4. Copy the OAuth Client ID and Client Secret.
+5. Set the redirect URI.
+6. Set the environment variables:
+
+```bash
+export NOTION_CLIENT_ID="your-client-id"
+export NOTION_CLIENT_SECRET="your-client-secret"
+```
+
+**Internal integration (API key fallback):**
+
+For simpler setups, create an **Internal integration** from the Notion Integrations page. Copy the integration token and set it as `NOTION_API_KEY`. You must also share the relevant Notion pages/databases with the integration.
+
+```bash
+export NOTION_API_KEY="ntn_your-integration-token"
+```
+
+### GitHub
+
+GitHub authentication provides access to repositories, issues, and organization data.
+
+**Scopes requested:**
+
+- `repo`
+- `read:org`
+
+**Environment variables:**
+
+| Variable | Required | Description |
+|----------|----------|-------------|
+| `GITHUB_CLIENT_ID` | For OAuth | OAuth App client ID |
+| `GITHUB_CLIENT_SECRET` | For OAuth | OAuth App client secret |
+| `GITHUB_TOKEN` | Fallback | Personal access token (classic or fine-grained) |
+
+**OAuth app setup:**
+
+1. Go to **GitHub > Settings > Developer Settings > OAuth Apps**.
+2. Click **New OAuth App**.
+3. Set the Authorization callback URL to `urn:ietf:wg:oauth:2.0:oob`.
+4. Copy the Client ID and generate a Client Secret.
+5. Set the environment variables:
+
+```bash
+export GITHUB_CLIENT_ID="your-client-id"
+export GITHUB_CLIENT_SECRET="your-client-secret"
+```
+
+**Personal access token (recommended for most users):**
+
+The simplest approach is to create a Personal Access Token:
+
+1. Go to **GitHub > Settings > Developer Settings > Personal Access Tokens**.
+2. Generate a token with `repo` and `read:org` scopes.
+3. Set it as `GITHUB_TOKEN`:
+
+```bash
+export GITHUB_TOKEN="ghp_your-token"
+```
+
+### Microsoft
+
+Microsoft authentication provides access to Microsoft 365 resources via the Microsoft Graph API, including OneDrive, SharePoint, and Teams recordings.
+
+**Scopes requested:**
+
+- `https://graph.microsoft.com/OnlineMeetings.Read`
+- `https://graph.microsoft.com/Files.Read`
+
+**Environment variables:**
+
+| Variable | Required | Description |
+|----------|----------|-------------|
+| `MICROSOFT_CLIENT_ID` | For OAuth | Application (client) ID from Azure AD |
+| `MICROSOFT_CLIENT_SECRET` | For OAuth | Client secret from Azure AD |
+
+**Azure AD app registration:**
+
+1. Go to the [Azure Portal](https://portal.azure.com/).
+2. Navigate to **Azure Active Directory > App registrations**.
+3. Click **New registration**.
+4. Name the application (e.g., "PlanOpticon").
+5. Under **Supported account types**, select the appropriate option for your organization.
+6. Set the redirect URI to `urn:ietf:wg:oauth:2.0:oob` with platform **Mobile and desktop applications**.
+7. After registration, go to **Certificates & secrets** and create a new client secret.
+8. Under **API permissions**, add:
+    - `OnlineMeetings.Read`
+    - `Files.Read`
+9. Grant admin consent if required by your organization.
+10. Set the environment variables:
+
+```bash
+export MICROSOFT_CLIENT_ID="your-application-id"
+export MICROSOFT_CLIENT_SECRET="your-client-secret"
+```
+
+**Microsoft 365 CLI:** The `planopticon m365` commands use the `@pnp/cli-microsoft365` npm package, which has its own authentication flow via `m365 login`. This is separate from the OAuth flow described above.
+
+## CLI usage
+
+### `planopticon auth`
+
+Authenticate with a cloud service or manage saved tokens.
+
+```
+planopticon auth SERVICE [--logout]
+```
+
+**Arguments:**
+
+| Argument | Description |
+|----------|-------------|
+| `SERVICE` | One of: `google`, `dropbox`, `zoom`, `notion`, `github`, `microsoft` |
+
+**Options:**
+
+| Option | Description |
+|--------|-------------|
+| `--logout` | Clear the saved token for the specified service |
+
+**Examples:**
+
+```bash
+# Authenticate with Google (triggers OAuth flow or uses saved token)
+planopticon auth google
+
+# Authenticate with Zoom
+planopticon auth zoom
+
+# Clear saved GitHub token
+planopticon auth github --logout
+```
+
+On success, the command prints the authentication method used:
+
+```
+Google authentication successful (oauth_pkce).
+```
+
+or
+
+```
+Github authentication successful (api_key).
+```
+
+### Companion REPL `/auth`
+
+Inside the interactive companion REPL (`planopticon -C` or `planopticon -I`), you can authenticate with services using the `/auth` command:
+
+```
+/auth SERVICE
+```
+
+Without arguments, `/auth` lists all available services:
+
+```
+> /auth
+Usage: /auth SERVICE
+Available: dropbox, github, google, microsoft, notion, zoom
+```
+
+With a service name, it runs the same auth chain as the CLI command:
+
+```
+> /auth github
+Github authentication successful (api_key).
+```
+
+## Environment variables reference
+
+The following table summarizes all environment variables used by the authentication system:
+
+| Service | OAuth Client ID | OAuth Client Secret | API Key / Token | Account ID |
+|---------|----------------|--------------------|--------------------|------------|
+| Google | `GOOGLE_CLIENT_ID` | `GOOGLE_CLIENT_SECRET` | `GOOGLE_API_KEY` | -- |
+| Dropbox | `DROPBOX_APP_KEY` | `DROPBOX_APP_SECRET` | `DROPBOX_ACCESS_TOKEN` | -- |
+| Zoom | `ZOOM_CLIENT_ID` | `ZOOM_CLIENT_SECRET` | -- | `ZOOM_ACCOUNT_ID` |
+| Notion | `NOTION_CLIENT_ID` | `NOTION_CLIENT_SECRET` | `NOTION_API_KEY` | -- |
+| GitHub | `GITHUB_CLIENT_ID` | `GITHUB_CLIENT_SECRET` | `GITHUB_TOKEN` | -- |
+| Microsoft | `MICROSOFT_CLIENT_ID` | `MICROSOFT_CLIENT_SECRET` | -- | -- |
+
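As a quick sanity check, a small script can report which of these variables are visible in the current shell. This is a hypothetical helper, using only the variable names from the table above:

```python
# Hypothetical helper: report which auth-related environment variables are
# set, using the variable names from the reference table above.
import os

SERVICE_VARS = {
    "google": ("GOOGLE_CLIENT_ID", "GOOGLE_CLIENT_SECRET", "GOOGLE_API_KEY"),
    "dropbox": ("DROPBOX_APP_KEY", "DROPBOX_APP_SECRET", "DROPBOX_ACCESS_TOKEN"),
    "zoom": ("ZOOM_CLIENT_ID", "ZOOM_CLIENT_SECRET", "ZOOM_ACCOUNT_ID"),
    "notion": ("NOTION_CLIENT_ID", "NOTION_CLIENT_SECRET", "NOTION_API_KEY"),
    "github": ("GITHUB_CLIENT_ID", "GITHUB_CLIENT_SECRET", "GITHUB_TOKEN"),
    "microsoft": ("MICROSOFT_CLIENT_ID", "MICROSOFT_CLIENT_SECRET"),
}


def configured_vars(service: str) -> list:
    """Names of the service's credential variables that are set and non-empty."""
    return [v for v in SERVICE_VARS[service] if os.environ.get(v)]


if __name__ == "__main__":
    for service in SERVICE_VARS:
        found = configured_vars(service) or ["(none)"]
        print(f"{service:10s} {', '.join(found)}")
```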
+## Python API
+
+### AuthConfig
+
+The `AuthConfig` dataclass defines the authentication configuration for a service. It holds OAuth endpoints, credential references, scopes, and token storage paths.
+
+```python
+from video_processor.auth import AuthConfig
+
+config = AuthConfig(
+    service="myservice",
+    oauth_authorize_url="https://example.com/oauth/authorize",
+    oauth_token_url="https://example.com/oauth/token",
+    client_id_env="MYSERVICE_CLIENT_ID",
+    client_secret_env="MYSERVICE_CLIENT_SECRET",
+    api_key_env="MYSERVICE_API_KEY",
+    scopes=["read", "write"],
+)
+```
+
+**Key fields:**
+
+| Field | Type | Description |
+|-------|------|-------------|
+| `service` | `str` | Service identifier (used for token filename) |
+| `oauth_authorize_url` | `Optional[str]` | OAuth authorization endpoint |
+| `oauth_token_url` | `Optional[str]` | OAuth token endpoint |
+| `client_id` / `client_id_env` | `Optional[str]` | Client ID value or env var name |
+| `client_secret` / `client_secret_env` | `Optional[str]` | Client secret value or env var name |
+| `api_key_env` | `Optional[str]` | Environment variable for API key fallback |
+| `scopes` | `List[str]` | OAuth scopes to request |
+| `redirect_uri` | `str` | Redirect URI (default: `urn:ietf:wg:oauth:2.0:oob`) |
+| `account_id` / `account_id_env` | `Optional[str]` | Account ID for client credentials grant |
+| `token_path` | `Optional[Path]` | Override token storage path |
+
+**Resolved properties:**
+
+- `resolved_client_id` -- Returns the client ID from the direct value or environment variable.
+- `resolved_client_secret` -- Returns the client secret from the direct value or environment variable.
+- `resolved_api_key` -- Returns the API key from the environment variable.
+- `resolved_account_id` -- Returns the account ID from the direct value or environment variable.
+- `resolved_token_path` -- Returns the token file path (default: `~/.planopticon/{service}_token.json`).
+- `supports_oauth` -- Returns `True` if both OAuth endpoints are configured.
+
+### OAuthManager
+
+The `OAuthManager` class manages the full authentication lifecycle for a service.
+
+```python
+from video_processor.auth import OAuthManager, AuthConfig
+
+config = AuthConfig(
+    service="notion",
+    oauth_authorize_url="https://api.notion.com/v1/oauth/authorize",
+    oauth_token_url="https://api.notion.com/v1/oauth/token",
+    client_id_env="NOTION_CLIENT_ID",
+    client_secret_env="NOTION_CLIENT_SECRET",
+    api_key_env="NOTION_API_KEY",
+    scopes=["read_content"],
+)
+manager = OAuthManager(config)
+
+# Full auth chain -- returns AuthResult
+result = manager.authenticate()
+if result.success:
+    print(f"Authenticated via {result.method}")
+    print(f"Token: {result.access_token[:20]}...")
+
+# Convenience method -- returns just the token string or None
+token = manager.get_token()
+
+# Clear saved token (logout)
+manager.clear_token()
+```
+
+**AuthResult fields:**
+
+| Field | Type | Description |
+|-------|------|-------------|
+| `success` | `bool` | Whether authentication succeeded |
+| `access_token` | `Optional[str]` | The access token (if successful) |
+| `method` | `Optional[str]` | One of: `saved_token`, `oauth_pkce`, `client_credentials`, `api_key` |
+| `expires_at` | `Optional[float]` | Token expiry as a Unix timestamp |
+| `refresh_token` | `Optional[str]` | Refresh token (if provided) |
+| `error` | `Optional[str]` | Error message (if unsuccessful) |
+
+### Pre-built configs
+
+PlanOpticon ships with pre-built `AuthConfig` instances for all six supported services. Access them via convenience functions:
+
+```python
+from video_processor.auth import get_auth_config, get_auth_manager
+
+# Get just the config
+config = get_auth_config("zoom")
+
+# Get a ready-to-use manager
+manager = get_auth_manager("github")
+token = manager.get_token()
+```
+
+### Building custom connectors
+
+To add authentication for a new service, create an `AuthConfig` with the service's OAuth endpoints and credential environment variables:
+
+```python
+from video_processor.auth import AuthConfig, OAuthManager
+
+config = AuthConfig(
+    service="slack",
+    oauth_authorize_url="https://slack.com/oauth/v2/authorize",
+    oauth_token_url="https://slack.com/api/oauth.v2.access",
+    client_id_env="SLACK_CLIENT_ID",
+    client_secret_env="SLACK_CLIENT_SECRET",
+    api_key_env="SLACK_BOT_TOKEN",
+    scopes=["channels:read", "channels:history"],
+)
+
+manager = OAuthManager(config)
+result = manager.authenticate()
+```
+
+The token will be saved to `~/.planopticon/slack_token.json` and automatically refreshed on subsequent calls.
+
+## Troubleshooting
+
+### "No auth method available for {service}"
+
+This means none of the four auth methods succeeded. Check that:
+
+- The required environment variables are set and non-empty.
+- For OAuth: both the client ID and client secret (or app key/secret) are set.
+- For API key fallback: the correct environment variable is set.
+
+The error message includes hints about which variables to set.
+
+### Token refresh fails
+
+If automatic token refresh fails, PlanOpticon falls back to the next auth method in the chain. Common causes:
+
+- The refresh token has been revoked (e.g., you changed your password or revoked app access).
+- The OAuth app's client secret has changed.
+- The service requires re-authorization after a certain period.
+
+To resolve, clear the token and re-authenticate:
+
+```bash
+planopticon auth google --logout
+planopticon auth google
+```
+
+### OAuth PKCE flow does not open a browser
+
+If the browser does not open automatically, PlanOpticon prints the authorization URL to the terminal. Copy and paste it into your browser manually. After authorizing, paste the authorization code back into the terminal prompt.
+
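For background on what the PKCE flow exchanges, the standard S256 verifier/challenge pair (RFC 7636) is generated like this. PlanOpticon does this internally, so you never need to run it yourself:

```python
# Standard PKCE S256 code_verifier / code_challenge generation (RFC 7636).
# Shown for background only; PlanOpticon generates these internally.
import base64
import hashlib
import secrets


def make_pkce_pair():
    # 32 random bytes -> 43-char URL-safe verifier (padding stripped per spec)
    verifier = base64.urlsafe_b64encode(secrets.token_bytes(32)).rstrip(b"=").decode()
    # challenge = BASE64URL(SHA256(verifier)), again without padding
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    challenge = base64.urlsafe_b64encode(digest).rstrip(b"=").decode()
    return verifier, challenge
```

The verifier stays on your machine; only the challenge is sent in the authorization URL, which is why pasting that URL into another browser (as described above) is safe.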
+### "requests not installed"
+
+The OAuth flows require the `requests` library. It is included as a dependency of PlanOpticon, but if you installed PlanOpticon in a minimal environment, install it manually:
+
+```bash
+pip install requests
+```
+
+### Permission denied on token file
+
+PlanOpticon needs write access to `~/.planopticon/`. If the directory or token files have restrictive permissions, adjust them:
+
+```bash
+chmod 700 ~/.planopticon
+chmod 600 ~/.planopticon/*_token.json
+```
+
+### Microsoft authentication uses the `/common` tenant
+
+The default Microsoft OAuth configuration uses the `common` tenant endpoint (`login.microsoftonline.com/common/...`), which supports both personal Microsoft accounts and Azure AD organizational accounts. If your organization requires a specific tenant, you can create a custom `AuthConfig` with the tenant-specific URLs.
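A tenant-specific configuration might look like the following sketch. It assumes the `AuthConfig`/`OAuthManager` API documented in this guide; the tenant ID placeholder is yours to fill in:

```python
# Sketch: tenant-specific Microsoft endpoints instead of /common.
# Assumes the AuthConfig/OAuthManager API documented above; replace the
# TENANT placeholder with your Azure AD directory (tenant) ID.
from video_processor.auth import AuthConfig, OAuthManager

TENANT = "your-tenant-id"

config = AuthConfig(
    service="microsoft",
    oauth_authorize_url=f"https://login.microsoftonline.com/{TENANT}/oauth2/v2.0/authorize",
    oauth_token_url=f"https://login.microsoftonline.com/{TENANT}/oauth2/v2.0/token",
    client_id_env="MICROSOFT_CLIENT_ID",
    client_secret_env="MICROSOFT_CLIENT_SECRET",
    scopes=[
        "https://graph.microsoft.com/OnlineMeetings.Read",
        "https://graph.microsoft.com/Files.Read",
    ],
)
manager = OAuthManager(config)
```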
--- a/docs/guide/authentication.md
+++ b/docs/guide/authentication.md
@@ -0,0 +1,525 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
--- a/docs/guide/authentication.md
+++ b/docs/guide/authentication.md
@@ -0,0 +1,525 @@
1 # Authentication
2
3 PlanOpticon uses a unified authentication system to connect with cloud services for fetching recordings, documents, and other content. The system is **OAuth-first**: it prefers OAuth 2.0 flows for security and token management, but falls back to API keys when OAuth is not configured.
4
5 ## Auth strategy overview
6
7 PlanOpticon supports six cloud services out of the box: Google, Dropbox, Zoom, Notion, GitHub, and Microsoft. Each service uses the same authentication chain, implemented through the `OAuthManager` class. You configure credentials once (via environment variables or directly), and PlanOpticon handles token acquisition, storage, refresh, and fallback automatically.
8
9 All authentication state is managed through the `planopticon auth` CLI command, the `/auth` companion REPL command, or programmatically via the Python API.
10
11 ## The auth chain
12
13 When you authenticate with a service, PlanOpticon tries the following methods in order. It stops at the first one that succeeds:
14
15 1. **Saved token** -- Checks `~/.planopticon/{service}_token.json` for a previously saved token. If the token has not expired, it is used immediately. If it has expired but a refresh token is available, PlanOpticon attempts an automatic token refresh.
16
17 2. **Client Credentials grant** (Server-to-Server) -- If an `account_id` is configured (e.g., `ZOOM_ACCOUNT_ID`), PlanOpticon attempts a client credentials grant. This is a non-interactive flow suitable for automated pipelines and server-side integrations. No browser is required.
18
19 3. **OAuth 2.0 Authorization Code with PKCE** (interactive) -- If a client ID is configured and OAuth endpoints are available, PlanOpticon initiates an interactive OAuth PKCE flow. It opens a browser to the service's authorization page, waits for you to paste the authorization code, and exchanges it for tokens. The tokens are saved for future use.
20
21 4. **API key fallback** -- If no OAuth method succeeds, PlanOpticon checks for a service-specific API key environment variable (e.g., `GITHUB_TOKEN`, `NOTION_API_KEY`). This is the simplest setup but may have reduced capabilities compared to OAuth.
22
23 If none of the four methods succeed, PlanOpticon returns an error with hints about which environment variables to set.
24
25 ## Token storage
26
27 Tokens are persisted as JSON files in `~/.planopticon/`:
28
29 ```
30 ~/.planopticon/
31 google_token.json
32 dropbox_token.json
33 zoom_token.json
34 notion_token.json
35 github_token.json
36 microsoft_token.json
37 ```
38
39 Each token file contains:
40
41 | Field | Description |
42 |-------|-------------|
43 | `access_token` | The current access token |
44 | `refresh_token` | Refresh token for automatic renewal (if provided by the service) |
45 | `expires_at` | Unix timestamp when the token expires (with a 60-second safety margin) |
46 | `client_id` | The client ID used for this token (for refresh) |
47 | `client_secret` | The client secret used (for refresh) |
48
49 The `~/.planopticon/` directory is created automatically on first use. Token files are overwritten on each successful authentication or refresh.
50
51 To remove a saved token, use `planopticon auth <service> --logout` or delete the file directly.
52
53 ## Supported services
54
55 ### Google
56
57 Google authentication provides access to Google Drive and Google Docs for fetching documents, recordings, and other content.
58
59 **Scopes requested:**
60
61 - `https://www.googleapis.com/auth/drive.readonly`
62 - `https://www.googleapis.com/auth/documents.readonly`
63
64 **Environment variables:**
65
66 | Variable | Required | Description |
67 |----------|----------|-------------|
68 | `GOOGLE_CLIENT_ID` | For OAuth | OAuth 2.0 Client ID from Google Cloud Console |
69 | `GOOGLE_CLIENT_SECRET` | For OAuth | OAuth 2.0 Client Secret |
70 | `GOOGLE_API_KEY` | Fallback | API key (limited access, no user-specific data) |
71
72 **OAuth app setup:**
73
74 1. Go to the [Google Cloud Console](https://console.cloud.google.com/).
75 2. Create a project (or select an existing one).
76 3. Navigate to **APIs & Services > Credentials**.
77 4. Click **Create Credentials > OAuth client ID**.
78 5. Choose **Desktop app** as the application type.
79 6. Copy the Client ID and Client Secret.
80 7. Under **APIs & Services > Library**, enable the **Google Drive API** and **Google Docs API**.
81 8. Set the environment variables:
82
83 ```bash
84 export GOOGLE_CLIENT_ID="your-client-id.apps.googleusercontent.com"
85 export GOOGLE_CLIENT_SECRET="your-client-secret"
86 ```
87
88 **Service account fallback:** For automated pipelines, you can use a Google service account instead of OAuth. Generate a service account key JSON file from the Google Cloud Console and set `GOOGLE_APPLICATION_CREDENTIALS` to point to it. The PlanOpticon Google Workspace connector (`planopticon gws`) uses the `gws` CLI which has its own auth flow via `gws auth login`.
89
90 ### Dropbox
91
92 Dropbox authentication provides access to files stored in Dropbox.
93
94 **Environment variables:**
95
96 | Variable | Required | Description |
97 |----------|----------|-------------|
98 | `DROPBOX_APP_KEY` | For OAuth | App key from the Dropbox App Console |
99 | `DROPBOX_APP_SECRET` | For OAuth | App secret |
100 | `DROPBOX_ACCESS_TOKEN` | Fallback | Long-lived access token (for quick setup) |
101
102 **OAuth app setup:**
103
104 1. Go to the [Dropbox App Console](https://www.dropbox.com/developers/apps).
105 2. Click **Create App**.
106 3. Choose **Scoped access** and **Full Dropbox** (or **App folder** for restricted access).
107 4. Copy the App key and App secret from the **Settings** tab.
108 5. Set the environment variables:
109
110 ```bash
111 export DROPBOX_APP_KEY="your-app-key"
112 export DROPBOX_APP_SECRET="your-app-secret"
113 ```
114
115 **Access token shortcut:** For quick testing, you can generate an access token directly from the app's Settings page in the Dropbox App Console and set it as `DROPBOX_ACCESS_TOKEN`. This bypasses OAuth entirely but the token may have a limited lifetime.
116
117 ### Zoom
118
119 Zoom authentication provides access to cloud recordings, meeting metadata, and transcripts.
120
121 **Environment variables:**
122
123 | Variable | Required | Description |
124 |----------|----------|-------------|
125 | `ZOOM_CLIENT_ID` | For OAuth | OAuth client ID from the Zoom Marketplace |
126 | `ZOOM_CLIENT_SECRET` | For OAuth | OAuth client secret |
127 | `ZOOM_ACCOUNT_ID` | For S2S | Account ID for Server-to-Server OAuth |
128
129 **Server-to-Server (recommended for automation):**
130
131 When `ZOOM_ACCOUNT_ID` is set alongside `ZOOM_CLIENT_ID` and `ZOOM_CLIENT_SECRET`, PlanOpticon uses the client credentials grant (Server-to-Server OAuth). This is non-interactive and ideal for CI/CD pipelines and scheduled jobs.
132
133 1. Go to the [Zoom Marketplace](https://marketplace.zoom.us/).
134 2. Click **Develop > Build App**.
135 3. Choose **Server-to-Server OAuth**.
136 4. Copy the Account ID, Client ID, and Client Secret.
137 5. Add the required scopes: `recording:read:admin` (or `recording:read`).
138 6. Set the environment variables:
139
140 ```bash
141 export ZOOM_CLIENT_ID="your-client-id"
142 export ZOOM_CLIENT_SECRET="your-client-secret"
143 export ZOOM_ACCOUNT_ID="your-account-id"
144 ```
145
146 **User-level OAuth PKCE:**
147
148 If `ZOOM_ACCOUNT_ID` is not set, PlanOpticon falls back to the interactive OAuth PKCE flow. This opens a browser window for the user to authorize access.
149
150 1. In the Zoom Marketplace, create a **General App** (or **OAuth** app).
151 2. Set the redirect URI to `urn:ietf:wg:oauth:2.0:oob` (out-of-band).
152 3. Copy the Client ID and Client Secret.
153
154 ### Notion
155
156 Notion authentication provides access to pages, databases, and content in your Notion workspace.
157
158 **Environment variables:**
159
160 | Variable | Required | Description |
161 |----------|----------|-------------|
162 | `NOTION_CLIENT_ID` | For OAuth | OAuth client ID from the Notion Integrations page |
163 | `NOTION_CLIENT_SECRET` | For OAuth | OAuth client secret |
164 | `NOTION_API_KEY` | Fallback | Internal integration token |
165
166 **OAuth app setup:**
167
168 1. Go to [My Integrations](https://www.notion.so/my-integrations) in Notion.
169 2. Click **New integration**.
170 3. Select **Public integration** (required for OAuth).
171 4. Copy the OAuth Client ID and Client Secret.
172 5. Set the redirect URI (PlanOpticon's default is the out-of-band URI `urn:ietf:wg:oauth:2.0:oob`).
173 6. Set the environment variables:
174
175 ```bash
176 export NOTION_CLIENT_ID="your-client-id"
177 export NOTION_CLIENT_SECRET="your-client-secret"
178 ```
179
180 **Internal integration (API key fallback):**
181
182 For simpler setups, create an **Internal integration** from the Notion Integrations page. Copy the integration token and set it as `NOTION_API_KEY`. You must also share the relevant Notion pages/databases with the integration.
183
184 ```bash
185 export NOTION_API_KEY="ntn_your-integration-token"
186 ```
187
188 ### GitHub
189
190 GitHub authentication provides access to repositories, issues, and organization data.
191
192 **Scopes requested:**
193
194 - `repo`
195 - `read:org`
196
197 **Environment variables:**
198
199 | Variable | Required | Description |
200 |----------|----------|-------------|
201 | `GITHUB_CLIENT_ID` | For OAuth | OAuth App client ID |
202 | `GITHUB_CLIENT_SECRET` | For OAuth | OAuth App client secret |
203 | `GITHUB_TOKEN` | Fallback | Personal access token (classic or fine-grained) |
204
205 **OAuth app setup:**
206
207 1. Go to **GitHub > Settings > Developer Settings > OAuth Apps**.
208 2. Click **New OAuth App**.
209 3. Set the Authorization callback URL to `urn:ietf:wg:oauth:2.0:oob`.
210 4. Copy the Client ID and generate a Client Secret.
211 5. Set the environment variables:
212
213 ```bash
214 export GITHUB_CLIENT_ID="your-client-id"
215 export GITHUB_CLIENT_SECRET="your-client-secret"
216 ```
217
218 **Personal access token (recommended for most users):**
219
220 The simplest approach is to create a personal access token:
221
222 1. Go to **GitHub > Settings > Developer Settings > Personal Access Tokens**.
223 2. Generate a token with `repo` and `read:org` scopes.
224 3. Set it as `GITHUB_TOKEN`:
225
226 ```bash
227 export GITHUB_TOKEN="ghp_your-token"
228 ```
229
230 ### Microsoft
231
232 Microsoft authentication provides access to Microsoft 365 resources via the Microsoft Graph API, including OneDrive, SharePoint, and Teams recordings.
233
234 **Scopes requested:**
235
236 - `https://graph.microsoft.com/OnlineMeetings.Read`
237 - `https://graph.microsoft.com/Files.Read`
238
239 **Environment variables:**
240
241 | Variable | Required | Description |
242 |----------|----------|-------------|
243 | `MICROSOFT_CLIENT_ID` | For OAuth | Application (client) ID from Azure AD |
244 | `MICROSOFT_CLIENT_SECRET` | For OAuth | Client secret from Azure AD |
245
246 **Azure AD app registration:**
247
248 1. Go to the [Azure Portal](https://portal.azure.com/).
249 2. Navigate to **Azure Active Directory > App registrations**.
250 3. Click **New registration**.
251 4. Name the application (e.g., "PlanOpticon").
252 5. Under **Supported account types**, select the appropriate option for your organization.
253 6. Set the redirect URI to `urn:ietf:wg:oauth:2.0:oob` with platform **Mobile and desktop applications**.
254 7. After registration, go to **Certificates & secrets** and create a new client secret.
255 8. Under **API permissions**, add:
256 - `OnlineMeetings.Read`
257 - `Files.Read`
258 9. Grant admin consent if required by your organization.
259 10. Set the environment variables:
260
261 ```bash
262 export MICROSOFT_CLIENT_ID="your-application-id"
263 export MICROSOFT_CLIENT_SECRET="your-client-secret"
264 ```
265
266 **Microsoft 365 CLI:** The `planopticon m365` commands use the `@pnp/cli-microsoft365` npm package, which has its own authentication flow via `m365 login`. This is separate from the OAuth flow described above.
267
268 ## CLI usage
269
270 ### `planopticon auth`
271
272 Authenticate with a cloud service or manage saved tokens.
273
274 ```
275 planopticon auth SERVICE [--logout]
276 ```
277
278 **Arguments:**
279
280 | Argument | Description |
281 |----------|-------------|
282 | `SERVICE` | One of: `google`, `dropbox`, `zoom`, `notion`, `github`, `microsoft` |
283
284 **Options:**
285
286 | Option | Description |
287 |--------|-------------|
288 | `--logout` | Clear the saved token for the specified service |
289
290 **Examples:**
291
292 ```bash
293 # Authenticate with Google (triggers OAuth flow or uses saved token)
294 planopticon auth google
295
296 # Authenticate with Zoom
297 planopticon auth zoom
298
299 # Clear saved GitHub token
300 planopticon auth github --logout
301 ```
302
303 On success, the command prints the authentication method used:
304
305 ```
306 Google authentication successful (oauth_pkce).
307 ```
308
309 or
310
311 ```
312 Github authentication successful (api_key).
313 ```
314
315 ### Companion REPL `/auth`
316
317 Inside the interactive companion REPL (`planopticon -C` or `planopticon -I`), you can authenticate with services using the `/auth` command:
318
319 ```
320 /auth SERVICE
321 ```
322
323 Without arguments, `/auth` lists all available services:
324
325 ```
326 > /auth
327 Usage: /auth SERVICE
328 Available: dropbox, github, google, microsoft, notion, zoom
329 ```
330
331 With a service name, it runs the same auth chain as the CLI command:
332
333 ```
334 > /auth github
335 Github authentication successful (api_key).
336 ```
337
338 ## Environment variables reference
339
340 The following table summarizes all environment variables used by the authentication system:
341
342 | Service | OAuth Client ID | OAuth Client Secret | API Key / Token | Account ID |
343 |---------|----------------|--------------------|--------------------|------------|
344 | Google | `GOOGLE_CLIENT_ID` | `GOOGLE_CLIENT_SECRET` | `GOOGLE_API_KEY` | -- |
345 | Dropbox | `DROPBOX_APP_KEY` | `DROPBOX_APP_SECRET` | `DROPBOX_ACCESS_TOKEN` | -- |
346 | Zoom | `ZOOM_CLIENT_ID` | `ZOOM_CLIENT_SECRET` | -- | `ZOOM_ACCOUNT_ID` |
347 | Notion | `NOTION_CLIENT_ID` | `NOTION_CLIENT_SECRET` | `NOTION_API_KEY` | -- |
348 | GitHub | `GITHUB_CLIENT_ID` | `GITHUB_CLIENT_SECRET` | `GITHUB_TOKEN` | -- |
349 | Microsoft | `MICROSOFT_CLIENT_ID` | `MICROSOFT_CLIENT_SECRET` | -- | -- |
350
351 ## Python API
352
353 ### AuthConfig
354
355 The `AuthConfig` dataclass defines the authentication configuration for a service. It holds OAuth endpoints, credential references, scopes, and token storage paths.
356
357 ```python
358 from video_processor.auth import AuthConfig
359
360 config = AuthConfig(
361 service="myservice",
362 oauth_authorize_url="https://example.com/oauth/authorize",
363 oauth_token_url="https://example.com/oauth/token",
364 client_id_env="MYSERVICE_CLIENT_ID",
365 client_secret_env="MYSERVICE_CLIENT_SECRET",
366 api_key_env="MYSERVICE_API_KEY",
367 scopes=["read", "write"],
368 )
369 ```
370
371 **Key fields:**
372
373 | Field | Type | Description |
374 |-------|------|-------------|
375 | `service` | `str` | Service identifier (used for token filename) |
376 | `oauth_authorize_url` | `Optional[str]` | OAuth authorization endpoint |
377 | `oauth_token_url` | `Optional[str]` | OAuth token endpoint |
378 | `client_id` / `client_id_env` | `Optional[str]` | Client ID value or env var name |
379 | `client_secret` / `client_secret_env` | `Optional[str]` | Client secret value or env var name |
380 | `api_key_env` | `Optional[str]` | Environment variable for API key fallback |
381 | `scopes` | `List[str]` | OAuth scopes to request |
382 | `redirect_uri` | `str` | Redirect URI (default: `urn:ietf:wg:oauth:2.0:oob`) |
383 | `account_id` / `account_id_env` | `Optional[str]` | Account ID for client credentials grant |
384 | `token_path` | `Optional[Path]` | Override token storage path |
385
386 **Resolved properties:**
387
388 - `resolved_client_id` -- Returns the client ID from the direct value or environment variable.
389 - `resolved_client_secret` -- Returns the client secret from the direct value or environment variable.
390 - `resolved_api_key` -- Returns the API key from the environment variable.
391 - `resolved_account_id` -- Returns the account ID from the direct value or environment variable.
392 - `resolved_token_path` -- Returns the token file path (default: `~/.planopticon/{service}_token.json`).
393 - `supports_oauth` -- Returns `True` if both OAuth endpoints are configured.
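
The resolution behavior can be sketched with a minimal stand-in. `MiniAuthConfig` is a hypothetical class, not the real one; it only illustrates the documented rule that a directly supplied value wins over the named environment variable:

```python
import os
from dataclasses import dataclass
from typing import Optional

# Hypothetical sketch of the documented resolution rule:
# a directly supplied value wins; otherwise the named env var is read.
@dataclass
class MiniAuthConfig:
    client_id: Optional[str] = None
    client_id_env: Optional[str] = None

    @property
    def resolved_client_id(self) -> Optional[str]:
        if self.client_id:
            return self.client_id
        if self.client_id_env:
            return os.environ.get(self.client_id_env)
        return None

os.environ["MYSERVICE_CLIENT_ID"] = "abc123"
print(MiniAuthConfig(client_id_env="MYSERVICE_CLIENT_ID").resolved_client_id)  # abc123
print(MiniAuthConfig(client_id="direct-value").resolved_client_id)  # direct-value
```

The real `AuthConfig` applies the same pattern to the client secret, API key, and account ID.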
394
395 ### OAuthManager
396
397 The `OAuthManager` class manages the full authentication lifecycle for a service.
398
399 ```python
400 from video_processor.auth import OAuthManager, AuthConfig
401
402 config = AuthConfig(
403 service="notion",
404 oauth_authorize_url="https://api.notion.com/v1/oauth/authorize",
405 oauth_token_url="https://api.notion.com/v1/oauth/token",
406 client_id_env="NOTION_CLIENT_ID",
407 client_secret_env="NOTION_CLIENT_SECRET",
408 api_key_env="NOTION_API_KEY",
409 scopes=["read_content"],
410 )
411 manager = OAuthManager(config)
412
413 # Full auth chain -- returns AuthResult
414 result = manager.authenticate()
415 if result.success:
416 print(f"Authenticated via {result.method}")
417 print(f"Token: {result.access_token[:20]}...")
418
419 # Convenience method -- returns just the token string or None
420 token = manager.get_token()
421
422 # Clear saved token (logout)
423 manager.clear_token()
424 ```
425
426 **AuthResult fields:**
427
428 | Field | Type | Description |
429 |-------|------|-------------|
430 | `success` | `bool` | Whether authentication succeeded |
431 | `access_token` | `Optional[str]` | The access token (if successful) |
432 | `method` | `Optional[str]` | One of: `saved_token`, `oauth_pkce`, `client_credentials`, `api_key` |
433 | `expires_at` | `Optional[float]` | Token expiry as a Unix timestamp |
434 | `refresh_token` | `Optional[str]` | Refresh token (if provided) |
435 | `error` | `Optional[str]` | Error message (if unsuccessful) |
436
437 ### Pre-built configs
438
439 PlanOpticon ships with pre-built `AuthConfig` instances for all six supported services. Access them via convenience functions:
440
441 ```python
442 from video_processor.auth import get_auth_config, get_auth_manager
443
444 # Get just the config
445 config = get_auth_config("zoom")
446
447 # Get a ready-to-use manager
448 manager = get_auth_manager("github")
449 token = manager.get_token()
450 ```
451
452 ### Building custom connectors
453
454 To add authentication for a new service, create an `AuthConfig` with the service's OAuth endpoints and credential environment variables:
455
456 ```python
457 from video_processor.auth import AuthConfig, OAuthManager
458
459 config = AuthConfig(
460 service="slack",
461 oauth_authorize_url="https://slack.com/oauth/v2/authorize",
462 oauth_token_url="https://slack.com/api/oauth.v2.access",
463 client_id_env="SLACK_CLIENT_ID",
464 client_secret_env="SLACK_CLIENT_SECRET",
465 api_key_env="SLACK_BOT_TOKEN",
466 scopes=["channels:read", "channels:history"],
467 )
468
469 manager = OAuthManager(config)
470 result = manager.authenticate()
471 ```
472
473 The token will be saved to `~/.planopticon/slack_token.json` and automatically refreshed on subsequent calls.
474
475 ## Troubleshooting
476
477 ### "No auth method available for {service}"
478
479 This means none of the four auth methods succeeded. Check that:
480
481 - The required environment variables are set and non-empty.
482 - For OAuth: both the client ID and client secret (or app key/secret) are set.
483 - For API key fallback: the correct environment variable is set.
484
485 The error message includes hints about which variables to set.
486
487 ### Token refresh fails
488
489 If automatic token refresh fails, PlanOpticon falls back to the next auth method in the chain. Common causes:
490
491 - The refresh token has been revoked (e.g., you changed your password or revoked app access).
492 - The OAuth app's client secret has changed.
493 - The service requires re-authorization after a certain period.
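
The fallback behavior can be illustrated with a small sketch. `run_auth_chain` is a hypothetical helper, not PlanOpticon's actual code; the method names mirror the `AuthResult.method` values documented above:

```python
# Hypothetical illustration of trying each auth method in order and
# reporting which one succeeded, as the chain described above does.
def run_auth_chain(methods):
    for name, attempt in methods:
        token = attempt()
        if token is not None:
            return name, token
    return None, None

chain = [
    ("saved_token", lambda: None),         # refresh failed (token revoked)
    ("oauth_pkce", lambda: None),          # browser flow unavailable (e.g. CI)
    ("client_credentials", lambda: None),  # no account ID configured
    ("api_key", lambda: "ghp_example"),    # env-var fallback succeeds
]
print(run_auth_chain(chain))  # ('api_key', 'ghp_example')
```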
494
495 To resolve, clear the token and re-authenticate:
496
497 ```bash
498 planopticon auth google --logout
499 planopticon auth google
500 ```
501
502 ### OAuth PKCE flow does not open a browser
503
504 If the browser does not open automatically, PlanOpticon prints the authorization URL to the terminal. Copy and paste it into your browser manually. After authorizing, paste the authorization code back into the terminal prompt.
505
506 ### "requests not installed"
507
508 The OAuth flows require the `requests` library. It is included as a dependency of PlanOpticon, but if you installed PlanOpticon in a minimal environment, install it manually:
509
510 ```bash
511 pip install requests
512 ```
513
514 ### Permission denied on token file
515
516 PlanOpticon needs write access to `~/.planopticon/`. If the directory or token files have restrictive permissions, adjust them:
517
518 ```bash
519 chmod 700 ~/.planopticon
520 chmod 600 ~/.planopticon/*_token.json
521 ```
522
523 ### Microsoft authentication uses the `/common` tenant
524
525 The default Microsoft OAuth configuration uses the `common` tenant endpoint (`login.microsoftonline.com/common/...`), which supports both personal Microsoft accounts and Azure AD organizational accounts. If your organization requires a specific tenant, you can create a custom `AuthConfig` with the tenant-specific URLs.
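
A tenant-specific config could look like the following sketch. The tenant value is a placeholder, and the `/oauth2/v2.0` URLs are Microsoft's standard identity-platform endpoints; the field names follow the `AuthConfig` API documented above:

```python
from video_processor.auth import AuthConfig, OAuthManager

TENANT = "contoso.onmicrosoft.com"  # placeholder -- use your tenant ID or domain

config = AuthConfig(
    service="microsoft",
    oauth_authorize_url=f"https://login.microsoftonline.com/{TENANT}/oauth2/v2.0/authorize",
    oauth_token_url=f"https://login.microsoftonline.com/{TENANT}/oauth2/v2.0/token",
    client_id_env="MICROSOFT_CLIENT_ID",
    client_secret_env="MICROSOFT_CLIENT_SECRET",
    scopes=[
        "https://graph.microsoft.com/OnlineMeetings.Read",
        "https://graph.microsoft.com/Files.Read",
    ],
)
manager = OAuthManager(config)  # then manager.authenticate() as usual
```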
+120 -10
--- docs/guide/batch.md
+++ docs/guide/batch.md
@@ -10,11 +10,11 @@
 
 Batch mode:
 
 1. Scans the input directory for video files matching the pattern
 2. Processes each video through the full single-video pipeline
-3. Merges knowledge graphs across all videos (case-insensitive entity dedup)
+3. Merges knowledge graphs across all videos with fuzzy matching and conflict resolution
 4. Generates a batch summary with aggregated stats and action items
 5. Writes a batch manifest linking to per-video results
 
 ## File patterns
 
@@ -30,32 +30,58 @@
 
 ```
 output/
 ├── batch_manifest.json    # Batch-level manifest
 ├── batch_summary.md       # Aggregated summary
-├── knowledge_graph.json   # Merged KG across all videos
+├── knowledge_graph.db     # Merged KG across all videos (SQLite, primary)
+├── knowledge_graph.json   # Merged KG across all videos (JSON export)
 └── videos/
     ├── meeting-01/
     │   ├── manifest.json
     │   ├── transcript/
     │   ├── diagrams/
+    │   ├── captures/
     │   └── results/
+    │       ├── analysis.md
+    │       ├── analysis.html
+    │       ├── knowledge_graph.db
+    │       ├── knowledge_graph.json
+    │       ├── key_points.json
+    │       └── action_items.json
     └── meeting-02/
         ├── manifest.json
         └── ...
 ```
 
 ## Knowledge graph merging
 
-When the same entity appears across multiple videos, PlanOpticon merges them:
-
-- Case-insensitive name matching
-- Descriptions are unioned
-- Occurrences are concatenated with source tracking
-- Relationships are deduplicated
-
-The merged knowledge graph is saved at the batch root and included in the batch summary as a mermaid diagram.
+When the same entity appears across multiple videos, PlanOpticon merges them using a multi-strategy approach:
+
+### Entity deduplication
+
+- **Case-insensitive exact matching** -- `"kubernetes"` and `"Kubernetes"` are recognized as the same entity
+- **Fuzzy name matching** -- Uses `SequenceMatcher` with a threshold of 0.85 to unify near-duplicate entities (e.g., `"K8s"` and `"k8s cluster"` may be matched depending on context)
+- **Descriptions are unioned** -- All unique descriptions from each video are combined
+- **Occurrences are concatenated with source tracking** -- Each occurrence retains its source video reference
+
+### Relationship deduplication
+
+- Relationships are deduplicated by (source, target, type) tuple
+- Descriptions from duplicate relationships are merged
+
+### Type conflict resolution
+
+When the same entity appears with different types across videos, PlanOpticon uses a specificity ranking to resolve the conflict. More specific types are preferred over general ones:
+
+- `technology` > `concept`
+- `person` > `concept`
+- `organization` > `concept`
+- And so on through the full type hierarchy
+
+This ensures that an entity initially classified as a generic `concept` in one video gets upgraded to `technology` if it is identified more specifically in another.
+
+The merged knowledge graph is saved at the batch root in both SQLite (`knowledge_graph.db`) and JSON (`knowledge_graph.json`) formats, and is included in the batch summary as a Mermaid diagram.
 
 ## Error handling
 
 If a video fails to process, the batch continues. Failed videos are recorded in the batch manifest with error details:
 
@@ -64,5 +90,89 @@
   "video_name": "corrupted-file",
   "status": "failed",
   "error": "Audio extraction failed: no audio track found"
 }
 ```
+
+The batch manifest tracks completion status:
+
+```json
+{
+  "title": "Sprint Reviews",
+  "total_videos": 5,
+  "completed_videos": 4,
+  "failed_videos": 1,
+  "total_diagrams": 12,
+  "total_action_items": 23,
+  "total_key_points": 45,
+  "videos": [...],
+  "batch_summary_md": "batch_summary.md",
+  "merged_knowledge_graph_json": "knowledge_graph.json",
+  "merged_knowledge_graph_db": "knowledge_graph.db"
+}
+```
+
+## Using batch results
+
+### Query the merged knowledge graph
+
+After batch processing completes, the merged knowledge graph at the batch root contains entities and relationships from all successfully processed videos. You can query it just like a single-video knowledge graph:
+
+```bash
+# Show stats for the merged graph
+planopticon query --db output/knowledge_graph.db
+
+# List all people mentioned across all videos
+planopticon query --db output/knowledge_graph.db "entities --type person"
+
+# See what connects to an entity across all videos
+planopticon query --db output/knowledge_graph.db "neighbors Alice"
+
+# Ask natural language questions about the combined content
+planopticon query --db output/knowledge_graph.db "What technologies were discussed across all meetings?"
+
+# Interactive REPL for exploration
+planopticon query --db output/knowledge_graph.db -I
+```
+
+### Export merged results
+
+All export commands work with the merged knowledge graph:
+
+```bash
+# Generate documents from merged KG
+planopticon export markdown output/knowledge_graph.db -o ./docs
+
+# Export as Obsidian vault
+planopticon export obsidian output/knowledge_graph.db -o ./vault
+
+# Generate a project-wide exchange file
+planopticon export exchange output/knowledge_graph.db --name "Sprint Reviews Q4"
+
+# Generate a GitHub wiki
+planopticon wiki generate output/knowledge_graph.db -o ./wiki
+```
+
+### Classify for planning
+
+Run taxonomy classification on the merged graph to categorize entities across all videos:
+
+```bash
+planopticon kg classify output/knowledge_graph.db
+```
+
+### Use with the planning agent
+
+The planning agent can consume the merged knowledge graph for cross-video analysis and planning:
+
+```bash
+planopticon agent --db output/knowledge_graph.db
+```
+
+### Incremental batch processing
+
+If you add new videos to the recordings directory, you can re-run the batch command. Videos that have already been processed (with output directories present) will be detected via checkpoint/resume within each video's pipeline, making incremental processing efficient.
+
+```bash
+# Add new recordings to the folder, then re-run
+planopticon batch -i ./recordings -o ./output --title "Sprint Reviews"
+```

ADDED docs/guide/companion.md
ADDED docs/guide/document-ingestion.md
ADDED docs/guide/export.md
ADDED docs/guide/knowledge-graphs.md
--- a/docs/guide/companion.md
+++ b/docs/guide/companion.md
@@ -0,0 +1,531 @@
+# Interactive Companion REPL
+
+The PlanOpticon Companion is an interactive Read-Eval-Print Loop (REPL) that provides a conversational interface to PlanOpticon's full feature set. It combines workspace awareness, knowledge graph querying, LLM-powered chat, and planning agent skills into a single session.
+
+Use the Companion when you want to explore a knowledge graph interactively, ask natural-language questions about extracted content, generate planning artifacts on the fly, or switch between providers and models without restarting.
+
+---
+
+## Launching the Companion
+
+There are three equivalent ways to start the Companion.
+
+### As a subcommand
+
+```bash
+planopticon companion
+```
+
+### With the `--chat` / `-C` flag
+
+```bash
+planopticon --chat
+planopticon -C
+```
+
+These flags launch the Companion directly from the top-level CLI, without invoking a subcommand.
+
+### With options
+
+The `companion` subcommand accepts options for specifying knowledge base paths, LLM provider, and model:
+
+```bash
+# Point at a specific knowledge base
+planopticon companion --kb ./results
+
+# Use a specific provider
+planopticon companion -p anthropic
+
+# Use a specific model
+planopticon companion --chat-model gpt-4o
+
+# Combine options
+planopticon companion --kb ./results -p openai --chat-model gpt-4o
+```
+
+| Option | Description |
+|---|---|
+| `--kb PATH` | Path to a knowledge graph file or directory (repeatable) |
+| `-p, --provider NAME` | LLM provider: `auto`, `openai`, `anthropic`, `gemini`, `ollama`, `azure`, `together`, `fireworks`, `cerebras`, `xai` |
+| `--chat-model NAME` | Override the default chat model for the selected provider |
+
+---
+
+## Auto-discovery
+
+On startup, the Companion automatically scans the workspace for relevant files:
+
+**Knowledge graphs.** The Companion uses `find_nearest_graph()` to locate the closest `knowledge_graph.db` or `knowledge_graph.json` file. It searches the current directory, common output subdirectories (`results/`, `output/`, `knowledge-base/`), recursively downward (up to 4 levels), and upward through parent directories. SQLite `.db` files are preferred over `.json` files.
+
+**Videos.** The current directory is scanned for files with `.mp4`, `.mkv`, and `.webm` extensions.
+
+**Documents.** The current directory is scanned for files with `.md`, `.pdf`, and `.docx` extensions.
+
+**LLM provider.** If `--provider` is set to `auto` (the default), the Companion attempts to initialize a provider using any available API key in the environment (`OPENAI_API_KEY`, `ANTHROPIC_API_KEY`, `GEMINI_API_KEY`, etc.).
+
+All discovered context is displayed in the welcome banner:
+
+```
+  PlanOpticon Companion
+  Interactive planning REPL
+
+  Knowledge graph: knowledge_graph.db (42 entities, 87 relationships)
+  Videos: meeting-2024-01-15.mp4, sprint-review.mp4
+  Docs: requirements.md, architecture.pdf
+  LLM provider: openai (model: gpt-4o)
+
+  Type /help for commands, or ask a question.
+```
+
+If no knowledge graph is found, the banner shows "No knowledge graph loaded." Commands that require a KG will return an appropriate message rather than failing silently.
+
+---
+
+## Slash Commands
+
+The Companion supports 18 slash commands. Type `/help` at the prompt to see the full list.
+
+### /help
+
+Display all available commands with brief descriptions.
+
+```
+planopticon> /help
+Available commands:
+  /help                 Show this help
+  /status               Workspace status
+  /skills               List available skills
+  /entities [--type T]  List KG entities
+  /search TERM          Search entities by name
+  /neighbors ENTITY     Show entity relationships
+  /export FORMAT        Export KG (markdown, obsidian, notion, csv)
+  /analyze PATH         Analyze a video/doc
+  /ingest PATH          Ingest a file into the KG
+  /auth SERVICE         Authenticate with a cloud service
+  /provider [NAME]      List or switch LLM provider
+  /model [NAME]         Show or switch chat model
+  /run SKILL            Run a skill by name
+  /plan                 Run project_plan skill
+  /prd                  Run PRD skill
+  /tasks                Run task_breakdown skill
+  /quit, /exit          Exit companion
+
+Any other input is sent to the chat agent (requires LLM).
+```
+
+### /status
+
+Show a summary of the current workspace state: loaded knowledge graph (with entity and relationship counts, broken down by entity type), number of discovered videos and documents, and whether an LLM provider is active.
+
+```
+planopticon> /status
+Workspace status:
+  KG: /home/user/project/results/knowledge_graph.db (42 entities, 87 relationships)
+    technology: 15
+    person: 12
+    concept: 10
+    organization: 5
+  Videos: 2 found
+  Docs: 3 found
+  Provider: active
+```
+
+### /skills
+
+List all registered planning agent skills with their names and descriptions. These are the skills that can be invoked via `/run`.
+
+```
+planopticon> /skills
+Available skills:
+  project_plan: Generate a structured project plan from knowledge graph
+  prd: Generate a product requirements document (PRD) / feature spec
+  roadmap: Generate a product/project roadmap
+  task_breakdown: Break down goals into tasks with dependencies
+  github_issues: Generate GitHub issues from task breakdown
+  requirements_chat: Interactive requirements gathering via guided questions
+  doc_generator: Generate technical documentation, ADRs, or meeting notes
147
+ artifact_export: Export artifacts in agent-ready formats
148
+ cli_adapter: Push artifacts to external tools via their CLIs
149
+ notes_export: Export knowledge graph as structured notes (Obsidian, Notion)
150
+ wiki_generator: Generate a GitHub wiki from knowledge graph and artifacts
151
+```
152
+
153
+### /entities [--type TYPE]
154
+
155
+List entities from the loaded knowledge graph. Optionally filter by entity type.
156
+
157
+```
158
+planopticon> /entities
159
+Found 42 entities
160
+ [technology] Python -- General-purpose programming language
161
+ [person] Alice -- Lead engineer on the project
162
+ [concept] Microservices -- Architectural pattern discussed
163
+ ...
164
+
165
+planopticon> /entities --type person
166
+Found 12 entities
167
+ [person] Alice -- Lead engineer on the project
168
+ [person] Bob -- Product manager
169
+ ...
170
+```
171
+
172
+!!! note
173
+ This command requires a loaded knowledge graph. If none is loaded, it returns "No knowledge graph loaded."
174
+
175
+### /search TERM
176
+
177
+Search entities by name substring (case-insensitive).
178
+
179
+```
180
+planopticon> /search python
181
+Found 3 entities
182
+ [technology] Python -- General-purpose programming language
183
+ [technology] Python Flask -- Web framework for Python
184
+ [concept] Python packaging -- Discussion of pip and packaging tools
185
+```
186
+
187
+### /neighbors ENTITY
188
+
189
+Show all entities and relationships connected to a given entity. This performs a breadth-first traversal (depth 1) from the named entity.
190
+
191
+```
192
+planopticon> /neighbors Alice
193
+Found 4 entities and 5 relationships
194
+ [person] Alice -- Lead engineer on the project
195
+ [technology] Python -- General-purpose programming language
196
+ [organization] Acme Corp -- Employer
197
+ [concept] Authentication -- Auth system design
198
+ Alice --[works_with]--> Python
199
+ Alice --[employed_by]--> Acme Corp
200
+ Alice --[proposed]--> Authentication
201
+ Bob --[collaborates_with]--> Alice
202
+ Authentication --[discussed_by]--> Alice
203
+```
204
+
205
+### /export FORMAT
206
+
207
+Request an export of the knowledge graph. Supported formats: `markdown`, `obsidian`, `notion`, `csv`. This command prints the equivalent CLI command to run.
208
+
209
+```
210
+planopticon> /export obsidian
211
+Export 'obsidian' requested. Use the CLI command:
212
+ planopticon export obsidian /home/user/project/results/knowledge_graph.db
213
+```
214
+
215
+### /analyze PATH
216
+
217
+Request analysis of a video or document file. Validates the file exists and prints the equivalent CLI command.
218
+
219
+```
220
+planopticon> /analyze meeting.mp4
221
+Analyze requested for meeting.mp4. Use the CLI:
222
+ planopticon analyze -i /home/user/project/meeting.mp4
223
+```
224
+
225
+### /ingest PATH
226
+
227
+Request ingestion of a file into the knowledge graph. Validates the file exists and prints the equivalent CLI command.
228
+
229
+```
230
+planopticon> /ingest notes.md
231
+Ingest requested for notes.md. Use the CLI:
232
+ planopticon ingest /home/user/project/notes.md
233
+```
234
+
235
+### /auth [SERVICE]
236
+
237
+Authenticate with a cloud service. When called without arguments, lists all available services. When called with a service name, triggers the authentication flow.
238
+
239
+```
240
+planopticon> /auth
241
+Usage: /auth SERVICE
242
+Available: dropbox, github, google, microsoft, notion, zoom
243
+
244
+planopticon> /auth zoom
245
+Zoom authenticated (oauth)
246
+```
247
+
248
+### /provider [NAME]
249
+
250
+List available LLM providers and their status, or switch to a different provider.
251
+
252
+When called without arguments (or with `list`), shows all known providers with their availability status:
253
+
254
+- **ready** -- API key found in environment
255
+- **local** -- runs locally (Ollama)
256
+- **no key** -- no API key configured
257
+
258
+The currently active provider is marked.
259
+
260
+```
261
+planopticon> /provider
262
+Available providers:
263
+ openai: ready (active)
264
+ anthropic: ready
265
+ gemini: no key
266
+ ollama: local
267
+ azure: no key
268
+ together: no key
269
+ fireworks: no key
270
+ cerebras: no key
271
+ xai: no key
272
+
273
+Current: openai
274
+```
275
+
276
+To switch providers at runtime:
277
+
278
+```
279
+planopticon> /provider anthropic
280
+Switched to provider: anthropic
281
+```
282
+
283
+Switching the provider reinitialises the provider manager and the planning agent. The chat model is reset to the provider's default. If initialisation fails, an error message is shown.
284
+
285
+### /model [NAME]
286
+
287
+Show the current chat model, or switch to a different one.
288
+
289
+```
290
+planopticon> /model
291
+Current model: default
292
+Usage: /model MODEL_NAME
293
+
294
+planopticon> /model claude-sonnet-4-20250514
295
+Switched to model: claude-sonnet-4-20250514
296
+```
297
+
298
+Switching the model reinitialises both the provider manager and the planning agent.
299
+
300
+### /run SKILL
301
+
302
+Run any registered skill by name. The skill receives the current agent context (knowledge graph, query engine, provider, and any previously generated artifacts) and returns an artifact.
303
+
304
+```
305
+planopticon> /run roadmap
306
+--- Roadmap (roadmap) ---
307
+# Roadmap
308
+
309
+## Vision & Strategy
310
+...
311
+```
312
+
313
+If the skill cannot execute (missing KG or provider), an error message is returned. Use `/skills` to see all available skill names.
314
+
315
+### /plan
316
+
317
+Shortcut for `/run project_plan`. Generates a structured project plan from the loaded knowledge graph.
318
+
319
+```
320
+planopticon> /plan
321
+--- Project Plan (project_plan) ---
322
+# Project Plan
323
+
324
+## Executive Summary
325
+...
326
+```
327
+
328
+### /prd
329
+
330
+Shortcut for `/run prd`. Generates a product requirements document.
331
+
332
+```
333
+planopticon> /prd
334
+--- Product Requirements Document (prd) ---
335
+# Product Requirements Document
336
+
337
+## Problem Statement
338
+...
339
+```
340
+
341
+### /tasks
342
+
343
+Shortcut for `/run task_breakdown`. Breaks goals and features into tasks with dependencies, priorities, and effort estimates. The output is JSON.
344
+
345
+```
346
+planopticon> /tasks
347
+--- Task Breakdown (task_list) ---
348
+[
349
+ {
350
+ "id": "T1",
351
+ "title": "Set up authentication service",
352
+ "description": "Implement OAuth2 flow with JWT tokens",
353
+ "depends_on": [],
354
+ "priority": "high",
355
+ "estimate": "1w",
356
+ "assignee_role": "backend engineer"
357
+ },
358
+ ...
359
+]
360
+```
361
+
362
+### /quit and /exit
363
+
364
+Exit the Companion REPL.
365
+
366
+```
367
+planopticon> /quit
368
+Bye.
369
+```
370
+
371
+---
372
+
373
+## Exiting the Companion
374
+
375
+In addition to `/quit` and `/exit`, you can exit by:
376
+
377
+- Typing `quit`, `exit`, `bye`, or `q` as bare words (without the `/` prefix)
378
+- Pressing `Ctrl+C` or `Ctrl+D`
379
+
380
+All of these end the session with a "Bye." message.
381
+
382
+---
383
+
384
+## Chat Mode
385
+
386
+Any input that does not start with `/` and is not a bare exit word is sent to the chat agent as a natural-language message. This requires a configured LLM provider.
387
+
388
+```
389
+planopticon> What technologies were discussed in the meeting?
390
+Based on the knowledge graph, the following technologies were discussed:
391
+
392
+1. **Python** -- mentioned in the context of backend development
393
+2. **React** -- proposed for the frontend redesign
394
+3. **PostgreSQL** -- discussed as the primary database
395
+...
396
+```
397
+
398
+The chat agent maintains conversation history across the session. It has full awareness of:
399
+
400
+- The loaded knowledge graph (entity and relationship counts, types)
401
+- Any artifacts generated during the session (via `/plan`, `/prd`, `/tasks`, `/run`)
402
+- All available slash commands (which it may suggest when relevant)
403
+- The full PlanOpticon CLI command set
404
+
405
+If no LLM provider is configured, chat mode returns an error with instructions:
406
+
407
+```
408
+planopticon> What was discussed?
409
+Chat requires an LLM provider. Set one of:
410
+ OPENAI_API_KEY
411
+ ANTHROPIC_API_KEY
412
+ GEMINI_API_KEY
413
+Or pass --provider / --chat-model.
414
+```
415
+
416
+---
417
+
418
+## Runtime Provider and Model Switching
419
+
420
+One of the Companion's key features is the ability to switch LLM providers and models without restarting the session. This is useful for:
421
+
422
+- Comparing outputs across different models
423
+- Falling back to a local model (Ollama) when API keys expire
424
+- Using a cheaper model for exploratory queries and a more capable one for artifact generation
425
+
426
+When you switch providers or models via `/provider` or `/model`, the Companion:
427
+
428
+1. Updates the internal provider name and/or model name
429
+2. Reinitialises the `ProviderManager`
430
+3. Reinitialises the `PlanningAgent` with a fresh `AgentContext` that retains the loaded knowledge graph and query engine
431
+
432
+Conversation history is preserved across provider switches.
433
+
434
+---
435
+
436
+## Example Session
437
+
438
+The following walkthrough shows a typical Companion session, from launch through exploration to artifact generation.
439
+
440
+```bash
441
+$ planopticon companion --kb ./results
442
+```
443
+
444
+```
445
+ PlanOpticon Companion
446
+ Interactive planning REPL
447
+
448
+ Knowledge graph: knowledge_graph.db (58 entities, 124 relationships)
449
+ Videos: sprint-review-2024-03.mp4
450
+ Docs: architecture.md, requirements.pdf
451
+ LLM provider: openai (model: default)
452
+
453
+ Type /help for commands, or ask a question.
454
+
455
+planopticon> /status
456
+Workspace status:
457
+ KG: /home/user/project/results/knowledge_graph.db (58 entities, 124 relationships)
458
+ technology: 20
459
+ person: 15
460
+ concept: 13
461
+ organization: 8
462
+ time: 2
463
+ Videos: 1 found
464
+ Docs: 2 found
465
+ Provider: active
466
+
467
+planopticon> /entities --type person
468
+Found 15 entities
469
+ [person] Alice -- Lead architect
470
+ [person] Bob -- Product manager
471
+ [person] Carol -- Frontend lead
472
+ ...
473
+
474
+planopticon> /neighbors Alice
475
+Found 6 entities and 8 relationships
476
+ [person] Alice -- Lead architect
477
+ [technology] Kubernetes -- Container orchestration platform
478
+ [concept] Microservices -- Proposed architecture pattern
479
+ ...
480
+ Alice --[proposed]--> Microservices
481
+ Alice --[expert_in]--> Kubernetes
482
+ ...
483
+
484
+planopticon> What were the main decisions made in the sprint review?
485
+Based on the knowledge graph, the sprint review covered several key decisions:
486
+
487
+1. **Adopt microservices architecture** -- Alice proposed and the team agreed
488
+ to move from the monolith to a microservices pattern.
489
+2. **Use Kubernetes for orchestration** -- Selected over Docker Swarm.
490
+3. **Prioritize authentication module** -- Bob identified this as the highest
491
+ priority for the next sprint.
492
+
493
+planopticon> /provider anthropic
494
+Switched to provider: anthropic
495
+
496
+planopticon> /model claude-sonnet-4-20250514
497
+Switched to model: claude-sonnet-4-20250514
498
+
499
+planopticon> /plan
500
+--- Project Plan (project_plan) ---
501
+# Project Plan
502
+
503
+## Executive Summary
504
+This project plan outlines the migration from a monolithic architecture
505
+to a microservices-based system, as discussed in the sprint review...
506
+
507
+## Goals & Objectives
508
+...
509
+
510
+planopticon> /tasks
511
+--- Task Breakdown (task_list) ---
512
+[
513
+ {
514
+ "id": "T1",
515
+ "title": "Design service boundaries",
516
+ "description": "Define microservice boundaries based on domain analysis",
517
+ "depends_on": [],
518
+ "priority": "high",
519
+ "estimate": "3d",
520
+ "assignee_role": "architect"
521
+ },
522
+ ...
523
+]
524
+
525
+planopticon> /export obsidian
526
+Export 'obsidian' requested. Use the CLI command:
527
+ planopticon export obsidian /home/user/project/results/knowledge_graph.db
528
+
529
+planopticon> quit
530
+Bye.
531
+```
--- a/docs/guide/companion.md
+++ b/docs/guide/companion.md
@@ -0,0 +1,531 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
--- a/docs/guide/companion.md
+++ b/docs/guide/companion.md
@@ -0,0 +1,531 @@
# Interactive Companion REPL

The PlanOpticon Companion is an interactive Read-Eval-Print Loop (REPL) that provides a conversational interface to PlanOpticon's full feature set. It combines workspace awareness, knowledge graph querying, LLM-powered chat, and planning agent skills into a single session.

Use the Companion when you want to explore a knowledge graph interactively, ask natural-language questions about extracted content, generate planning artifacts on the fly, or switch between providers and models without restarting.

---

## Launching the Companion

The Companion can be started in a few equivalent ways.

### As a subcommand

```bash
planopticon companion
```

### With the `--chat` / `-C` flag

```bash
planopticon --chat
planopticon -C
```

These flags launch the Companion directly from the top-level CLI, without invoking a subcommand.

### With options

The `companion` subcommand accepts options for specifying knowledge base paths, LLM provider, and model:

```bash
# Point at a specific knowledge base
planopticon companion --kb ./results

# Use a specific provider
planopticon companion -p anthropic

# Use a specific model
planopticon companion --chat-model gpt-4o

# Combine options
planopticon companion --kb ./results -p openai --chat-model gpt-4o
```

| Option | Description |
|---|---|
| `--kb PATH` | Path to a knowledge graph file or directory (repeatable) |
| `-p, --provider NAME` | LLM provider: `auto`, `openai`, `anthropic`, `gemini`, `ollama`, `azure`, `together`, `fireworks`, `cerebras`, `xai` |
| `--chat-model NAME` | Override the default chat model for the selected provider |

---

## Auto-discovery

On startup, the Companion automatically scans the workspace for relevant files:

**Knowledge graphs.** The Companion uses `find_nearest_graph()` to locate the closest `knowledge_graph.db` or `knowledge_graph.json` file. It searches the current directory, common output subdirectories (`results/`, `output/`, `knowledge-base/`), recursively downward (up to 4 levels), and upward through parent directories. SQLite `.db` files are preferred over `.json` files.
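
The search order described above can be sketched in a few lines. This is an illustrative reimplementation, not the real `find_nearest_graph()`; the exact tie-breaking rules inside PlanOpticon may differ.

```python
from pathlib import Path
import tempfile

GRAPH_NAMES = ("knowledge_graph.db", "knowledge_graph.json")
COMMON_DIRS = ("results", "output", "knowledge-base")

def find_nearest_graph(start, max_depth=4):
    """Locate the closest knowledge graph file, preferring SQLite over JSON."""
    start = Path(start).resolve()
    candidates = []
    # 1. Current directory and common output subdirectories.
    for d in (start, *(start / sub for sub in COMMON_DIRS)):
        candidates += [d / n for n in GRAPH_NAMES if (d / n).is_file()]
    # 2. Recursive downward scan, capped at max_depth directory levels.
    if not candidates:
        for p in start.rglob("knowledge_graph.*"):
            if p.name in GRAPH_NAMES and len(p.relative_to(start).parts) - 1 <= max_depth:
                candidates.append(p)
    # 3. Walk upward through parent directories.
    for parent in start.parents:
        if candidates:
            break
        candidates += [parent / n for n in GRAPH_NAMES if (parent / n).is_file()]
    # SQLite .db beats .json when both are present.
    return min(candidates, key=lambda p: p.suffix != ".db") if candidates else None

# Demo: a .db and a .json side by side -- the .db wins.
workspace = Path(tempfile.mkdtemp())
(workspace / "results").mkdir()
(workspace / "results" / "knowledge_graph.json").write_text("{}")
(workspace / "results" / "knowledge_graph.db").touch()
nearest = find_nearest_graph(workspace)
```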

**Videos.** The current directory is scanned for files with `.mp4`, `.mkv`, and `.webm` extensions.

**Documents.** The current directory is scanned for files with `.md`, `.pdf`, and `.docx` extensions.

**LLM provider.** If `--provider` is set to `auto` (the default), the Companion attempts to initialise a provider using any available API key in the environment (`OPENAI_API_KEY`, `ANTHROPIC_API_KEY`, `GEMINI_API_KEY`, etc.).

All discovered context is displayed in the welcome banner:

```
  PlanOpticon Companion
  Interactive planning REPL

  Knowledge graph: knowledge_graph.db (42 entities, 87 relationships)
  Videos: meeting-2024-01-15.mp4, sprint-review.mp4
  Docs: requirements.md, architecture.pdf
  LLM provider: openai (model: gpt-4o)

  Type /help for commands, or ask a question.
```

If no knowledge graph is found, the banner shows "No knowledge graph loaded." Commands that require a KG will return an appropriate message rather than failing silently.

---

## Slash Commands

The Companion supports 18 slash commands. Type `/help` at the prompt to see the full list.

### /help

Display all available commands with brief descriptions.

```
planopticon> /help
Available commands:
  /help                  Show this help
  /status                Workspace status
  /skills                List available skills
  /entities [--type T]   List KG entities
  /search TERM           Search entities by name
  /neighbors ENTITY      Show entity relationships
  /export FORMAT         Export KG (markdown, obsidian, notion, csv)
  /analyze PATH          Analyze a video/doc
  /ingest PATH           Ingest a file into the KG
  /auth SERVICE          Authenticate with a cloud service
  /provider [NAME]       List or switch LLM provider
  /model [NAME]          Show or switch chat model
  /run SKILL             Run a skill by name
  /plan                  Run project_plan skill
  /prd                   Run PRD skill
  /tasks                 Run task_breakdown skill
  /quit, /exit           Exit companion

Any other input is sent to the chat agent (requires LLM).
```

### /status

Show a summary of the current workspace state: loaded knowledge graph (with entity and relationship counts, broken down by entity type), number of discovered videos and documents, and whether an LLM provider is active.

```
planopticon> /status
Workspace status:
  KG: /home/user/project/results/knowledge_graph.db (42 entities, 87 relationships)
    technology: 15
    person: 12
    concept: 10
    organization: 5
  Videos: 2 found
  Docs: 3 found
  Provider: active
```

### /skills

List all registered planning agent skills with their names and descriptions. These are the skills that can be invoked via `/run`.

```
planopticon> /skills
Available skills:
  project_plan: Generate a structured project plan from knowledge graph
  prd: Generate a product requirements document (PRD) / feature spec
  roadmap: Generate a product/project roadmap
  task_breakdown: Break down goals into tasks with dependencies
  github_issues: Generate GitHub issues from task breakdown
  requirements_chat: Interactive requirements gathering via guided questions
  doc_generator: Generate technical documentation, ADRs, or meeting notes
  artifact_export: Export artifacts in agent-ready formats
  cli_adapter: Push artifacts to external tools via their CLIs
  notes_export: Export knowledge graph as structured notes (Obsidian, Notion)
  wiki_generator: Generate a GitHub wiki from knowledge graph and artifacts
```

### /entities [--type TYPE]

List entities from the loaded knowledge graph. Optionally filter by entity type.

```
planopticon> /entities
Found 42 entities
  [technology] Python -- General-purpose programming language
  [person] Alice -- Lead engineer on the project
  [concept] Microservices -- Architectural pattern discussed
  ...

planopticon> /entities --type person
Found 12 entities
  [person] Alice -- Lead engineer on the project
  [person] Bob -- Product manager
  ...
```

!!! note
    This command requires a loaded knowledge graph. If none is loaded, it returns "No knowledge graph loaded."

### /search TERM

Search entities by name substring (case-insensitive).

```
planopticon> /search python
Found 3 entities
  [technology] Python -- General-purpose programming language
  [technology] Python Flask -- Web framework for Python
  [concept] Python packaging -- Discussion of pip and packaging tools
```
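
The matching behaviour described above amounts to a case-insensitive substring test over entity names. A minimal sketch, using simplified dicts as a stand-in for PlanOpticon's real entity model:

```python
entities = [
    {"type": "technology", "name": "Python", "description": "General-purpose programming language"},
    {"type": "technology", "name": "Python Flask", "description": "Web framework for Python"},
    {"type": "concept", "name": "Python packaging", "description": "Discussion of pip and packaging tools"},
    {"type": "person", "name": "Alice", "description": "Lead engineer on the project"},
]

def search_entities(entities, term):
    """Return entities whose name contains `term`, ignoring case."""
    needle = term.lower()
    return [e for e in entities if needle in e["name"].lower()]

# Case of the query does not matter: "PYTHON" matches all three
# Python-related entities, but not "Alice".
matches = search_entities(entities, "PYTHON")
```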

### /neighbors ENTITY

Show all entities and relationships connected to a given entity. This performs a breadth-first traversal (depth 1) from the named entity.

```
planopticon> /neighbors Alice
Found 4 entities and 5 relationships
  [person] Alice -- Lead engineer on the project
  [technology] Python -- General-purpose programming language
  [organization] Acme Corp -- Employer
  [concept] Authentication -- Auth system design
  Alice --[works_with]--> Python
  Alice --[employed_by]--> Acme Corp
  Alice --[proposed]--> Authentication
  Bob --[collaborates_with]--> Alice
  Authentication --[discussed_by]--> Alice
```
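
A depth-1 traversal like this collects every relationship that touches the named entity, in either direction, plus the entities on the other end. The sketch below uses plain tuples as a simplified edge model; PlanOpticon's internal representation is richer.

```python
relationships = [
    ("Alice", "works_with", "Python"),
    ("Alice", "employed_by", "Acme Corp"),
    ("Alice", "proposed", "Authentication"),
    ("Bob", "collaborates_with", "Alice"),
    ("Authentication", "discussed_by", "Alice"),
    ("Bob", "reports_to", "Carol"),  # does not touch Alice
]

def neighbors(relationships, entity):
    """Return (connected entities, touching edges) for a depth-1 traversal."""
    edges = [r for r in relationships if entity in (r[0], r[2])]
    nodes = set()
    for src, _, dst in edges:
        nodes.update((src, dst))
    nodes.discard(entity)  # report only the neighbours themselves
    return nodes, edges

nodes, edges = neighbors(relationships, "Alice")
```

Note that `Bob --[collaborates_with]--> Alice` is included even though Alice is the target, not the source: direction is preserved in the edge list but ignored when deciding what counts as a neighbour.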

### /export FORMAT

Request an export of the knowledge graph. Supported formats: `markdown`, `obsidian`, `notion`, `csv`. This command prints the equivalent CLI command to run.

```
planopticon> /export obsidian
Export 'obsidian' requested. Use the CLI command:
  planopticon export obsidian /home/user/project/results/knowledge_graph.db
```

### /analyze PATH

Request analysis of a video or document file. Validates the file exists and prints the equivalent CLI command.

```
planopticon> /analyze meeting.mp4
Analyze requested for meeting.mp4. Use the CLI:
  planopticon analyze -i /home/user/project/meeting.mp4
```

### /ingest PATH

Request ingestion of a file into the knowledge graph. Validates the file exists and prints the equivalent CLI command.

```
planopticon> /ingest notes.md
Ingest requested for notes.md. Use the CLI:
  planopticon ingest /home/user/project/notes.md
```

### /auth [SERVICE]

Authenticate with a cloud service. When called without arguments, lists all available services. When called with a service name, triggers the authentication flow.

```
planopticon> /auth
Usage: /auth SERVICE
Available: dropbox, github, google, microsoft, notion, zoom

planopticon> /auth zoom
Zoom authenticated (oauth)
```

### /provider [NAME]

List available LLM providers and their status, or switch to a different provider.

When called without arguments (or with `list`), shows all known providers with their availability status:

- **ready** -- API key found in environment
- **local** -- runs locally (Ollama)
- **no key** -- no API key configured

The currently active provider is marked.

```
planopticon> /provider
Available providers:
  openai: ready (active)
  anthropic: ready
  gemini: no key
  ollama: local
  azure: no key
  together: no key
  fireworks: no key
  cerebras: no key
  xai: no key

Current: openai
```

To switch providers at runtime:

```
planopticon> /provider anthropic
Switched to provider: anthropic
```

Switching the provider reinitialises the provider manager and the planning agent. The chat model is reset to the provider's default. If initialisation fails, an error message is shown.

### /model [NAME]

Show the current chat model, or switch to a different one.

```
planopticon> /model
Current model: default
Usage: /model MODEL_NAME

planopticon> /model claude-sonnet-4-20250514
Switched to model: claude-sonnet-4-20250514
```

Switching the model reinitialises both the provider manager and the planning agent.

### /run SKILL

Run any registered skill by name. The skill receives the current agent context (knowledge graph, query engine, provider, and any previously generated artifacts) and returns an artifact.

```
planopticon> /run roadmap
--- Roadmap (roadmap) ---
# Roadmap

## Vision & Strategy
...
```

If the skill cannot execute (missing KG or provider), an error message is returned. Use `/skills` to see all available skill names.

### /plan

Shortcut for `/run project_plan`. Generates a structured project plan from the loaded knowledge graph.

```
planopticon> /plan
--- Project Plan (project_plan) ---
# Project Plan

## Executive Summary
...
```

### /prd

Shortcut for `/run prd`. Generates a product requirements document.

```
planopticon> /prd
--- Product Requirements Document (prd) ---
# Product Requirements Document

## Problem Statement
...
```

### /tasks

Shortcut for `/run task_breakdown`. Breaks goals and features into tasks with dependencies, priorities, and effort estimates. The output is JSON.

```
planopticon> /tasks
--- Task Breakdown (task_list) ---
[
  {
    "id": "T1",
    "title": "Set up authentication service",
    "description": "Implement OAuth2 flow with JWT tokens",
    "depends_on": [],
    "priority": "high",
    "estimate": "1w",
    "assignee_role": "backend engineer"
  },
  ...
]
```

### /quit and /exit

Exit the Companion REPL.

```
planopticon> /quit
Bye.
```

---

## Exiting the Companion

In addition to `/quit` and `/exit`, you can exit by:

- Typing `quit`, `exit`, `bye`, or `q` as bare words (without the `/` prefix)
- Pressing `Ctrl+C` or `Ctrl+D`

All of these end the session with a "Bye." message.
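
Putting the routing rules together -- bare exit words end the session, `/`-prefixed input is a command, and everything else goes to the chat agent -- the dispatch logic can be sketched as below. This is illustrative; the real dispatcher is internal to PlanOpticon, and case-insensitive matching of the exit words is an assumption of the sketch.

```python
EXIT_WORDS = {"quit", "exit", "bye", "q"}

def route(line):
    """Classify a line of REPL input."""
    text = line.strip()
    if not text:
        return "noop"     # empty input: just re-prompt
    if text.lower() in EXIT_WORDS:
        return "exit"     # bare word, no `/` prefix needed
    if text.startswith("/"):
        return "command"  # dispatched to the slash-command table
    return "chat"         # natural language -> chat agent
```

Note that `/quit` routes to "command" (it is a slash command), while the bare word `quit` routes straight to "exit".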

---

## Chat Mode

Any input that does not start with `/` and is not a bare exit word is sent to the chat agent as a natural-language message. This requires a configured LLM provider.

```
planopticon> What technologies were discussed in the meeting?
Based on the knowledge graph, the following technologies were discussed:

1. **Python** -- mentioned in the context of backend development
2. **React** -- proposed for the frontend redesign
3. **PostgreSQL** -- discussed as the primary database
...
```

The chat agent maintains conversation history across the session. It has full awareness of:

- The loaded knowledge graph (entity and relationship counts, types)
- Any artifacts generated during the session (via `/plan`, `/prd`, `/tasks`, `/run`)
- All available slash commands (which it may suggest when relevant)
- The full PlanOpticon CLI command set

If no LLM provider is configured, chat mode returns an error with instructions:

```
planopticon> What was discussed?
Chat requires an LLM provider. Set one of:
  OPENAI_API_KEY
  ANTHROPIC_API_KEY
  GEMINI_API_KEY
Or pass --provider / --chat-model.
```

---

## Runtime Provider and Model Switching

One of the Companion's key features is the ability to switch LLM providers and models without restarting the session. This is useful for:

- Comparing outputs across different models
- Falling back to a local model (Ollama) when API keys expire
- Using a cheaper model for exploratory queries and a more capable one for artifact generation

When you switch providers or models via `/provider` or `/model`, the Companion:

1. Updates the internal provider name and/or model name
2. Reinitialises the `ProviderManager`
3. Reinitialises the `PlanningAgent` with a fresh `AgentContext` that retains the loaded knowledge graph and query engine

Conversation history is preserved across provider switches.
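
The switch sequence can be sketched with a small state object. The class and attribute names below are illustrative, not PlanOpticon's real API; the point is that the knowledge graph and conversation history survive the rebuild, while the provider manager and agent are recreated.

```python
from dataclasses import dataclass, field

@dataclass
class CompanionState:
    provider: str = "openai"
    model: str = "default"
    kg_path: str = ""
    history: list = field(default_factory=list)
    agent: tuple = ()

    def _reinit(self):
        # Rebuild the provider manager / planning agent for the new
        # settings; kg_path and history are deliberately untouched.
        self.agent = (self.provider, self.model)

    def switch_provider(self, name):
        self.provider = name
        self.model = "default"  # model resets to the provider's default
        self._reinit()

    def switch_model(self, name):
        self.model = name
        self._reinit()

state = CompanionState(kg_path="results/knowledge_graph.db", history=["hello"])
state.switch_provider("anthropic")
state.switch_model("claude-sonnet-4-20250514")
```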

---

## Example Session

The following walkthrough shows a typical Companion session, from launch through exploration to artifact generation.

```bash
$ planopticon companion --kb ./results
```

```
  PlanOpticon Companion
  Interactive planning REPL

  Knowledge graph: knowledge_graph.db (58 entities, 124 relationships)
  Videos: sprint-review-2024-03.mp4
  Docs: architecture.md, requirements.pdf
  LLM provider: openai (model: default)

  Type /help for commands, or ask a question.

planopticon> /status
Workspace status:
  KG: /home/user/project/results/knowledge_graph.db (58 entities, 124 relationships)
    technology: 20
    person: 15
    concept: 13
    organization: 8
    time: 2
  Videos: 1 found
  Docs: 2 found
  Provider: active

planopticon> /entities --type person
Found 15 entities
  [person] Alice -- Lead architect
  [person] Bob -- Product manager
  [person] Carol -- Frontend lead
  ...

planopticon> /neighbors Alice
Found 6 entities and 8 relationships
  [person] Alice -- Lead architect
  [technology] Kubernetes -- Container orchestration platform
  [concept] Microservices -- Proposed architecture pattern
  ...
  Alice --[proposed]--> Microservices
  Alice --[expert_in]--> Kubernetes
  ...

planopticon> What were the main decisions made in the sprint review?
Based on the knowledge graph, the sprint review covered several key decisions:

1. **Adopt microservices architecture** -- Alice proposed and the team agreed
   to move from the monolith to a microservices pattern.
2. **Use Kubernetes for orchestration** -- Selected over Docker Swarm.
3. **Prioritize authentication module** -- Bob identified this as the highest
   priority for the next sprint.

planopticon> /provider anthropic
Switched to provider: anthropic

planopticon> /model claude-sonnet-4-20250514
Switched to model: claude-sonnet-4-20250514

planopticon> /plan
--- Project Plan (project_plan) ---
# Project Plan

## Executive Summary
This project plan outlines the migration from a monolithic architecture
to a microservices-based system, as discussed in the sprint review...

## Goals & Objectives
...

planopticon> /tasks
--- Task Breakdown (task_list) ---
[
  {
    "id": "T1",
    "title": "Design service boundaries",
    "description": "Define microservice boundaries based on domain analysis",
    "depends_on": [],
    "priority": "high",
    "estimate": "3d",
    "assignee_role": "architect"
  },
  ...
]

planopticon> /export obsidian
Export 'obsidian' requested. Use the CLI command:
  planopticon export obsidian /home/user/project/results/knowledge_graph.db

planopticon> quit
Bye.
```
--- a/docs/guide/document-ingestion.md
+++ b/docs/guide/document-ingestion.md
@@ -0,0 +1,434 @@
# Document Ingestion

Document ingestion lets you process files -- PDFs, Markdown, and plaintext -- into a knowledge graph. PlanOpticon extracts text from documents, chunks it into manageable pieces, runs LLM-powered entity and relationship extraction, and stores the results in a FalkorDB knowledge graph. This is the same knowledge graph format produced by video analysis, so you can combine video and document insights in a single graph.

## Supported formats

| Extension | Processor | Description |
|-----------|-----------|-------------|
| `.pdf` | `PdfProcessor` | Extracts text page by page using pymupdf or pdfplumber |
| `.md`, `.markdown` | `MarkdownProcessor` | Splits on headings into sections |
| `.txt`, `.text`, `.log`, `.csv` | `PlaintextProcessor` | Splits on paragraph boundaries |

Additional formats can be added by implementing the `DocumentProcessor` base class and registering it (see [Extending with custom processors](#extending-with-custom-processors) below).

## CLI usage

### `planopticon ingest`

```
planopticon ingest INPUT_PATH [OPTIONS]
```

**Arguments:**

| Argument | Description |
|----------|-------------|
| `INPUT_PATH` | Path to a file or directory to ingest (must exist) |

**Options:**

| Option | Short | Default | Description |
|--------|-------|---------|-------------|
| `--output` | `-o` | Current directory | Output directory for the knowledge graph |
| `--db-path` | | None | Path to an existing `knowledge_graph.db` to merge into |
| `--recursive / --no-recursive` | `-r` | `--recursive` | Recurse into subdirectories (directory ingestion only) |
| `--provider` | `-p` | `auto` | LLM provider for entity extraction (`openai`, `anthropic`, `gemini`, `ollama`, `azure`, `together`, `fireworks`, `cerebras`, `xai`) |
| `--chat-model` | | None | Override the model used for LLM entity extraction |

### Single file ingestion

Process a single document and create a new knowledge graph:

```bash
planopticon ingest spec.md
```

This creates `knowledge_graph.db` and `knowledge_graph.json` in the current directory.

Specify an output directory:

```bash
planopticon ingest report.pdf -o ./results
```

This creates `./results/knowledge_graph.db` and `./results/knowledge_graph.json`.

### Directory ingestion

Process all supported files in a directory:

```bash
planopticon ingest ./docs/
```

By default, this recurses into subdirectories. To process only the top-level directory:

```bash
planopticon ingest ./docs/ --no-recursive
```

PlanOpticon automatically filters for supported file extensions. Unsupported files are silently skipped.

### Merging into an existing knowledge graph

To add document content to an existing knowledge graph (e.g., one created from video analysis), use `--db-path`:

```bash
# First, analyze a video
planopticon analyze meeting.mp4 -o ./results

# Then, ingest supplementary documents into the same graph
planopticon ingest ./meeting-notes/ --db-path ./results/knowledge_graph.db
```

The ingested entities and relationships are merged with the existing graph. Duplicate entities are consolidated automatically by the knowledge graph engine.

### Choosing an LLM provider

Entity and relationship extraction requires an LLM. By default, PlanOpticon auto-detects available providers based on your environment variables. You can override this:

```bash
# Use Anthropic for extraction
planopticon ingest docs/ -p anthropic

# Use a specific model
planopticon ingest docs/ -p openai --chat-model gpt-4o

# Use a local Ollama model
planopticon ingest docs/ -p ollama --chat-model llama3
```

### Output

After ingestion, PlanOpticon prints a summary:

```
Knowledge graph: ./knowledge_graph.db
  spec.md: 12 chunks
  architecture.md: 8 chunks
  requirements.txt: 3 chunks

Ingestion complete:
  Files processed: 3
  Total chunks: 23
  Entities extracted: 47
  Relationships: 31
  Knowledge graph: ./knowledge_graph.db
```

Both `.db` (SQLite/FalkorDB) and `.json` formats are saved automatically.

## How each processor works

### PDF processor

The `PdfProcessor` extracts text from PDF files on a per-page basis. It tries two extraction libraries in order:

1. **pymupdf** (preferred) -- Fast, reliable text extraction. Install with `pip install pymupdf`.
2. **pdfplumber** (fallback) -- Alternative extractor. Install with `pip install pdfplumber`.

If neither library is installed, the processor raises an `ImportError` with installation instructions.

Each page becomes a separate `DocumentChunk` with:

- `text`: The extracted text content of the page
- `page`: The 1-based page number
- `metadata.extraction_method`: Which library was used (`pymupdf` or `pdfplumber`)

To install PDF support:

```bash
pip install 'planopticon[pdf]'
# or
pip install pymupdf
# or
pip install pdfplumber
```

### Markdown processor

The `MarkdownProcessor` splits Markdown files on heading boundaries (lines starting with `#` through `######`). Each heading and its content until the next heading becomes a separate chunk.

**Splitting behavior:**

- If the file contains headings, each heading section becomes a chunk. The `section` field records the heading text.
- Content before the first heading is captured as a `(preamble)` chunk.
- If the file contains no headings, it falls back to paragraph-based chunking (same as plaintext).

For example, a file with this structure:

```markdown
Some intro text.

# Architecture

The system uses a microservices architecture...

## Components

There are three main components...

# Deployment

Deployment is handled via...
```

produces four chunks: `(preamble)`, `Architecture`, `Components`, and `Deployment`.

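The heading-based split can be sketched in a few lines. This is a simplified model of the behavior described above, not the actual `MarkdownProcessor` source -- the paragraph-chunking fallback for heading-free files is omitted, and the function name is illustrative:

```python
import re
from typing import List, Tuple

HEADING_RE = re.compile(r"^(#{1,6})\s+(.*)$")

def split_on_headings(text: str) -> List[Tuple[str, str]]:
    """Split markdown into (section_title, body) pairs.

    Content before the first heading is labelled "(preamble)".
    """
    sections: List[Tuple[str, str]] = []
    title = "(preamble)"
    lines: List[str] = []
    for line in text.splitlines():
        match = HEADING_RE.match(line)
        if match:
            # Flush the section accumulated so far, if it has any content.
            if any(l.strip() for l in lines):
                sections.append((title, "\n".join(lines).strip()))
            title = match.group(2).strip()
            lines = []
        else:
            lines.append(line)
    if any(l.strip() for l in lines):
        sections.append((title, "\n".join(lines).strip()))
    return sections

doc = """Some intro text.

# Architecture

The system uses a microservices architecture.

## Components

There are three main components.
"""
print([title for title, _ in split_on_headings(doc)])
# ['(preamble)', 'Architecture', 'Components']
```

Note that `##` subsections become their own chunks rather than being nested under their parent heading, matching the flat chunk list shown above.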
### Plaintext processor

The `PlaintextProcessor` handles `.txt`, `.text`, `.log`, and `.csv` files. It splits text on paragraph boundaries (double newlines) and groups paragraphs into chunks with a configurable maximum size.

**Chunking parameters:**

| Parameter | Default | Description |
|-----------|---------|-------------|
| `max_chunk_size` | 2000 characters | Maximum size of each chunk |
| `overlap` | 200 characters | Number of characters from the end of one chunk to repeat at the start of the next |

The overlap ensures that entities and context spanning a paragraph boundary are not lost. Chunks are created by accumulating paragraphs until the next paragraph would exceed `max_chunk_size`, at which point the current chunk is flushed and a new one begins.

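The accumulate-and-flush strategy can be sketched like this. Treat it as an illustration of the parameters above rather than the exact `PlaintextProcessor` code -- details such as whitespace handling may differ:

```python
from typing import List

def chunk_paragraphs(text: str, max_chunk_size: int = 2000,
                     overlap: int = 200) -> List[str]:
    """Group paragraphs into chunks of at most max_chunk_size characters,
    carrying the tail of each chunk into the next for context."""
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    chunks: List[str] = []
    current = ""
    for para in paragraphs:
        candidate = f"{current}\n\n{para}" if current else para
        if current and len(candidate) > max_chunk_size:
            # Flush the full chunk, then seed the next one with its tail.
            chunks.append(current)
            current = current[-overlap:] + "\n\n" + para
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks
```

A single oversized paragraph still lands in one chunk under this sketch; only the boundaries between paragraphs are eligible split points.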
## The ingestion pipeline

Document ingestion follows this pipeline:

```
File on disk
     |
     v
Processor selection (by file extension)
     |
     v
Text extraction (PDF pages / Markdown sections / plaintext paragraphs)
     |
     v
DocumentChunk objects (text + metadata)
     |
     v
Source registration (provenance tracking in the KG)
     |
     v
KG content addition (LLM entity/relationship extraction per chunk)
     |
     v
Knowledge graph storage (.db + .json)
```

### Step 1: Processor selection

PlanOpticon maintains a registry of processors keyed by file extension. When you call `ingest_file()`, it looks up the appropriate processor using `get_processor(path)`. If no processor is registered for the file extension, a `ValueError` is raised.

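The registry described above can be modeled as a plain dict from extension to processor class. This is an illustrative sketch, not the actual `video_processor.processors.base` implementation:

```python
from pathlib import Path
from typing import Dict, Iterable, Optional, Type

# Illustrative registry; the real module keeps equivalent state internally.
_REGISTRY: Dict[str, Type] = {}

def register_processor(extensions: Iterable[str], processor_cls: Type) -> None:
    """Map each extension (case-insensitive) to a processor class."""
    for ext in extensions:
        _REGISTRY[ext.lower()] = processor_cls

def get_processor(path: Path) -> Optional[object]:
    """Instantiate the processor registered for the file's extension, if any."""
    cls = _REGISTRY.get(path.suffix.lower())
    return cls() if cls is not None else None

class DummyProcessor:
    """Stand-in processor used only to demonstrate registration."""
    pass

register_processor([".xyz"], DummyProcessor)
```

Under this model, `ingest_file()` simply calls `get_processor()` and raises `ValueError` when the lookup comes back empty.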
### Step 2: Text extraction

The selected processor reads the file and produces a list of `DocumentChunk` objects. Each chunk contains:

| Field | Type | Description |
|-------|------|-------------|
| `text` | `str` | The extracted text content |
| `source_file` | `str` | Path to the source file |
| `chunk_index` | `int` | Sequential index of this chunk within the file |
| `page` | `Optional[int]` | Page number (PDF only, 1-based) |
| `section` | `Optional[str]` | Section heading (Markdown only) |
| `metadata` | `Dict[str, Any]` | Additional metadata (e.g., extraction method) |

### Step 3: Source registration

Each ingested file is registered as a source in the knowledge graph with provenance metadata:

- `source_id`: A SHA-256 hash of the absolute file path (first 12 characters), unless you provide a custom ID
- `source_type`: Always `"document"`
- `title`: The file stem (filename without extension)
- `path`: The file path
- `mime_type`: Detected MIME type
- `ingested_at`: ISO-8601 timestamp
- `metadata`: Chunk count and file extension

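The default `source_id` derivation can be reproduced in a few lines -- a sketch assuming the hash is taken over the UTF-8 encoding of the resolved absolute path; the exact path normalization PlanOpticon applies is not shown here:

```python
import hashlib
from pathlib import Path

def default_source_id(path: Path) -> str:
    """First 12 hex characters of the SHA-256 of the absolute path."""
    absolute = str(path.resolve())
    return hashlib.sha256(absolute.encode("utf-8")).hexdigest()[:12]

source_id = default_source_id(Path("docs/spec.md"))
print(source_id)  # e.g. '3f1a9c0b2e7d' -- stable for the same absolute path
```

Because the ID is derived from the path, re-ingesting the same file maps to the same source record rather than creating a duplicate.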
### Step 4: Entity and relationship extraction

Each chunk's text is passed to `knowledge_graph.add_content()`, which uses the configured LLM provider to extract entities and relationships. The content source is tagged with the document name and either the page number or section name:

- `document:report.pdf:page:3`
- `document:spec.md:section:Architecture`
- `document:notes.txt` (no page or section)

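The tag format follows directly from the examples above and can be expressed as a small helper (the function name is illustrative):

```python
from typing import Optional

def content_source_tag(filename: str, page: Optional[int] = None,
                       section: Optional[str] = None) -> str:
    """Build the provenance tag attached to each extracted chunk."""
    tag = f"document:{filename}"
    if page is not None:
        tag += f":page:{page}"      # PDF chunks carry a page number
    elif section is not None:
        tag += f":section:{section}"  # Markdown chunks carry a heading
    return tag

print(content_source_tag("report.pdf", page=3))
# document:report.pdf:page:3
```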
### Step 5: Storage

The knowledge graph is saved in both `.db` (SQLite-backed FalkorDB) and `.json` formats.

## Combining with video analysis

A common workflow is to analyze a video recording and then ingest related documents into the same knowledge graph:

```bash
# Step 1: Analyze the meeting recording
planopticon analyze meeting-recording.mp4 -o ./project-kg

# Step 2: Ingest the meeting agenda
planopticon ingest agenda.md --db-path ./project-kg/knowledge_graph.db

# Step 3: Ingest the project spec
planopticon ingest project-spec.pdf --db-path ./project-kg/knowledge_graph.db

# Step 4: Ingest a whole docs folder
planopticon ingest ./reference-docs/ --db-path ./project-kg/knowledge_graph.db

# Step 5: Query the combined graph
planopticon query --db-path ./project-kg/knowledge_graph.db
```

The resulting knowledge graph contains entities and relationships from all sources -- video transcripts, meeting agendas, specs, and reference documents -- with full provenance tracking so you can trace any entity back to its source.

## Python API

### Ingesting a single file

```python
from pathlib import Path
from video_processor.integrators.knowledge_graph import KnowledgeGraph
from video_processor.processors.ingest import ingest_file

kg = KnowledgeGraph(db_path=Path("knowledge_graph.db"))
chunk_count = ingest_file(Path("document.pdf"), kg)
print(f"Processed {chunk_count} chunks")

kg.save(Path("knowledge_graph.db"))
```

### Ingesting a directory

```python
from pathlib import Path
from video_processor.integrators.knowledge_graph import KnowledgeGraph
from video_processor.processors.ingest import ingest_directory

kg = KnowledgeGraph(db_path=Path("knowledge_graph.db"))
results = ingest_directory(
    Path("./docs"),
    kg,
    recursive=True,
    extensions=[".md", ".pdf"],  # Optional: filter by extension
)

for filepath, chunks in results.items():
    print(f"  {filepath}: {chunks} chunks")

kg.save(Path("knowledge_graph.db"))
```

### Listing supported extensions

```python
from video_processor.processors.base import list_supported_extensions

extensions = list_supported_extensions()
print(extensions)
# ['.csv', '.log', '.markdown', '.md', '.pdf', '.text', '.txt']
```

### Working with processors directly

```python
from pathlib import Path
from video_processor.processors.base import get_processor

processor = get_processor(Path("report.pdf"))
if processor:
    chunks = processor.process(Path("report.pdf"))
    for chunk in chunks:
        print(f"Page {chunk.page}: {chunk.text[:100]}...")
```

## Extending with custom processors

To add support for a new file format, implement the `DocumentProcessor` abstract class and register it:

```python
from pathlib import Path
from typing import List
from video_processor.processors.base import (
    DocumentChunk,
    DocumentProcessor,
    register_processor,
)


class HtmlProcessor(DocumentProcessor):
    supported_extensions = [".html", ".htm"]

    def can_process(self, path: Path) -> bool:
        return path.suffix.lower() in self.supported_extensions

    def process(self, path: Path) -> List[DocumentChunk]:
        from bs4 import BeautifulSoup

        soup = BeautifulSoup(path.read_text(), "html.parser")
        text = soup.get_text(separator="\n")
        return [
            DocumentChunk(
                text=text,
                source_file=str(path),
                chunk_index=0,
            )
        ]


register_processor(HtmlProcessor.supported_extensions, HtmlProcessor)
```

After registration, `planopticon ingest` will automatically handle `.html` and `.htm` files.

## Companion REPL

Inside the interactive companion REPL, you can ingest files using the `/ingest` command:

```
> /ingest ./meeting-notes.md
Ingested meeting-notes.md: 5 chunks
```

This adds content to the currently loaded knowledge graph.

## Common workflows

### Build a project knowledge base from scratch

```bash
# Ingest all project docs
planopticon ingest ./project-docs/ -o ./knowledge-base

# Query what was captured
planopticon query --db-path ./knowledge-base/knowledge_graph.db

# Export as an Obsidian vault
planopticon export obsidian ./knowledge-base/knowledge_graph.db -o ./vault
```

### Incrementally build a knowledge graph

```bash
# Start with initial docs
planopticon ingest ./sprint-1-docs/ -o ./kg

# Add more docs over time
planopticon ingest ./sprint-2-docs/ --db-path ./kg/knowledge_graph.db
planopticon ingest ./sprint-3-docs/ --db-path ./kg/knowledge_graph.db

# The graph grows with each ingestion
planopticon query --db-path ./kg/knowledge_graph.db stats
```

### Ingest from Google Workspace or Microsoft 365

PlanOpticon provides integrated commands that fetch cloud documents and ingest them in one step:

```bash
# Google Workspace
planopticon gws ingest --folder-id FOLDER_ID -o ./results

# Microsoft 365 / SharePoint
planopticon m365 ingest --web-url https://contoso.sharepoint.com/sites/proj \
    --folder-url /sites/proj/Shared\ Documents
```

These commands handle authentication, document download, text extraction, and knowledge graph creation automatically.
ADDED docs/guide/export.md

# Export

PlanOpticon provides multiple ways to export knowledge graph data into formats suitable for documentation, note-taking, collaboration, and interchange. All export commands work offline from a `knowledge_graph.db` file -- no API key is needed for template-based exports.

## Overview of export options

| Format | Command | API Key | Description |
|--------|---------|---------|-------------|
| Markdown documents | `planopticon export markdown` | No | 7 document types: summary, meeting notes, glossary, and more |
| Obsidian vault | `planopticon export obsidian` | No | YAML frontmatter, `[[wiki-links]]`, tag pages, Map of Content |
| Notion-compatible | `planopticon export notion` | No | Callout blocks, CSV database for bulk import |
| PlanOpticonExchange JSON | `planopticon export exchange` | No | Canonical interchange format for merging and sharing |
| GitHub wiki | `planopticon wiki generate` | No | Home, Sidebar, entity pages, type indexes |
| GitHub wiki push | `planopticon wiki push` | Git auth | Push generated wiki to a GitHub repo |

## Markdown document generator

The markdown exporter produces structured documents from knowledge graph data using pure template-based generation. No LLM calls are made -- the output is deterministic and based entirely on the entities and relationships in the graph.

### CLI usage

```
planopticon export markdown DB_PATH [OPTIONS]
```

**Arguments:**

| Argument | Description |
|----------|-------------|
| `DB_PATH` | Path to a `knowledge_graph.db` file |

**Options:**

| Option | Short | Default | Description |
|--------|-------|---------|-------------|
| `--output` | `-o` | `./export` | Output directory |
| `--type` | | `all` | Document types to generate (repeatable). Choices: `summary`, `meeting-notes`, `glossary`, `relationship-map`, `status-report`, `entity-index`, `csv`, `all` |

**Examples:**

```bash
# Generate all document types
planopticon export markdown knowledge_graph.db

# Generate only summary and glossary
planopticon export markdown kg.db -o ./docs --type summary --type glossary

# Generate meeting notes and CSV
planopticon export markdown kg.db --type meeting-notes --type csv
```

### Document types

#### summary (Executive Summary)

A high-level overview of the knowledge graph. Contains:

- Total entity and relationship counts
- Entity breakdown by type (table with counts and example names)
- Key entities ranked by number of connections (top 10)
- Relationship type breakdown with counts

This is useful for getting a quick overview of what a knowledge base contains.

#### meeting-notes (Meeting Notes)

Formats knowledge graph data as structured meeting notes. Organizes entities into planning-relevant categories:

- **Discussion Topics**: Entities of type `concept`, `technology`, or `topic` with their descriptions
- **Participants**: Entities of type `person`
- **Decisions & Constraints**: Entities of type `decision` or `constraint`
- **Action Items**: Entities of type `goal`, `feature`, or `milestone`, shown as checkboxes. If an entity has an `assigned_to` or `owned_by` relationship, the owner is shown as `@name`
- **Open Questions / Loose Ends**: Entities with one or fewer relationships (excluding people), indicating topics that may need follow-up

Includes a generation timestamp.

#### glossary (Glossary)

An alphabetically sorted dictionary of all entities in the knowledge graph. Each entry shows:

- Entity name (bold)
- Entity type (italic, in parentheses)
- First description

Format:

```
**Entity Name** *(type)*
: Description text here.
```

#### relationship-map (Relationship Map)

A comprehensive view of all relationships in the graph, organized by relationship type. Each type gets its own section with a table of source-target pairs.

Also includes a **Mermaid diagram** of the top 20 most-connected entities, rendered as a `graph LR` flowchart with labeled edges. This diagram can be rendered natively in GitHub, GitLab, Obsidian, and many other Markdown viewers.

#### status-report (Status Report)

A project-oriented status report that highlights planning entities:

- **Overview**: Counts of entities, relationships, features, milestones, requirements, and risks/constraints
- **Milestones**: Entities of type `milestone` with descriptions
- **Features**: Table of entities of type `feature` with descriptions (truncated to 60 characters)
- **Risks & Constraints**: Entities of type `risk` or `constraint`

Includes a generation timestamp.

#### entity-index (Entity Index)

A master index of all entities grouped by type. Each type section lists entities alphabetically with their first description. Shows total entity count and number of types.

#### csv (CSV Export)

A CSV file suitable for spreadsheet import. Columns:

| Column | Description |
|--------|-------------|
| Name | Entity name |
| Type | Entity type |
| Description | First description |
| Related To | Semicolon-separated list of entities this entity has outgoing relationships to |
| Source | First occurrence source |
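
Once generated, the CSV can be post-processed with standard tooling. A minimal sketch using only the Python standard library; the column names match the table above, while the sample rows are hypothetical:

```python
import csv
import io

# Hypothetical sample rows matching the documented columns
sample = """Name,Type,Description,Related To,Source
Python,technology,A high-level programming language,FastAPI; Django,meeting.mp4
Alice,person,Backend lead,,meeting.mp4
"""

rows = list(csv.DictReader(io.StringIO(sample)))

# Split the semicolon-separated "Related To" column into a proper list
for row in rows:
    row["Related To"] = [t.strip() for t in row["Related To"].split(";") if t.strip()]

tech = [r["Name"] for r in rows if r["Type"] == "technology"]
print(tech)                   # ['Python']
print(rows[0]["Related To"])  # ['FastAPI', 'Django']
```

In practice you would pass the real `export/csv.csv` path to `open()` instead of the in-memory sample.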

### Entity briefs

In addition to the selected document types, the `generate_all()` function automatically creates individual entity brief pages in an `entities/` subdirectory. Each brief contains:

- Entity name and type
- Summary (all descriptions)
- Outgoing relationships (table of target entities and relationship types)
- Incoming relationships (table of source entities and relationship types)
- Source occurrences with timestamps and context text

## Obsidian vault export

The Obsidian exporter creates a complete vault structure with YAML frontmatter, `[[wiki-links]]` for entity cross-references, and Obsidian-compatible metadata.

### CLI usage

```
planopticon export obsidian DB_PATH [OPTIONS]
```

**Options:**

| Option | Short | Default | Description |
|--------|-------|---------|-------------|
| `--output` | `-o` | `./obsidian-vault` | Output vault directory |

**Example:**

```bash
planopticon export obsidian knowledge_graph.db -o ./my-vault
```

### Generated structure

```
my-vault/
  _Index.md           # Map of Content (MOC)
  Tag - Person.md     # One tag page per entity type
  Tag - Technology.md
  Tag - Concept.md
  Alice.md            # Individual entity notes
  Python.md
  Microservices.md
  ...
```

### Entity notes

Each entity gets a dedicated note with:

**YAML frontmatter:**

```yaml
---
type: technology
tags:
  - technology
aliases:
  - Python 3
  - CPython
date: 2026-03-07
---
```

The frontmatter includes:

- `type`: The entity type
- `tags`: Entity type as a tag (for Obsidian tag-based filtering)
- `aliases`: Any known aliases for the entity (if available)
- `date`: The export date

**Body content:**

- `# Entity Name` heading
- Description paragraphs
- `## Relationships` section with `[[wiki-links]]` to related entities:
  ```
  - **uses**: [[FastAPI]]
  - **depends_on**: [[PostgreSQL]]
  ```
- `## Referenced by` section with incoming relationships:
  ```
  - **implements** from [[Backend Service]]
  ```

### Index note (Map of Content)

The `_Index.md` file serves as a Map of Content (MOC), listing all entities grouped by type with `[[wiki-links]]`:

```markdown
---
type: index
tags:
  - MOC
date: 2026-03-07
---

# Index

**47** entities | **31** relationships

## Concept

- [[Microservices]]
- [[REST API]]

## Person

- [[Alice]]
- [[Bob]]
```

### Tag pages

One tag page is created per entity type (e.g., `Tag - Person.md`, `Tag - Technology.md`). Each page has frontmatter tagging it with the entity type and lists all entities of that type with descriptions.

## Notion-compatible markdown export

The Notion exporter creates Markdown files with Notion-style callout blocks and a CSV database file for bulk import into Notion.

### CLI usage

```
planopticon export notion DB_PATH [OPTIONS]
```

**Options:**

| Option | Short | Default | Description |
|--------|-------|---------|-------------|
| `--output` | `-o` | `./notion-export` | Output directory |

**Example:**

```bash
planopticon export notion knowledge_graph.db -o ./notion-export
```

### Generated structure

```
notion-export/
  Overview.md            # Knowledge graph overview page
  entities_database.csv  # CSV for Notion database import
  Alice.md               # Individual entity pages
  Python.md
  ...
```

### Entity pages

Each entity page uses Notion-style callout syntax for metadata:

```markdown
# Python

> :computer: **Type:** technology

## Description

A high-level programming language...

> :memo: **Properties**
> - **version:** 3.11
> - **paradigm:** multi-paradigm

## Relationships

| Target | Relationship |
|--------|-------------|
| FastAPI | uses |
| Django | framework_for |

## Referenced by

| Source | Relationship |
|--------|-------------|
| Backend Service | implements |
```

### CSV database

The `entities_database.csv` file contains all entities in a format suitable for Notion's CSV database import:

| Column | Description |
|--------|-------------|
| Name | Entity name |
| Type | Entity type |
| Description | First two descriptions, semicolon-separated |
| Related To | Comma-separated list of outgoing relationship targets |

### Overview page

The `Overview.md` page provides a summary with entity counts and a grouped listing of all entities by type.

## GitHub wiki generator

The wiki generator creates a complete set of GitHub wiki pages from a knowledge graph, including navigation (Home page and Sidebar) and cross-linked entity pages.

### CLI usage

**Generate wiki pages locally:**

```
planopticon wiki generate DB_PATH [OPTIONS]
```

| Option | Short | Default | Description |
|--------|-------|---------|-------------|
| `--output` | `-o` | `./wiki` | Output directory for wiki pages |
| `--title` | | `Knowledge Base` | Wiki title (shown on Home page) |

**Push wiki pages to GitHub:**

```
planopticon wiki push WIKI_DIR REPO [OPTIONS]
```

| Argument | Description |
|----------|-------------|
| `WIKI_DIR` | Path to the directory containing generated wiki `.md` files |
| `REPO` | GitHub repository in `owner/repo` format |

| Option | Short | Default | Description |
|--------|-------|---------|-------------|
| `--message` | `-m` | `Update wiki` | Git commit message |

**Examples:**

```bash
# Generate wiki pages
planopticon wiki generate knowledge_graph.db -o ./wiki

# Generate with a custom title
planopticon wiki generate kg.db -o ./wiki --title "Project Wiki"

# Push to GitHub
planopticon wiki push ./wiki ConflictHQ/PlanOpticon

# Push with a custom commit message
planopticon wiki push ./wiki owner/repo -m "Add entity pages"
```

### Generated pages

The wiki generator creates the following pages:

| Page | Description |
|------|-------------|
| `Home.md` | Main wiki page with entity counts, type links, and artifact links |
| `_Sidebar.md` | Navigation sidebar with links to Home, entity type indexes, and artifacts |
| `{Type}.md` | One index page per entity type with a table of entities and descriptions |
| `{Entity}.md` | Individual entity pages with type, descriptions, relationships, and sources |

### Entity pages

Each entity page contains:

- Entity name as the top heading
- **Type** label
- **Descriptions** section (bullet list)
- **Relationships** table with wiki-style links to target entities
- **Referenced By** table with links to source entities
- **Sources** section listing occurrences with timestamps and context

All entity and type names are cross-linked using GitHub wiki-compatible links (`[Name](Sanitized-Name)`).
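
The exact sanitization is internal to PlanOpticon, but the link convention can be illustrated with a hypothetical helper (`wiki_link` below is an assumption for illustration, not the shipped implementation): whitespace in page names becomes hyphens, and other punctuation is dropped.

```python
import re

def wiki_link(name: str) -> str:
    """Hypothetical sketch of a GitHub-wiki-style [Name](Sanitized-Name) link."""
    # Replace whitespace runs with single hyphens
    sanitized = re.sub(r"\s+", "-", name.strip())
    # Drop characters other than letters, digits, hyphens, and underscores
    sanitized = re.sub(r"[^A-Za-z0-9\-_]", "", sanitized)
    return f"[{name}]({sanitized})"

print(wiki_link("Backend Service"))  # [Backend Service](Backend-Service)
```

The display text keeps the original entity name, so only the link target is sanitized.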

### Push behavior

The `wiki push` command:

1. Clones the existing GitHub wiki repository (`https://github.com/{repo}.wiki.git`).
2. If the wiki does not exist yet, initializes a new Git repository.
3. Copies all `.md` files from the wiki directory into the clone.
4. Commits the changes.
5. Pushes to the remote (tries `master` first, then `main`).

This requires Git authentication with push access to the repository. The wiki must be enabled in the GitHub repository settings.

## PlanOpticonExchange JSON format

The PlanOpticonExchange is the canonical interchange format for PlanOpticon data. Every command produces it, and every export adapter can consume it. It provides a structured, versioned JSON representation of a complete knowledge graph with project metadata.

### CLI usage

```
planopticon export exchange DB_PATH [OPTIONS]
```

| Option | Short | Default | Description |
|--------|-------|---------|-------------|
| `--output` | `-o` | `./exchange.json` | Output JSON file path |
| `--name` | | `Untitled` | Project name for the exchange payload |
| `--description` | | (empty) | Project description |

**Examples:**

```bash
# Basic export
planopticon export exchange knowledge_graph.db

# With project metadata
planopticon export exchange kg.db -o exchange.json --name "My Project" --description "Sprint 3 analysis"
```

### Schema

The exchange format has the following top-level structure:

```json
{
  "version": "1.0",
  "project": {
    "name": "My Project",
    "description": "Sprint 3 analysis",
    "created_at": "2026-03-07T10:30:00.000000",
    "updated_at": "2026-03-07T10:30:00.000000",
    "tags": ["sprint-3", "backend"]
  },
  "entities": [
    {
      "name": "Python",
      "type": "technology",
      "descriptions": ["A high-level programming language"],
      "source": "transcript",
      "occurrences": [
        {
          "source": "meeting.mp4",
          "timestamp": "00:05:23",
          "text": "We should use Python for the backend"
        }
      ]
    }
  ],
  "relationships": [
    {
      "source": "Python",
      "target": "Backend Service",
      "type": "used_by",
      "content_source": "transcript:meeting.mp4",
      "timestamp": 323.0
    }
  ],
  "artifacts": [
    {
      "name": "Project Plan",
      "content": "# Project Plan\n\n...",
      "artifact_type": "project_plan",
      "format": "markdown",
      "metadata": {}
    }
  ],
  "sources": [
    {
      "source_id": "abc123",
      "source_type": "video",
      "title": "Sprint Planning Meeting",
      "path": "/recordings/meeting.mp4",
      "url": null,
      "mime_type": "video/mp4",
      "ingested_at": "2026-03-07T10:00:00.000000",
      "metadata": {}
    }
  ]
}
```

**Top-level fields:**

| Field | Type | Description |
|-------|------|-------------|
| `version` | `str` | Schema version (currently `"1.0"`) |
| `project` | `ProjectMeta` | Project-level metadata |
| `entities` | `List[Entity]` | Knowledge graph entities |
| `relationships` | `List[Relationship]` | Knowledge graph relationships |
| `artifacts` | `List[ArtifactMeta]` | Generated artifacts (plans, PRDs, etc.) |
| `sources` | `List[SourceRecord]` | Content source provenance records |

### Merging exchange files

The exchange format supports merging, with automatic deduplication:

- Entities are deduplicated by name
- Relationships are deduplicated by the tuple `(source, target, type)`
- Artifacts are deduplicated by name
- Sources are deduplicated by `source_id`

```python
from video_processor.exchange import PlanOpticonExchange

# Load two exchange files
ex1 = PlanOpticonExchange.from_file("sprint-1.json")
ex2 = PlanOpticonExchange.from_file("sprint-2.json")

# Merge ex2 into ex1
ex1.merge(ex2)

# Save the combined result
ex1.to_file("combined.json")
```

The `project.updated_at` timestamp is updated automatically on merge.
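
For intuition, the relationship deduplication rule can be sketched in plain Python. This is a standalone illustration of the documented `(source, target, type)` rule, not the library's implementation:

```python
def merge_relationships(existing: list[dict], incoming: list[dict]) -> list[dict]:
    """Merge two relationship lists, deduplicating by (source, target, type)."""
    seen = {(r["source"], r["target"], r["type"]) for r in existing}
    merged = list(existing)
    for rel in incoming:
        key = (rel["source"], rel["target"], rel["type"])
        if key not in seen:
            seen.add(key)
            merged.append(rel)
    return merged

a = [{"source": "Python", "target": "Backend Service", "type": "used_by"}]
b = [
    {"source": "Python", "target": "Backend Service", "type": "used_by"},  # duplicate, dropped
    {"source": "Alice", "target": "Backend Service", "type": "owns"},     # new, kept
]
merged = merge_relationships(a, b)
print(len(merged))  # 2
```

Entities, artifacts, and sources follow the same pattern with their respective keys (name, name, and `source_id`).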

### Python API

**Create from a knowledge graph:**

```python
from video_processor.exchange import PlanOpticonExchange
from video_processor.integrators.knowledge_graph import KnowledgeGraph

kg = KnowledgeGraph(db_path="knowledge_graph.db")
kg_data = kg.to_dict()

exchange = PlanOpticonExchange.from_knowledge_graph(
    kg_data,
    project_name="My Project",
    project_description="Analysis of sprint planning meetings",
    tags=["planning", "backend"],
)
```

**Save and load:**

```python
# Save to file
exchange.to_file("exchange.json")

# Load from file
loaded = PlanOpticonExchange.from_file("exchange.json")
```

**Get JSON Schema:**

```python
schema = PlanOpticonExchange.json_schema()
```

This returns the full JSON Schema for validation and documentation purposes.

## Python API for all exporters

### Markdown document generation

```python
from pathlib import Path
from video_processor.exporters.markdown import (
    generate_all,
    generate_executive_summary,
    generate_meeting_notes,
    generate_glossary,
    generate_relationship_map,
    generate_status_report,
    generate_entity_index,
    generate_csv_export,
    generate_entity_brief,
    DOCUMENT_TYPES,
)
from video_processor.integrators.knowledge_graph import KnowledgeGraph

kg = KnowledgeGraph(db_path=Path("knowledge_graph.db"))
kg_data = kg.to_dict()

# Generate all document types at once
created_files = generate_all(kg_data, Path("./export"))

# Generate specific document types
created_files = generate_all(
    kg_data,
    Path("./export"),
    doc_types=["summary", "glossary", "csv"],
)

# Generate individual documents (returns markdown string)
summary = generate_executive_summary(kg_data)
notes = generate_meeting_notes(kg_data, title="Sprint Planning")
glossary = generate_glossary(kg_data)
rel_map = generate_relationship_map(kg_data)
status = generate_status_report(kg_data, title="Q1 Status")
index = generate_entity_index(kg_data)
csv_text = generate_csv_export(kg_data)

# Generate a brief for a single entity
entity = kg_data["nodes"][0]
relationships = kg_data["relationships"]
brief = generate_entity_brief(entity, relationships)
```

### Obsidian export

```python
from pathlib import Path
from video_processor.agent.skills.notes_export import export_to_obsidian
from video_processor.integrators.knowledge_graph import KnowledgeGraph

kg = KnowledgeGraph(db_path=Path("knowledge_graph.db"))
kg_data = kg.to_dict()

created_files = export_to_obsidian(kg_data, Path("./obsidian-vault"))
print(f"Created {len(created_files)} files")
```

### Notion export

```python
from pathlib import Path
from video_processor.agent.skills.notes_export import export_to_notion_md
from video_processor.integrators.knowledge_graph import KnowledgeGraph

kg = KnowledgeGraph(db_path=Path("knowledge_graph.db"))
kg_data = kg.to_dict()

created_files = export_to_notion_md(kg_data, Path("./notion-export"))
```

### Wiki generation

```python
from pathlib import Path
from video_processor.agent.skills.wiki_generator import (
    generate_wiki,
    write_wiki,
    push_wiki,
)
from video_processor.integrators.knowledge_graph import KnowledgeGraph

kg = KnowledgeGraph(db_path=Path("knowledge_graph.db"))
kg_data = kg.to_dict()

# Generate pages as a dict of {filename: content}
pages = generate_wiki(kg_data, title="Project Wiki")

# Write to disk
written = write_wiki(pages, Path("./wiki"))

# Push to GitHub (requires git auth)
success = push_wiki(Path("./wiki"), "owner/repo", message="Update wiki")
```

## Companion REPL

Inside the interactive companion REPL, use the `/export` command:

```
> /export markdown
Export 'markdown' requested. Use the CLI command:
  planopticon export markdown ./knowledge_graph.db

> /export obsidian
Export 'obsidian' requested. Use the CLI command:
  planopticon export obsidian ./knowledge_graph.db
```

The REPL provides guidance on the CLI command to run; the actual export is performed via the CLI.

## Common workflows

### Analyze videos and export to Obsidian

```bash
# Analyze meeting recordings
planopticon analyze meeting-1.mp4 -o ./results
planopticon analyze meeting-2.mp4 --db-path ./results/knowledge_graph.db

# Ingest supplementary docs
planopticon ingest ./specs/ --db-path ./results/knowledge_graph.db

# Export to Obsidian vault
planopticon export obsidian ./results/knowledge_graph.db -o ~/Obsidian/ProjectVault

# Open in Obsidian and explore the graph view
```

### Generate project documentation

```bash
# Generate all markdown documents
planopticon export markdown knowledge_graph.db -o ./docs

# The output includes:
#   docs/summary.md           - Executive summary
#   docs/meeting-notes.md     - Meeting notes format
#   docs/glossary.md          - Entity glossary
#   docs/relationship-map.md  - Relationships + Mermaid diagram
#   docs/status-report.md     - Project status report
#   docs/entity-index.md      - Master entity index
#   docs/csv.csv              - Spreadsheet-ready CSV
#   docs/entities/            - Individual entity briefs
```

### Publish a GitHub wiki

```bash
# Generate wiki pages
planopticon wiki generate knowledge_graph.db -o ./wiki --title "Project Knowledge Base"

# Review locally, then push
planopticon wiki push ./wiki ConflictHQ/my-project -m "Initial wiki from meeting analysis"
```

### Share data between projects

```bash
# Export from project A
planopticon export exchange ./project-a/knowledge_graph.db \
  -o project-a.json --name "Project A"

# Export from project B
planopticon export exchange ./project-b/knowledge_graph.db \
  -o project-b.json --name "Project B"

# Merge in Python
python -c "
from video_processor.exchange import PlanOpticonExchange
a = PlanOpticonExchange.from_file('project-a.json')
b = PlanOpticonExchange.from_file('project-b.json')
a.merge(b)
a.to_file('combined.json')
print(f'Combined: {len(a.entities)} entities, {len(a.relationships)} relationships')
"
```

### Export for spreadsheet analysis

```bash
# Generate just the CSV
planopticon export markdown knowledge_graph.db --type csv -o ./export

# The file export/csv.csv can be opened in Excel, Google Sheets, etc.
```

Alternatively, the Notion export includes an `entities_database.csv` that can be imported into any spreadsheet tool or Notion database.
--- a/docs/guide/export.md
+++ b/docs/guide/export.md
@@ -0,0 +1,756 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
--- a/docs/guide/export.md
+++ b/docs/guide/export.md
@@ -0,0 +1,756 @@
1 # Export
2
3 PlanOpticon provides multiple ways to export knowledge graph data into formats suitable for documentation, note-taking, collaboration, and interchange. All export commands work offline from a `knowledge_graph.db` file -- no API key is needed for template-based exports.
4
5 ## Overview of export options
6
7 | Format | Command | API Key | Description |
8 |--------|---------|---------|-------------|
9 | Markdown documents | `planopticon export markdown` | No | 7 document types: summary, meeting notes, glossary, and more |
10 | Obsidian vault | `planopticon export obsidian` | No | YAML frontmatter, `[[wiki-links]]`, tag pages, Map of Content |
11 | Notion-compatible | `planopticon export notion` | No | Callout blocks, CSV database for bulk import |
12 | PlanOpticonExchange JSON | `planopticon export exchange` | No | Canonical interchange format for merging and sharing |
13 | GitHub wiki | `planopticon wiki generate` | No | Home, Sidebar, entity pages, type indexes |
14 | GitHub wiki push | `planopticon wiki push` | Git auth | Push generated wiki to a GitHub repo |
15
16 ## Markdown document generator
17
18 The markdown exporter produces structured documents from knowledge graph data using pure template-based generation. No LLM calls are made -- the output is deterministic and based entirely on the entities and relationships in the graph.
19
20 ### CLI usage
21
22 ```
23 planopticon export markdown DB_PATH [OPTIONS]
24 ```
25
26 **Arguments:**
27
28 | Argument | Description |
29 |----------|-------------|
30 | `DB_PATH` | Path to a `knowledge_graph.db` file |
31
32 **Options:**
33
34 | Option | Short | Default | Description |
35 |--------|-------|---------|-------------|
36 | `--output` | `-o` | `./export` | Output directory |
37 | `--type` | | `all` | Document types to generate (repeatable). Choices: `summary`, `meeting-notes`, `glossary`, `relationship-map`, `status-report`, `entity-index`, `csv`, `all` |
38
39 **Examples:**
40
41 ```bash
42 # Generate all document types
43 planopticon export markdown knowledge_graph.db
44
45 # Generate only summary and glossary
46 planopticon export markdown kg.db -o ./docs --type summary --type glossary
47
48 # Generate meeting notes and CSV
49 planopticon export markdown kg.db --type meeting-notes --type csv
50 ```
51
52 ### Document types
53
54 #### summary (Executive Summary)
55
56 A high-level overview of the knowledge graph. Contains:
57
58 - Total entity and relationship counts
59 - Entity breakdown by type (table with counts and example names)
60 - Key entities ranked by number of connections (top 10)
61 - Relationship type breakdown with counts
62
63 This is useful for getting a quick overview of what a knowledge base contains.
64
65 #### meeting-notes (Meeting Notes)
66
67 Formats knowledge graph data as structured meeting notes. Organizes entities into planning-relevant categories:
68
69 - **Discussion Topics**: Entities of type `concept`, `technology`, or `topic` with their descriptions
70 - **Participants**: Entities of type `person`
71 - **Decisions & Constraints**: Entities of type `decision` or `constraint`
72 - **Action Items**: Entities of type `goal`, `feature`, or `milestone`, shown as checkboxes. If an entity has an `assigned_to` or `owned_by` relationship, the owner is shown as `@name`
73 - **Open Questions / Loose Ends**: Entities with one or fewer relationships (excluding people), indicating topics that may need follow-up
74
75 Includes a generation timestamp.
76
77 #### glossary (Glossary)
78
79 An alphabetically sorted dictionary of all entities in the knowledge graph. Each entry shows:
80
81 - Entity name (bold)
82 - Entity type (italic, in parentheses)
83 - First description
84
85 Format:
86
87 ```
88 **Entity Name** *(type)*
89 : Description text here.
90 ```
91
92 #### relationship-map (Relationship Map)
93
94 A comprehensive view of all relationships in the graph, organized by relationship type. Each type gets its own section with a table of source-target pairs.
95
96 Also includes a **Mermaid diagram** of the top 20 most-connected entities, rendered as a `graph LR` flowchart with labeled edges. This diagram can be rendered natively in GitHub, GitLab, Obsidian, and many other Markdown viewers.
97
98 #### status-report (Status Report)
99
100 A project-oriented status report that highlights planning entities:
101
102 - **Overview**: Counts of entities, relationships, features, milestones, requirements, and risks/constraints
103 - **Milestones**: Entities of type `milestone` with descriptions
104 - **Features**: Table of entities of type `feature` with descriptions (truncated to 60 characters)
105 - **Risks & Constraints**: Entities of type `risk` or `constraint`
106
107 Includes a generation timestamp.
108
109 #### entity-index (Entity Index)
110
111 A master index of all entities grouped by type. Each type section lists entities alphabetically with their first description. Shows total entity count and number of types.
112
113 #### csv (CSV Export)
114
115 A CSV file suitable for spreadsheet import. Columns:
116
117 | Column | Description |
118 |--------|-------------|
119 | Name | Entity name |
120 | Type | Entity type |
121 | Description | First description |
122 | Related To | Semicolon-separated list of entities this entity has outgoing relationships to |
123 | Source | First occurrence source |
124
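Consuming the export downstream is straightforward with the standard `csv` module. The row values below are illustrative:

```python
import csv
import io

# An illustrative row in the exported column layout
sample = (
    "Name,Type,Description,Related To,Source\n"
    "Python,technology,A high-level programming language,FastAPI;Django,meeting.mp4\n"
)

rows = list(csv.DictReader(io.StringIO(sample)))
# "Related To" is semicolon-separated, so split it for per-target processing
targets = rows[0]["Related To"].split(";")
print(targets)  # ['FastAPI', 'Django']
```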
125 ### Entity briefs
126
127 In addition to the selected document types, the `generate_all()` function automatically creates individual entity brief pages in an `entities/` subdirectory. Each brief contains:
128
129 - Entity name and type
130 - Summary (all descriptions)
131 - Outgoing relationships (table of target entities and relationship types)
132 - Incoming relationships (table of source entities and relationship types)
133 - Source occurrences with timestamps and context text
134
135 ## Obsidian vault export
136
137 The Obsidian exporter creates a complete vault structure with YAML frontmatter, `[[wiki-links]]` for entity cross-references, and Obsidian-compatible metadata.
138
139 ### CLI usage
140
141 ```
142 planopticon export obsidian DB_PATH [OPTIONS]
143 ```
144
145 **Options:**
146
147 | Option | Short | Default | Description |
148 |--------|-------|---------|-------------|
149 | `--output` | `-o` | `./obsidian-vault` | Output vault directory |
150
151 **Example:**
152
153 ```bash
154 planopticon export obsidian knowledge_graph.db -o ./my-vault
155 ```
156
157 ### Generated structure
158
159 ```
160 my-vault/
161 _Index.md # Map of Content (MOC)
162 Tag - Person.md # One tag page per entity type
163 Tag - Technology.md
164 Tag - Concept.md
165 Alice.md # Individual entity notes
166 Python.md
167 Microservices.md
168 ...
169 ```
170
171 ### Entity notes
172
173 Each entity gets a dedicated note with:
174
175 **YAML frontmatter:**
176
177 ```yaml
178 ---
179 type: technology
180 tags:
181 - technology
182 aliases:
183 - Python 3
184 - CPython
185 date: 2026-03-07
186 ---
187 ```
188
189 The frontmatter includes:
190
191 - `type`: The entity type
192 - `tags`: Entity type as a tag (for Obsidian tag-based filtering)
193 - `aliases`: Any known aliases for the entity (if available)
194 - `date`: The export date
195
196 **Body content:**
197
198 - `# Entity Name` heading
199 - Description paragraphs
200 - `## Relationships` section with `[[wiki-links]]` to related entities:
201 ```
202 - **uses**: [[FastAPI]]
203 - **depends_on**: [[PostgreSQL]]
204 ```
205 - `## Referenced by` section with incoming relationships:
206 ```
207 - **implements** from [[Backend Service]]
208 ```
209
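The link formatting itself can be sketched in a few lines (hypothetical helper names, not the library's API):

```python
def wikilink(name):
    """Wrap an entity name as an Obsidian wiki-link."""
    return f"[[{name}]]"

def outgoing_lines(relationships, entity):
    """Render '- **type**: [[Target]]' bullets for an entity's outgoing edges."""
    return [
        f"- **{r['type']}**: {wikilink(r['target'])}"
        for r in relationships
        if r["source"] == entity
    ]

rels = [{"source": "Python", "target": "FastAPI", "type": "uses"}]
print(outgoing_lines(rels, "Python"))  # ['- **uses**: [[FastAPI]]']
```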
210 ### Index note (Map of Content)
211
212 The `_Index.md` file serves as a Map of Content (MOC), listing all entities grouped by type with `[[wiki-links]]`:
213
214 ```markdown
215 ---
216 type: index
217 tags:
218 - MOC
219 date: 2026-03-07
220 ---
221
222 # Index
223
224 **47** entities | **31** relationships
225
226 ## Concept
227
228 - [[Microservices]]
229 - [[REST API]]
230
231 ## Person
232
233 - [[Alice]]
234 - [[Bob]]
235 ```
236
237 ### Tag pages
238
239 One tag page is created per entity type (e.g., `Tag - Person.md`, `Tag - Technology.md`). Each page has frontmatter tagging it with the entity type and lists all entities of that type with descriptions.
240
241 ## Notion-compatible markdown export
242
243 The Notion exporter creates Markdown files with Notion-style callout blocks and a CSV database file for bulk import into Notion.
244
245 ### CLI usage
246
247 ```
248 planopticon export notion DB_PATH [OPTIONS]
249 ```
250
251 **Options:**
252
253 | Option | Short | Default | Description |
254 |--------|-------|---------|-------------|
255 | `--output` | `-o` | `./notion-export` | Output directory |
256
257 **Example:**
258
259 ```bash
260 planopticon export notion knowledge_graph.db -o ./notion-export
261 ```
262
263 ### Generated structure
264
265 ```
266 notion-export/
267 Overview.md # Knowledge graph overview page
268 entities_database.csv # CSV for Notion database import
269 Alice.md # Individual entity pages
270 Python.md
271 ...
272 ```
273
274 ### Entity pages
275
276 Each entity page uses Notion-style callout syntax for metadata:
277
278 ```markdown
279 # Python
280
281 > :computer: **Type:** technology
282
283 ## Description
284
285 A high-level programming language...
286
287 > :memo: **Properties**
288 > - **version:** 3.11
289 > - **paradigm:** multi-paradigm
290
291 ## Relationships
292
293 | Target | Relationship |
294 |--------|-------------|
295 | FastAPI | uses |
296 | Django | framework_for |
297
298 ## Referenced by
299
300 | Source | Relationship |
301 |--------|-------------|
302 | Backend Service | implements |
303 ```
304
305 ### CSV database
306
307 The `entities_database.csv` file contains all entities in a format suitable for Notion's CSV database import:
308
309 | Column | Description |
310 |--------|-------------|
311 | Name | Entity name |
312 | Type | Entity type |
313 | Description | First two descriptions, semicolon-separated |
314 | Related To | Comma-separated list of outgoing relationship targets |
315
316 ### Overview page
317
318 The `Overview.md` page provides a summary with entity counts and a grouped listing of all entities by type.
319
320 ## GitHub wiki generator
321
322 The wiki generator creates a complete set of GitHub wiki pages from a knowledge graph, including navigation (Home page and Sidebar) and cross-linked entity pages.
323
324 ### CLI usage
325
326 **Generate wiki pages locally:**
327
328 ```
329 planopticon wiki generate DB_PATH [OPTIONS]
330 ```
331
332 | Option | Short | Default | Description |
333 |--------|-------|---------|-------------|
334 | `--output` | `-o` | `./wiki` | Output directory for wiki pages |
335 | `--title` | | `Knowledge Base` | Wiki title (shown on Home page) |
336
337 **Push wiki pages to GitHub:**
338
339 ```
340 planopticon wiki push WIKI_DIR REPO [OPTIONS]
341 ```
342
343 | Argument | Description |
344 |----------|-------------|
345 | `WIKI_DIR` | Path to the directory containing generated wiki `.md` files |
346 | `REPO` | GitHub repository in `owner/repo` format |
347
348 | Option | Short | Default | Description |
349 |--------|-------|---------|-------------|
350 | `--message` | `-m` | `Update wiki` | Git commit message |
351
352 **Examples:**
353
354 ```bash
355 # Generate wiki pages
356 planopticon wiki generate knowledge_graph.db -o ./wiki
357
358 # Generate with a custom title
359 planopticon wiki generate kg.db -o ./wiki --title "Project Wiki"
360
361 # Push to GitHub
362 planopticon wiki push ./wiki ConflictHQ/PlanOpticon
363
364 # Push with a custom commit message
365 planopticon wiki push ./wiki owner/repo -m "Add entity pages"
366 ```
367
368 ### Generated pages
369
370 The wiki generator creates the following pages:
371
372 | Page | Description |
373 |------|-------------|
374 | `Home.md` | Main wiki page with entity counts, type links, and artifact links |
375 | `_Sidebar.md` | Navigation sidebar with links to Home, entity type indexes, and artifacts |
376 | `{Type}.md` | One index page per entity type with a table of entities and descriptions |
377 | `{Entity}.md` | Individual entity pages with type, descriptions, relationships, and sources |
378
379 ### Entity pages
380
381 Each entity page contains:
382
383 - Entity name as the top heading
384 - **Type** label
385 - **Descriptions** section (bullet list)
386 - **Relationships** table with wiki-style links to target entities
387 - **Referenced By** table with links to source entities
388 - **Sources** section listing occurrences with timestamps and context
389
390 All entity and type names are cross-linked using GitHub wiki-compatible links (`[Name](Sanitized-Name)`).
391
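The sanitization can be pictured like this (a plausible rule set for illustration; the actual implementation may differ in which characters it strips):

```python
import re

def wiki_link(name):
    """Render a GitHub-wiki-style link: spaces become hyphens, unsafe chars drop."""
    slug = re.sub(r"[^A-Za-z0-9\-]", "", name.replace(" ", "-"))
    return f"[{name}]({slug})"

print(wiki_link("Backend Service"))  # [Backend Service](Backend-Service)
```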
392 ### Push behavior
393
394 The `wiki push` command:
395
396 1. Clones the existing GitHub wiki repository (`https://github.com/{repo}.wiki.git`).
397 2. If the wiki does not exist yet, initializes a new Git repository.
398 3. Copies all `.md` files from the wiki directory into the clone.
399 4. Commits the changes.
400 5. Pushes to the remote (tries `master` first, then `main`).
401
402 This requires Git authentication with push access to the repository. The wiki must be enabled in the GitHub repository settings.
403
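The sequence above corresponds roughly to this sketch (illustrative only; the helper names are hypothetical and the real `push_wiki` implementation may differ in details):

```python
import subprocess
from pathlib import Path

def wiki_remote_url(repo):
    # GitHub hosts wikis in a parallel "<owner>/<repo>.wiki.git" repository
    return f"https://github.com/{repo}.wiki.git"

def push_wiki_sketch(wiki_dir: Path, repo: str, message: str = "Update wiki"):
    url = wiki_remote_url(repo)
    clone = wiki_dir.parent / "_wiki_clone"
    # 1-2. Clone the wiki repo, or start a fresh repository if it does not exist yet
    if subprocess.run(["git", "clone", url, str(clone)]).returncode != 0:
        clone.mkdir(exist_ok=True)
        subprocess.run(["git", "init"], cwd=clone, check=True)
    # 3. Copy generated pages into the clone
    for page in wiki_dir.glob("*.md"):
        (clone / page.name).write_text(page.read_text())
    # 4. Commit the changes
    subprocess.run(["git", "add", "-A"], cwd=clone, check=True)
    subprocess.run(["git", "commit", "-m", message], cwd=clone, check=True)
    # 5. Push, trying master first and falling back to main
    for branch in ("master", "main"):
        if subprocess.run(["git", "push", url, f"HEAD:{branch}"], cwd=clone).returncode == 0:
            break
```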
404 ## PlanOpticonExchange JSON format
405
406 PlanOpticonExchange is the canonical interchange format for PlanOpticon data: every command produces it, and every export adapter can consume it. It provides a structured, versioned JSON representation of a complete knowledge graph with project metadata.
407
408 ### CLI usage
409
410 ```
411 planopticon export exchange DB_PATH [OPTIONS]
412 ```
413
414 | Option | Short | Default | Description |
415 |--------|-------|---------|-------------|
416 | `--output` | `-o` | `./exchange.json` | Output JSON file path |
417 | `--name` | | `Untitled` | Project name for the exchange payload |
418 | `--description` | | (empty) | Project description |
419
420 **Examples:**
421
422 ```bash
423 # Basic export
424 planopticon export exchange knowledge_graph.db
425
426 # With project metadata
427 planopticon export exchange kg.db -o exchange.json --name "My Project" --description "Sprint 3 analysis"
428 ```
429
430 ### Schema
431
432 The exchange format has the following top-level structure:
433
434 ```json
435 {
436 "version": "1.0",
437 "project": {
438 "name": "My Project",
439 "description": "Sprint 3 analysis",
440 "created_at": "2026-03-07T10:30:00.000000",
441 "updated_at": "2026-03-07T10:30:00.000000",
442 "tags": ["sprint-3", "backend"]
443 },
444 "entities": [
445 {
446 "name": "Python",
447 "type": "technology",
448 "descriptions": ["A high-level programming language"],
449 "source": "transcript",
450 "occurrences": [
451 {
452 "source": "meeting.mp4",
453 "timestamp": "00:05:23",
454 "text": "We should use Python for the backend"
455 }
456 ]
457 }
458 ],
459 "relationships": [
460 {
461 "source": "Python",
462 "target": "Backend Service",
463 "type": "used_by",
464 "content_source": "transcript:meeting.mp4",
465 "timestamp": 323.0
466 }
467 ],
468 "artifacts": [
469 {
470 "name": "Project Plan",
471 "content": "# Project Plan\n\n...",
472 "artifact_type": "project_plan",
473 "format": "markdown",
474 "metadata": {}
475 }
476 ],
477 "sources": [
478 {
479 "source_id": "abc123",
480 "source_type": "video",
481 "title": "Sprint Planning Meeting",
482 "path": "/recordings/meeting.mp4",
483 "url": null,
484 "mime_type": "video/mp4",
485 "ingested_at": "2026-03-07T10:00:00.000000",
486 "metadata": {}
487 }
488 ]
489 }
490 ```
491
492 **Top-level fields:**
493
494 | Field | Type | Description |
495 |-------|------|-------------|
496 | `version` | `str` | Schema version (currently `"1.0"`) |
497 | `project` | `ProjectMeta` | Project-level metadata |
498 | `entities` | `List[Entity]` | Knowledge graph entities |
499 | `relationships` | `List[Relationship]` | Knowledge graph relationships |
500 | `artifacts` | `List[ArtifactMeta]` | Generated artifacts (plans, PRDs, etc.) |
501 | `sources` | `List[SourceRecord]` | Content source provenance records |
502
503 ### Merging exchange files
504
505 The exchange format supports merging, with automatic deduplication:
506
507 - Entities are deduplicated by name
508 - Relationships are deduplicated by the tuple `(source, target, type)`
509 - Artifacts are deduplicated by name
510 - Sources are deduplicated by `source_id`
511
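Conceptually, each of the four lists merges the same way; a small sketch of the rule (hypothetical helper, not the library's internals):

```python
def merge_unique(ours, theirs, key):
    """Append items from `theirs` whose key is not already present in `ours`."""
    seen = {key(item) for item in ours}
    for item in theirs:
        if key(item) not in seen:
            ours.append(item)
            seen.add(key(item))
    return ours

a = [{"name": "Python"}, {"name": "FastAPI"}]
b = [{"name": "Python"}, {"name": "Django"}]
merged = merge_unique(a, b, key=lambda e: e["name"])
print([e["name"] for e in merged])  # ['Python', 'FastAPI', 'Django']
# Relationships would use key=lambda r: (r["source"], r["target"], r["type"])
```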
512 ```python
513 from video_processor.exchange import PlanOpticonExchange
514
515 # Load two exchange files
516 ex1 = PlanOpticonExchange.from_file("sprint-1.json")
517 ex2 = PlanOpticonExchange.from_file("sprint-2.json")
518
519 # Merge ex2 into ex1
520 ex1.merge(ex2)
521
522 # Save the combined result
523 ex1.to_file("combined.json")
524 ```
525
526 The `project.updated_at` timestamp is updated automatically on merge.
527
528 ### Python API
529
530 **Create from a knowledge graph:**
531
532 ```python
533 from video_processor.exchange import PlanOpticonExchange
534 from video_processor.integrators.knowledge_graph import KnowledgeGraph
535
536 kg = KnowledgeGraph(db_path="knowledge_graph.db")
537 kg_data = kg.to_dict()
538
539 exchange = PlanOpticonExchange.from_knowledge_graph(
540 kg_data,
541 project_name="My Project",
542 project_description="Analysis of sprint planning meetings",
543 tags=["planning", "backend"],
544 )
545 ```
546
547 **Save and load:**
548
549 ```python
550 # Save to file
551 exchange.to_file("exchange.json")
552
553 # Load from file
554 loaded = PlanOpticonExchange.from_file("exchange.json")
555 ```
556
557 **Get JSON Schema:**
558
559 ```python
560 schema = PlanOpticonExchange.json_schema()
561 ```
562
563 This returns the full JSON Schema for validation and documentation purposes.
564
565 ## Python API for all exporters
566
567 ### Markdown document generation
568
569 ```python
570 from pathlib import Path
571 from video_processor.exporters.markdown import (
572 generate_all,
573 generate_executive_summary,
574 generate_meeting_notes,
575 generate_glossary,
576 generate_relationship_map,
577 generate_status_report,
578 generate_entity_index,
579 generate_csv_export,
580 generate_entity_brief,
581 DOCUMENT_TYPES,
582 )
583 from video_processor.integrators.knowledge_graph import KnowledgeGraph
584
585 kg = KnowledgeGraph(db_path=Path("knowledge_graph.db"))
586 kg_data = kg.to_dict()
587
588 # Generate all document types at once
589 created_files = generate_all(kg_data, Path("./export"))
590
591 # Generate specific document types
592 created_files = generate_all(
593 kg_data,
594 Path("./export"),
595 doc_types=["summary", "glossary", "csv"],
596 )
597
598 # Generate individual documents (returns markdown string)
599 summary = generate_executive_summary(kg_data)
600 notes = generate_meeting_notes(kg_data, title="Sprint Planning")
601 glossary = generate_glossary(kg_data)
602 rel_map = generate_relationship_map(kg_data)
603 status = generate_status_report(kg_data, title="Q1 Status")
604 index = generate_entity_index(kg_data)
605 csv_text = generate_csv_export(kg_data)
606
607 # Generate a brief for a single entity
608 entity = kg_data["nodes"][0]
609 relationships = kg_data["relationships"]
610 brief = generate_entity_brief(entity, relationships)
611 ```
612
613 ### Obsidian export
614
615 ```python
616 from pathlib import Path
617 from video_processor.agent.skills.notes_export import export_to_obsidian
618 from video_processor.integrators.knowledge_graph import KnowledgeGraph
619
620 kg = KnowledgeGraph(db_path=Path("knowledge_graph.db"))
621 kg_data = kg.to_dict()
622
623 created_files = export_to_obsidian(kg_data, Path("./obsidian-vault"))
624 print(f"Created {len(created_files)} files")
625 ```
626
627 ### Notion export
628
629 ```python
630 from pathlib import Path
631 from video_processor.agent.skills.notes_export import export_to_notion_md
632 from video_processor.integrators.knowledge_graph import KnowledgeGraph
633
634 kg = KnowledgeGraph(db_path=Path("knowledge_graph.db"))
635 kg_data = kg.to_dict()
636
637 created_files = export_to_notion_md(kg_data, Path("./notion-export"))
638 ```
639
640 ### Wiki generation
641
642 ```python
643 from pathlib import Path
644 from video_processor.agent.skills.wiki_generator import (
645 generate_wiki,
646 write_wiki,
647 push_wiki,
648 )
649 from video_processor.integrators.knowledge_graph import KnowledgeGraph
650
651 kg = KnowledgeGraph(db_path=Path("knowledge_graph.db"))
652 kg_data = kg.to_dict()
653
654 # Generate pages as a dict of {filename: content}
655 pages = generate_wiki(kg_data, title="Project Wiki")
656
657 # Write to disk
658 written = write_wiki(pages, Path("./wiki"))
659
660 # Push to GitHub (requires git auth)
661 success = push_wiki(Path("./wiki"), "owner/repo", message="Update wiki")
662 ```
663
664 ## Companion REPL
665
666 Inside the interactive companion REPL, use the `/export` command:
667
668 ```
669 > /export markdown
670 Export 'markdown' requested. Use the CLI command:
671 planopticon export markdown ./knowledge_graph.db
672
673 > /export obsidian
674 Export 'obsidian' requested. Use the CLI command:
675 planopticon export obsidian ./knowledge_graph.db
676 ```
677
678 The REPL provides guidance on the CLI command to run; actual export is performed via the CLI.
679
680 ## Common workflows
681
682 ### Analyze videos and export to Obsidian
683
684 ```bash
685 # Analyze meeting recordings
686 planopticon analyze meeting-1.mp4 -o ./results
687 planopticon analyze meeting-2.mp4 --db-path ./results/knowledge_graph.db
688
689 # Ingest supplementary docs
690 planopticon ingest ./specs/ --db-path ./results/knowledge_graph.db
691
692 # Export to Obsidian vault
693 planopticon export obsidian ./results/knowledge_graph.db -o ~/Obsidian/ProjectVault
694
695 # Open in Obsidian and explore the graph view
696 ```
697
698 ### Generate project documentation
699
700 ```bash
701 # Generate all markdown documents
702 planopticon export markdown knowledge_graph.db -o ./docs
703
704 # The output includes:
705 # docs/summary.md - Executive summary
706 # docs/meeting-notes.md - Meeting notes format
707 # docs/glossary.md - Entity glossary
708 # docs/relationship-map.md - Relationships + Mermaid diagram
709 # docs/status-report.md - Project status report
710 # docs/entity-index.md - Master entity index
711 # docs/csv.csv - Spreadsheet-ready CSV
712 # docs/entities/ - Individual entity briefs
713 ```
714
715 ### Publish a GitHub wiki
716
717 ```bash
718 # Generate wiki pages
719 planopticon wiki generate knowledge_graph.db -o ./wiki --title "Project Knowledge Base"
720
721 # Review locally, then push
722 planopticon wiki push ./wiki ConflictHQ/my-project -m "Initial wiki from meeting analysis"
723 ```
724
725 ### Share data between projects
726
727 ```bash
728 # Export from project A
729 planopticon export exchange ./project-a/knowledge_graph.db \
730 -o project-a.json --name "Project A"
731
732 # Export from project B
733 planopticon export exchange ./project-b/knowledge_graph.db \
734 -o project-b.json --name "Project B"
735
736 # Merge in Python
737 python -c "
738 from video_processor.exchange import PlanOpticonExchange
739 a = PlanOpticonExchange.from_file('project-a.json')
740 b = PlanOpticonExchange.from_file('project-b.json')
741 a.merge(b)
742 a.to_file('combined.json')
743 print(f'Combined: {len(a.entities)} entities, {len(a.relationships)} relationships')
744 "
745 ```
746
747 ### Export for spreadsheet analysis
748
749 ```bash
750 # Generate just the CSV
751 planopticon export markdown knowledge_graph.db --type csv -o ./export
752
753 # The file export/csv.csv can be opened in Excel, Google Sheets, etc.
754 ```
755
756 Alternatively, the Notion export includes an `entities_database.csv` that can be imported into any spreadsheet tool or Notion database.
--- a/docs/guide/knowledge-graphs.md
+++ b/docs/guide/knowledge-graphs.md
@@ -0,0 +1,650 @@
1
+# Knowledge Graphs
2
+
3
+PlanOpticon builds structured knowledge graphs from video analyses, document ingestion, and other content sources. A knowledge graph captures **entities** (people, technologies, concepts, organizations) and the **relationships** between them, providing a queryable representation of everything discussed or presented in your source material.
4
+
5
+---
6
+
7
+## Storage
8
+
9
+Knowledge graphs are stored as SQLite databases (`knowledge_graph.db`) using Python's built-in `sqlite3` module. This means:
10
+
11
+- **Zero external dependencies.** No database server to install or manage.
12
+- **Single-file portability.** Copy the `.db` file to share a knowledge graph.
13
+- **WAL mode.** SQLite Write-Ahead Logging is enabled for concurrent read performance.
14
+- **JSON fallback.** Knowledge graphs can also be saved as `knowledge_graph.json` for interoperability, though SQLite is preferred for performance and querying.
15
+
16
+### Database Schema
17
+
18
+The SQLite store uses the following tables:
19
+
20
+| Table | Purpose |
21
+|---|---|
22
+| `entities` | Core entity records with name, type, descriptions, source, and arbitrary properties |
23
+| `occurrences` | Where and when each entity was mentioned (source, timestamp, text snippet) |
24
+| `relationships` | Directed edges between entities with type, content source, timestamp, and properties |
25
+| `sources` | Registered content sources with provenance metadata (source type, title, path, URL, MIME type, ingestion timestamp) |
26
+| `source_locations` | Links between sources and specific entities/relationships, with location details (timestamp, page, section, line range, text snippet) |
27
+
28
+All entity lookups are case-insensitive (indexed on `name_lower`). Entities and relationships are indexed on their source and target fields for efficient traversal.
29
+
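Because the store is plain SQLite, you can inspect it with the standard `sqlite3` module. The schema below is deliberately simplified for illustration; the real tables carry more columns:

```python
import sqlite3

# In practice, connect to your knowledge_graph.db path instead of :memory:
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entities (name TEXT, type TEXT)")  # simplified schema
conn.executemany(
    "INSERT INTO entities VALUES (?, ?)",
    [("Python", "technology"), ("FastAPI", "technology"), ("Alice", "person")],
)
# Entity type breakdown, as the stats query would report it
counts = dict(conn.execute("SELECT type, COUNT(*) FROM entities GROUP BY type"))
print(counts)  # {'technology': 2, 'person': 1} (key order may vary)
```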
30
+### Storage Backends
31
+
32
+PlanOpticon supports two storage backends, selected automatically:
33
+
34
+| Backend | When Used | Persistence |
35
+|---|---|---|
36
+| `SQLiteStore` | When a `db_path` is provided | Persistent on disk |
37
+| `InMemoryStore` | When no path is given, or as fallback | In-memory only |
38
+
39
+Both backends implement the same `GraphStore` abstract interface, so all query and manipulation code works identically regardless of backend.
40
+
41
+```python
42
+from video_processor.integrators.graph_store import create_store
43
+
44
+# Persistent SQLite store
45
+store = create_store("/path/to/knowledge_graph.db")
46
+
47
+# In-memory store (for temporary operations)
48
+store = create_store()
49
+```
50
+
51
+---
52
+
53
+## Entity Types
54
+
55
+Entities extracted from content are assigned one of the following base types:
56
+
57
+| Type | Description | Specificity Rank |
58
+|---|---|---|
59
+| `person` | People mentioned or participating | 3 (highest) |
60
+| `technology` | Tools, languages, frameworks, platforms | 3 |
61
+| `organization` | Companies, teams, departments | 2 |
62
+| `time` | Dates, deadlines, time references | 1 |
63
+| `diagram` | Visual diagrams extracted from video frames | 1 |
64
+| `concept` | General concepts, topics, ideas (default) | 0 (lowest) |
65
+
66
+The specificity rank is used during merge operations: when two entities are matched as duplicates, the more specific type wins (e.g., `technology` overrides `concept`).
67
+
68
+### Planning Taxonomy
69
+
70
+Beyond the base entity types, PlanOpticon includes a planning taxonomy for classifying entities into project-planning categories. The `TaxonomyClassifier` maps extracted entities into these types:
71
+
72
+| Planning Type | Keywords Matched |
73
+|---|---|
74
+| `goal` | goal, objective, aim, target outcome |
75
+| `requirement` | must, should, requirement, need, required |
76
+| `constraint` | constraint, limitation, restrict, cannot, must not |
77
+| `decision` | decided, decision, chose, selected, agreed |
78
+| `risk` | risk, concern, worry, danger, threat |
79
+| `assumption` | assume, assumption, expecting, presume |
80
+| `dependency` | depends, dependency, relies on, prerequisite, blocked |
81
+| `milestone` | milestone, deadline, deliverable, release, launch |
82
+| `task` | task, todo, action item, work item, implement |
83
+| `feature` | feature, capability, functionality |
84
+
85
+Classification works in two stages:
86
+
87
+1. **Heuristic classification.** Entity descriptions are scanned for the keywords listed above. First match wins.
88
+2. **LLM refinement.** If an LLM provider is available, entities are sent to the LLM for more nuanced classification with priority assignment (`high`, `medium`, `low`). LLM results override heuristic results on conflicts.
89
+
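Stage 1 reduces to a first-match keyword scan. A sketch with an abbreviated keyword table (the real classifier covers all ten planning types):

```python
PLANNING_KEYWORDS = {
    "goal": ["goal", "objective", "aim", "target outcome"],
    "risk": ["risk", "concern", "worry", "danger", "threat"],
    "milestone": ["milestone", "deadline", "deliverable", "release", "launch"],
}

def classify_heuristic(description):
    """Return the first planning type whose keywords appear in the text."""
    text = description.lower()
    for planning_type, keywords in PLANNING_KEYWORDS.items():
        if any(keyword in text for keyword in keywords):
            return planning_type
    return None

print(classify_heuristic("Ship the beta release by June"))  # milestone
```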
90
+Classified entities are used by planning agent skills (project_plan, prd, roadmap, task_breakdown) to produce targeted, context-aware artifacts.
91
+
92
+---
93
+
94
+## Relationship Types
95
+
96
+Relationships are directed edges between entities. The `type` field is a free-text string determined by the LLM during extraction. Common relationship types include:
97
+
98
+- `related_to` (default)
99
+- `works_with`
100
+- `uses`
101
+- `depends_on`
102
+- `proposed`
103
+- `discussed_by`
104
+- `employed_by`
105
+- `collaborates_with`
106
+- `expert_in`
107
+
108
+### Typed Relationships
109
+
110
+The `add_typed_relationship()` method creates edges with custom labels and optional properties, enabling richer graph semantics:
111
+
112
+```python
113
+store.add_typed_relationship(
114
+ source="Authentication Service",
115
+ target="PostgreSQL",
116
+ edge_label="USES_SYSTEM",
117
+ properties={"purpose": "user credential storage", "version": "15"},
118
+)
119
+```
120
+
121
+### Relationship Checks
122
+
123
+You can check whether a relationship exists between two entities:
124
+
125
+```python
126
+# Check for any relationship
127
+store.has_relationship("Alice", "Kubernetes")
128
+
129
+# Check for a specific relationship type
130
+store.has_relationship("Alice", "Kubernetes", edge_label="expert_in")
131
+```
132
+
133
+---
134
+
135
+## Building a Knowledge Graph
136
+
137
+### From Video Analysis
138
+
139
+The primary path for building a knowledge graph is through video analysis. When you run `planopticon analyze`, the pipeline extracts entities and relationships from:
140
+
141
+- **Transcript segments** -- batched in groups of 10 for efficient API usage, with speaker identification
142
+- **Diagram content** -- text extracted from visual diagrams detected in video frames
143
+
144
+```bash
145
+planopticon analyze -i meeting.mp4 -o results/
146
+# Creates results/knowledge_graph.db
147
+```
148
+
149
+### From Document Ingestion
150
+
151
+Documents (Markdown, PDF, DOCX) can be ingested directly into a knowledge graph:
152
+
153
+```bash
154
+# Ingest a single file
155
+planopticon ingest -i requirements.pdf -o results/
156
+
157
+# Ingest a directory recursively
158
+planopticon ingest -i docs/ -o results/ --recursive
159
+
160
+# Ingest into an existing knowledge graph
161
+planopticon ingest -i notes.md --db results/knowledge_graph.db
162
+```
163
+
164
+### From Batch Processing
165
+
166
+Multiple videos can be processed in batch mode, with all results merged into a single knowledge graph:
167
+
168
+```bash
169
+planopticon batch -i videos/ -o results/
170
+```
171
+
172
+### Programmatic Construction
173
+
174
+```python
175
+from video_processor.integrators.knowledge_graph import KnowledgeGraph
176
+
177
+# Create a new knowledge graph with LLM extraction
178
+from video_processor.providers.manager import ProviderManager
179
+pm = ProviderManager()
180
+kg = KnowledgeGraph(provider_manager=pm, db_path="knowledge_graph.db")
181
+
182
+# Add content (entities and relationships are extracted by LLM)
183
+kg.add_content(
184
+ text="Alice proposed using Kubernetes for container orchestration.",
185
+ source="meeting_notes",
186
+ timestamp=120.5,
187
+)
188
+
189
+# Process a full transcript
190
+kg.process_transcript(transcript_data, batch_size=10)
191
+
192
+# Process diagram results
193
+kg.process_diagrams(diagram_results)
194
+
195
+# Save
196
+kg.save("knowledge_graph.db")
197
+```
198
+
199
+---
200
+
201
+## Merge and Deduplication
202
+
203
+When combining knowledge graphs from multiple sources, PlanOpticon performs intelligent merge with deduplication.
204
+
205
+### Fuzzy Name Matching
206
+
207
+Entity names are compared using Python's `SequenceMatcher` with a threshold of 0.85. This means "Kubernetes" and "kubernetes" are matched exactly (case-insensitive), while "React.js" and "ReactJS" may be matched as duplicates if their similarity ratio meets the threshold.
208
+
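The comparison is small enough to sketch as a standalone check (a hypothetical simplified version of what the merge uses):

```python
from difflib import SequenceMatcher

def is_duplicate(a, b, threshold=0.85):
    """Case-insensitive similarity check used as the merge criterion."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold

print(is_duplicate("Kubernetes", "kubernetes"))  # True (ratio 1.0 after lowering)
print(is_duplicate("React.js", "ReactJS"))       # True (ratio ~0.93)
print(is_duplicate("Python", "PostgreSQL"))      # False
```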
209
+### Type Conflict Resolution
210
+
211
+When two entities match but have different types, the more specific type wins based on the specificity ranking:
212
+
213
+| Scenario | Result |
214
+|---|---|
215
+| `concept` vs `technology` | `technology` wins (rank 3 > rank 0) |
216
+| `person` vs `concept` | `person` wins (rank 3 > rank 0) |
217
+| `organization` vs `concept` | `organization` wins (rank 2 > rank 0) |
218
+| `person` vs `technology` | Keeps whichever was first (equal rank) |
219
+
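The resolution rule reduces to a rank comparison, using the specificity ranks from the Entity Types table above (sketch, not the library's code):

```python
SPECIFICITY = {
    "person": 3, "technology": 3,
    "organization": 2,
    "time": 1, "diagram": 1,
    "concept": 0,
}

def resolve_type(existing, incoming):
    """Keep the more specific type; on a tie, keep whichever came first."""
    if SPECIFICITY.get(incoming, 0) > SPECIFICITY.get(existing, 0):
        return incoming
    return existing

print(resolve_type("concept", "technology"))  # technology
print(resolve_type("person", "technology"))   # person (equal rank keeps the first)
```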
220
+### Provenance Tracking
221
+
222
+Merged entities receive a `merged_from:<original_name>` description entry, preserving the audit trail of which entities were unified.
223
+
224
+### Programmatic Merge
225
+
226
+```python
227
+from video_processor.integrators.knowledge_graph import KnowledgeGraph
228
+
229
+# Load two knowledge graphs
230
+kg1 = KnowledgeGraph(db_path="project_a.db")
231
+kg2 = KnowledgeGraph(db_path="project_b.db")
232
+
233
+# Merge kg2 into kg1
234
+kg1.merge(kg2)
235
+
236
+# Save the merged result
237
+kg1.save("merged.db")
238
+```
239
+
240
+The merge operation also copies all registered sources and occurrences, so provenance information is preserved across merges.
241
+
242
+---
243
+
244
+## Querying
245
+
246
+PlanOpticon provides two query modes: direct mode (no LLM required) and agentic mode (LLM-powered natural language).
247
+
248
+### Direct Mode
249
+
250
+Direct mode queries are fast, deterministic, and require no API key. They are the right choice for structured lookups.
251
+
252
+#### Stats
253
+
254
+Return entity count, relationship count, and entity type breakdown:
255
+
256
+```bash
257
+planopticon query
258
+```
259
+
260
+```python
261
+engine.stats()
262
+# QueryResult with data: {
263
+# "entity_count": 42,
264
+# "relationship_count": 87,
265
+# "entity_types": {"technology": 15, "person": 12, ...}
266
+# }
267
+```
268
+
269
+#### Entities
270
+
271
+Filter entities by name substring and/or type:
272
+
273
+```bash
274
+planopticon query "entities --type technology"
275
+planopticon query "entities --name python"
276
+```
277
+
278
+```python
279
+engine.entities(entity_type="technology")
280
+engine.entities(name="python")
281
+engine.entities(name="auth", entity_type="concept", limit=10)
282
+```
283
+
284
+All filtering is case-insensitive. Results are capped at 50 by default (configurable via `limit`).
285
+
286
+#### Neighbors
287
+
288
+Get an entity and all directly connected nodes and relationships:
289
+
290
+```bash
291
+planopticon query "neighbors Alice"
292
+```
293
+
294
+```python
295
+engine.neighbors("Alice", depth=1)
296
+```
297
+
298
+The `depth` parameter controls how many hops to traverse (default 1). The result includes both entity objects and relationship objects.
299
+
300
+#### Relationships
301
+
302
+Filter relationships by source, target, and/or type:
303
+
304
+```bash
305
+planopticon query "relationships --source Alice"
306
+```
307
+
308
+```python
309
+engine.relationships(source="Alice")
310
+engine.relationships(target="Kubernetes", rel_type="uses")
311
+```
312
+
313
+#### Sources
314
+
315
+List all registered content sources:
316
+
317
+```python
318
+engine.sources()
319
+```
320
+
321
+#### Provenance
322
+
323
+Get all source locations for a specific entity, showing exactly where it was mentioned:
324
+
325
+```python
326
+engine.provenance("Kubernetes")
327
+# Returns source locations with timestamps, pages, sections, and text snippets
328
+```
329
+
330
+#### Raw SQL
331
+
332
+Execute arbitrary SQL against the SQLite backend (SQLite stores only):
333
+
334
+```python
335
+engine.sql("SELECT name, type FROM entities WHERE type = 'technology' ORDER BY name")
336
+```
337
+
338
+### Agentic Mode
339
+
340
+Agentic mode accepts natural-language questions and uses the LLM to plan and execute queries. It requires a configured LLM provider.
341
+
342
+```bash
343
+planopticon query "What technologies were discussed?"
344
+planopticon query "Who are the key people mentioned?"
345
+planopticon query "What depends on the authentication service?"
346
+```
347
+
348
+The agentic query pipeline:
349
+
350
+1. **Plan.** The LLM receives graph stats and available actions (entities, relationships, neighbors, stats). It selects exactly one action and its parameters.
351
+2. **Execute.** The chosen action is run through the direct-mode engine.
352
+3. **Synthesize.** The LLM receives the raw query results and the original question, then produces a concise natural-language answer.
353
+
354
+This design ensures the LLM never generates arbitrary code -- it only selects from a fixed set of known query actions.
355
+
356
+```bash
357
+# Requires an API key
358
+planopticon query "What technologies were discussed?" -p openai
359
+
360
+# Use the interactive REPL for multiple queries
361
+planopticon query -I
362
+```
363
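The plan, execute, synthesize steps can be sketched as a small dispatch loop. All names here (`llm.plan`, `llm.synthesize`, the stub engine) are illustrative assumptions, not the actual `GraphQueryEngine` internals:

```python
# Sketch of the plan -> execute -> synthesize loop. The guard means the
# LLM can only pick a known action; it never supplies code to be run.
ALLOWED_ACTIONS = {"stats", "entities", "relationships", "neighbors"}

def answer(question, llm, engine):
    plan = llm.plan(question, actions=sorted(ALLOWED_ACTIONS))   # 1. Plan
    action = plan["action"]
    if action not in ALLOWED_ACTIONS:
        raise ValueError(f"unknown action: {action}")
    result = getattr(engine, action)(**plan.get("params", {}))   # 2. Execute
    return llm.synthesize(question, result)                      # 3. Synthesize
```

The key design property is visible in the guard: the LLM's output is data (an action name plus parameters), never executable code.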
+
364
+---
365
+
366
+## Graph Query Engine Python API
367
+
368
+The `GraphQueryEngine` class provides the programmatic interface for all query operations.
369
+
370
+### Initialization
371
+
372
+```python
373
+from video_processor.integrators.graph_query import GraphQueryEngine
374
+from video_processor.integrators.graph_discovery import find_nearest_graph
375
+
376
+# From a .db file
377
+path = find_nearest_graph()
378
+engine = GraphQueryEngine.from_db_path(path)
379
+
380
+# From a .json file
381
+engine = GraphQueryEngine.from_json_path("knowledge_graph.json")
382
+
383
+# With an LLM provider for agentic mode
384
+from video_processor.providers.manager import ProviderManager
385
+pm = ProviderManager()
386
+engine = GraphQueryEngine.from_db_path(path, provider_manager=pm)
387
+```
388
+
389
+### QueryResult
390
+
391
+All query methods return a `QueryResult` dataclass with multiple output formats:
392
+
393
+```python
394
+result = engine.stats()
395
+
396
+# Human-readable text
397
+print(result.to_text())
398
+
399
+# JSON string
400
+print(result.to_json())
401
+
402
+# Mermaid diagram (for graph results)
403
+result = engine.neighbors("Alice")
404
+print(result.to_mermaid())
405
+```
406
+
407
+The `QueryResult` contains:
408
+
409
+| Field | Type | Description |
410
+|---|---|---|
411
+| `data` | Any | The raw result data (dict, list, or scalar) |
412
+| `query_type` | str | `"filter"` for direct mode, `"agentic"` for LLM mode, `"sql"` for raw SQL |
413
+| `raw_query` | str | String representation of the executed query |
414
+| `explanation` | str | Human-readable explanation or LLM-synthesized answer |
415
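A minimal sketch of the dataclass implied by this table (field names taken from above; the real class and its rendering methods may differ):

```python
import json
from dataclasses import dataclass
from typing import Any

@dataclass
class QueryResult:
    data: Any          # raw result data (dict, list, or scalar)
    query_type: str    # "filter", "agentic", or "sql"
    raw_query: str     # string form of the executed query
    explanation: str   # human-readable explanation or LLM answer

    def to_json(self) -> str:
        # Serialize all four fields, matching the JSON output format.
        return json.dumps(self.__dict__, indent=2)
```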
+
416
+---
417
+
418
+## The Self-Contained HTML Viewer
419
+
420
+PlanOpticon includes a zero-dependency HTML knowledge graph viewer at `knowledge-base/viewer.html`. This file is fully self-contained -- it inlines D3.js and requires no build step, no server, and no internet connection.
421
+
422
+To use it, open `viewer.html` in a browser. It will load and visualize a `knowledge_graph.json` file (place it in the same directory, or use the file picker in the viewer).
423
+
424
+The viewer provides:
425
+
426
+- Interactive force-directed graph layout
427
+- Zoom and pan navigation
428
+- Entity nodes colored by type
429
+- Relationship edges with labels
430
+- Click-to-focus on individual entities
431
+- Entity detail panel showing descriptions and connections
432
+
433
+This covers most day-to-day graph exploration needs with zero infrastructure.
434
+
435
+---
436
+
437
+## KG Management Commands
438
+
439
+The `planopticon kg` command group provides utilities for managing knowledge graph files.
440
+
441
+### kg convert
442
+
443
+Convert a knowledge graph between SQLite and JSON formats:
444
+
445
+```bash
446
+# SQLite to JSON
447
+planopticon kg convert results/knowledge_graph.db output.json
448
+
449
+# JSON to SQLite
450
+planopticon kg convert knowledge_graph.json knowledge_graph.db
451
+```
452
+
453
+The output format is inferred from the destination file extension. Source and destination must be different formats.
454
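The extension check amounts to something like the following (a hypothetical helper; the CLI's actual inference code may differ):

```python
from pathlib import Path

def infer_format(path: str) -> str:
    # Map the destination extension to a store format; reject anything else.
    formats = {".db": "sqlite", ".json": "json"}
    ext = Path(path).suffix.lower()
    if ext not in formats:
        raise ValueError(f"unsupported extension: {ext}")
    return formats[ext]
```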
+
455
+### kg sync
456
+
457
+Synchronize a `.db` and `.json` knowledge graph, updating the stale one:
458
+
459
+```bash
460
+# Auto-detect which is newer and sync
461
+planopticon kg sync results/knowledge_graph.db
462
+
463
+# Explicit JSON path
464
+planopticon kg sync knowledge_graph.db knowledge_graph.json
465
+
466
+# Force a specific direction
467
+planopticon kg sync knowledge_graph.db knowledge_graph.json --direction db-to-json
468
+planopticon kg sync knowledge_graph.db knowledge_graph.json --direction json-to-db
469
+```
470
+
471
+If `JSON_PATH` is omitted, the `.json` path is derived from the `.db` path (same name, different extension). In `auto` mode (the default), the newer file is used as the source.
472
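In `auto` mode the decision reduces to a modification-time comparison, roughly like this (a hypothetical helper; the real command may apply extra checks):

```python
import os

def pick_direction(db_path: str, json_path: str) -> str:
    # The newer file becomes the source; this sketch breaks ties
    # in favor of the .db file.
    if os.path.getmtime(db_path) >= os.path.getmtime(json_path):
        return "db-to-json"
    return "json-to-db"
```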
+
473
+### kg inspect
474
+
475
+Show summary statistics for a knowledge graph file:
476
+
477
+```bash
478
+planopticon kg inspect results/knowledge_graph.db
479
+```
480
+
481
+Output:
482
+
483
+```
484
+File: results/knowledge_graph.db
485
+Store: sqlite
486
+Entities: 42
487
+Relationships: 87
488
+Entity types:
489
+ technology: 15
490
+ person: 12
491
+ concept: 10
492
+ organization: 5
493
+```
494
+
495
+Works with both `.db` and `.json` files.
496
+
497
+### kg classify
498
+
499
+Classify knowledge graph entities into planning taxonomy types:
500
+
501
+```bash
502
+# Heuristic + LLM classification
503
+planopticon kg classify results/knowledge_graph.db
504
+
505
+# Heuristic only (no API key needed)
506
+planopticon kg classify results/knowledge_graph.db -p none
507
+
508
+# JSON output
509
+planopticon kg classify results/knowledge_graph.db --format json
510
+```
511
+
512
+Text output groups entities by planning type:
513
+
514
+```
515
+GOALS (3)
516
+ - Improve system reliability [high]
517
+ Must achieve 99.9% uptime
518
+ - Reduce deployment time [medium]
519
+ Automate the deployment pipeline
520
+
521
+RISKS (2)
522
+ - Data migration complexity [high]
523
+ Legacy schema incompatibilities
524
+ ...
525
+
526
+TASKS (5)
527
+ - Implement OAuth2 flow
528
+ Set up authentication service
529
+ ...
530
+```
531
+
532
+JSON output returns an array of `PlanningEntity` objects with `name`, `planning_type`, `priority`, `description`, and `source_entities` fields.
533
+
534
+### kg from-exchange
535
+
536
+Import a PlanOpticonExchange JSON file into a knowledge graph database:
537
+
538
+```bash
539
+# Import to default location (./knowledge_graph.db)
540
+planopticon kg from-exchange exchange.json
541
+
542
+# Import to a specific path
543
+planopticon kg from-exchange exchange.json -o project.db
544
+```
545
+
546
+PlanOpticonExchange is a standardized interchange format bundling entities, relationships, and source records.
547
+
548
+---
549
+
550
+## Output Formats
551
+
552
+Query results can be output in three formats:
553
+
554
+### Text (default)
555
+
556
+Human-readable format with entity types in brackets, relationship arrows, and indented details:
557
+
558
+```
559
+Found 15 entities
560
+ [technology] Python -- General-purpose programming language
561
+ [person] Alice -- Lead engineer on the project
562
+ [concept] Microservices -- Architectural pattern discussed
563
+```
564
+
565
+### JSON
566
+
567
+Full structured output including query metadata:
568
+
569
+```bash
570
+planopticon query --format json stats
571
+```
572
+
573
+```json
574
+{
575
+ "query_type": "filter",
576
+ "raw_query": "stats()",
577
+ "explanation": "Knowledge graph statistics",
578
+ "data": {
579
+ "entity_count": 42,
580
+ "relationship_count": 87,
581
+ "entity_types": {
582
+ "technology": 15,
583
+ "person": 12
584
+ }
585
+ }
586
+}
587
+```
588
+
589
+### Mermaid
590
+
591
+Graph results rendered as Mermaid diagram syntax, ready for embedding in markdown:
592
+
593
+```bash
594
+planopticon query --format mermaid "neighbors Alice"
595
+```
596
+
597
+```
598
+graph LR
599
+ Alice["Alice"]:::person
600
+ Python["Python"]:::technology
601
+ Kubernetes["Kubernetes"]:::technology
602
+ Alice -- "expert_in" --> Kubernetes
603
+ Alice -- "works_with" --> Python
604
+ classDef person fill:#f9d5e5,stroke:#333
605
+ classDef concept fill:#eeeeee,stroke:#333
606
+ classDef technology fill:#d5e5f9,stroke:#333
607
+ classDef organization fill:#f9e5d5,stroke:#333
608
+```
609
+
610
+The `KnowledgeGraph.generate_mermaid()` method also produces full-graph Mermaid diagrams, capped at the top 30 most-connected nodes by default.
611
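Producing the node and edge lines shown above is straightforward string assembly. A simplified sketch (it omits the identifier escaping a real generator needs for entity names containing spaces):

```python
def to_mermaid(entities, relationships):
    # Emit "graph LR" syntax: one node line per entity (styled by type),
    # one labeled edge line per relationship.
    lines = ["graph LR"]
    for e in entities:
        lines.append(f'    {e["name"]}["{e["name"]}"]:::{e["type"]}')
    for r in relationships:
        lines.append(f'    {r["source"]} -- "{r["type"]}" --> {r["target"]}')
    return "\n".join(lines)
```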
+
612
+---
613
+
614
+## Auto-Discovery
615
+
616
+PlanOpticon automatically locates knowledge graph files using the `find_nearest_graph()` function. The search order is:
617
+
618
+1. **Current directory** -- check for `knowledge_graph.db` and `knowledge_graph.json`
619
+2. **Common subdirectories** -- `results/`, `output/`, `knowledge-base/`
620
+3. **Recursive downward walk** -- up to 4 levels deep, skipping hidden directories
621
+4. **Parent directory walk** -- upward through the directory tree, checking each level and its common subdirectories
622
+
623
+Within each search phase, `.db` files are preferred over `.json` files. Results are sorted by proximity (closest first).
624
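The preference rules can be expressed as a sort key, roughly (an illustrative sketch, not the library's actual ordering code):

```python
from pathlib import Path

def discovery_rank(path: str):
    # Closer paths (fewer components) sort first; ".db" beats ".json"
    # at equal depth.
    p = Path(path)
    return (len(p.parts), 0 if p.suffix == ".db" else 1)

found = ["results/knowledge_graph.json", "results/knowledge_graph.db",
         "knowledge_graph.db"]
print(sorted(found, key=discovery_rank))
# -> ['knowledge_graph.db', 'results/knowledge_graph.db',
#     'results/knowledge_graph.json']
```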
+
625
+```python
626
+from video_processor.integrators.graph_discovery import (
627
+ find_nearest_graph,
628
+ find_knowledge_graphs,
629
+ describe_graph,
630
+)
631
+
632
+# Find the single closest knowledge graph
633
+path = find_nearest_graph()
634
+
635
+# Find all knowledge graphs, sorted by proximity
636
+paths = find_knowledge_graphs()
637
+
638
+# Find graphs starting from a specific directory
639
+paths = find_knowledge_graphs(start_dir="/path/to/project")
640
+
641
+# Disable upward walking
642
+paths = find_knowledge_graphs(walk_up=False)
643
+
644
+# Get summary stats without loading the full graph
645
+info = describe_graph(path)
646
+# {"entity_count": 42, "relationship_count": 87,
647
+# "entity_types": {...}, "store_type": "sqlite"}
648
+```
649
+
650
+Auto-discovery is used by the Companion REPL, the `planopticon query` command, and the planning agent when no explicit `--kb` path is provided.
--- docs/guide/output-formats.md
+++ docs/guide/output-formats.md
@@ -1,47 +1,329 @@
 # Output Formats
 
-PlanOpticon produces multiple output formats from each analysis run.
+PlanOpticon produces a wide range of output formats from video analysis, document ingestion, batch processing, knowledge graph export, and agent skills. This page is the comprehensive reference for every format the tool can emit.
+
+---
 
 ## Transcripts
 
+Video analysis always produces transcripts in three formats, stored in the `transcript/` subdirectory of the output folder.
+
 | Format | File | Description |
 |--------|------|-------------|
-| JSON | `transcript/transcript.json` | Full transcript with segments, timestamps, speakers |
-| Text | `transcript/transcript.txt` | Plain text transcript |
-| SRT | `transcript/transcript.srt` | Subtitle format with timestamps |
+| JSON | `transcript/transcript.json` | Full transcript with segments, timestamps, speaker labels, and confidence scores. Each segment includes `start`, `end`, `text`, and optional `speaker` fields. |
+| Text | `transcript/transcript.txt` | Plain text transcript with no metadata. Suitable for feeding into other tools or reading directly. |
+| SRT | `transcript/transcript.srt` | SubRip subtitle format with sequential numbering and `HH:MM:SS,mmm` timestamps. Can be loaded into video players or subtitle editors. |
+
+### Transcript JSON structure
+
+```json
+{
+  "segments": [
+    {
+      "start": 0.0,
+      "end": 4.5,
+      "text": "Welcome to the sprint review.",
+      "speaker": "Alice"
+    }
+  ],
+  "text": "Welcome to the sprint review. ...",
+  "language": "en"
+}
+```
+
+When the `--speakers` flag is provided (e.g., `--speakers "Alice,Bob,Carol"`), speaker diarization hints are passed to the transcription provider and speaker labels appear in the JSON segments.
+
+---
 
 ## Reports
 
+Analysis reports are generated from the combined transcript, diagrams, key points, action items, and knowledge graph. They live in the `results/` subdirectory.
+
 | Format | File | Description |
 |--------|------|-------------|
-| Markdown | `results/analysis.md` | Structured report with diagrams |
-| HTML | `results/analysis.html` | Self-contained HTML with mermaid.js |
-| PDF | `results/analysis.pdf` | Print-ready PDF (requires `planopticon[pdf]`) |
+| Markdown | `results/analysis.md` | Structured report with embedded Mermaid diagram blocks, tables, and cross-references. Works in any Markdown renderer. |
+| HTML | `results/analysis.html` | Self-contained HTML page with inline CSS, embedded SVG diagrams, and a bundled mermaid.js script for rendering any unrendered Mermaid blocks. No external dependencies required to view. |
+| PDF | `results/analysis.pdf` | Print-ready PDF. Requires the `planopticon[pdf]` extra (`pip install planopticon[pdf]`). Generated from the HTML report. |
+
+---
 
 ## Diagrams
 
-Each detected diagram produces:
+Each visual element detected during frame analysis produces up to five output files in the `diagrams/` subdirectory. The index `N` is zero-based.
 
 | Format | File | Description |
 |--------|------|-------------|
-| JPEG | `diagrams/diagram_N.jpg` | Original frame |
-| Mermaid | `diagrams/diagram_N.mermaid` | Mermaid source code |
-| SVG | `diagrams/diagram_N.svg` | Vector rendering |
-| PNG | `diagrams/diagram_N.png` | Raster rendering |
-| JSON | `diagrams/diagram_N.json` | Structured analysis data |
+| JPEG | `diagrams/diagram_N.jpg` | Original video frame captured at the point of detection. |
+| Mermaid | `diagrams/diagram_N.mermaid` | Mermaid source code reconstructed from the diagram by the vision model. Supports flowcharts, sequence diagrams, architecture diagrams, and more. |
+| SVG | `diagrams/diagram_N.svg` | Vector rendering of the Mermaid source, produced by the Mermaid CLI or built-in renderer. |
+| PNG | `diagrams/diagram_N.png` | Raster rendering of the Mermaid source at high resolution. |
+| JSON | `diagrams/diagram_N.json` | Structured analysis data including diagram type, description, extracted text, chart data (if applicable), and confidence score. |
+
+Frames that score as medium confidence are saved as captioned screenshots in the `captures/` subdirectory instead, with a `capture_N.jpg` and `capture_N.json` pair.
+
+---
+
+## Structured Data
 
-## Structured data
+Core analysis artifacts are stored as JSON files in the `results/` subdirectory.
 
 | Format | File | Description |
 |--------|------|-------------|
-| JSON | `results/knowledge_graph.json` | Entities and relationships |
-| JSON | `results/key_points.json` | Extracted key points |
-| JSON | `results/action_items.json` | Action items with assignees |
-| JSON | `manifest.json` | Complete run manifest |
+| SQLite | `results/knowledge_graph.db` | Primary knowledge graph database. SQLite-based, queryable with `planopticon query`. Contains entities, relationships, source provenance, and metadata. This is the preferred format for querying and merging. |
+| JSON | `results/knowledge_graph.json` | JSON export of the knowledge graph. Contains `entities` and `relationships` arrays. Automatically kept in sync with the `.db` file. Used as a fallback when SQLite is not available. |
+| JSON | `results/key_points.json` | Array of extracted key points, each with `text`, `category`, and `confidence` fields. |
+| JSON | `results/action_items.json` | Array of action items, each with `text`, `assignee`, `due_date`, `priority`, and `status` fields. |
+| JSON | `manifest.json` | Complete run manifest. The single source of truth for the analysis run. Contains video metadata, processing stats, file paths to all outputs, and inline key points, action items, diagram metadata, and screen captures. |
+
+### Knowledge graph JSON structure
+
+```json
+{
+  "entities": [
+    {
+      "name": "Kubernetes",
+      "type": "technology",
+      "descriptions": ["Container orchestration platform discussed in architecture review"],
+      "occurrences": [
+        {"source": "video:recording.mp4", "timestamp": "00:05:23"}
+      ]
+    }
+  ],
+  "relationships": [
+    {
+      "source": "Kubernetes",
+      "target": "Docker",
+      "type": "DEPENDS_ON",
+      "descriptions": ["Kubernetes uses Docker as container runtime"]
+    }
+  ]
+}
+```
+
+---
 
 ## Charts
 
-When chart data is extracted from diagrams (bar, line, pie, scatter), PlanOpticon reproduces them:
+When chart data is extracted from diagrams (bar charts, line charts, pie charts, scatter plots), PlanOpticon reproduces them as standalone image files.
+
+| Format | File | Description |
+|--------|------|-------------|
+| SVG | `diagrams/chart_N.svg` | Vector chart rendered via matplotlib. Suitable for embedding in documents or scaling to any size. |
+| PNG | `diagrams/chart_N.png` | Raster chart rendered via matplotlib at 150 DPI. |
+
+Reproduced charts are also embedded inline in the HTML and PDF reports.
+
+---
+
+## Knowledge Graph Exports
+
+Beyond the default `knowledge_graph.db` and `knowledge_graph.json` produced during analysis, PlanOpticon supports exporting knowledge graphs to several additional formats via the `planopticon export` and `planopticon kg convert` commands.
+
+| Format | Command / File | Description |
+|--------|---------------|-------------|
+| JSON | `knowledge_graph.json` | Default JSON export. Produced automatically alongside the `.db` file. |
+| SQLite | `knowledge_graph.db` | Primary database format. Can be converted to/from JSON with `planopticon kg convert`. |
+| GraphML | `output.graphml` | XML-based graph format via `planopticon kg convert kg.db output.graphml`. Compatible with Gephi, yEd, Cytoscape, and other graph visualization tools. |
+| CSV | `export/entities.csv`, `export/relationships.csv` | Tabular export via `planopticon export markdown kg.db --type csv`. Produces separate CSV files for entities and relationships. |
+| Mermaid | Inline in reports | Mermaid graph diagrams are embedded in Markdown and HTML reports. Also available programmatically via `GraphQueryEngine.to_mermaid()`. |
+
+### Converting between formats
+
+```bash
+# SQLite to JSON
+planopticon kg convert results/knowledge_graph.db output.json
+
+# JSON to SQLite
+planopticon kg convert knowledge_graph.json knowledge_graph.db
+
+# Sync both directions (updates the stale file)
+planopticon kg sync results/knowledge_graph.db
+planopticon kg sync knowledge_graph.db knowledge_graph.json --direction db-to-json
+```
+
+---
+
+## PlanOpticonExchange Format
+
+The PlanOpticonExchange format (`.json`) is a canonical interchange payload designed for sharing knowledge graphs between PlanOpticon instances, teams, or external systems.
+
+```bash
+planopticon export exchange knowledge_graph.db
+planopticon export exchange kg.db -o exchange.json --name "My Project"
+```
+
+The exchange payload includes:
+
+- **Schema version** for forward compatibility
+- **Project metadata** (name, description)
+- **Full entity and relationship data** with provenance
+- **Source tracking** for multi-source graphs
+- **Merge support** -- exchange files can be merged together, deduplicating entities by name
+
+### Exchange JSON structure
+
+```json
+{
+  "schema_version": "1.0",
+  "project": {
+    "name": "Sprint Reviews Q4",
+    "description": "Knowledge extracted from Q4 sprint review recordings"
+  },
+  "entities": [...],
+  "relationships": [...],
+  "sources": [...]
+}
+```
+
+---
+
+## Document Exports
+
+PlanOpticon can generate structured Markdown documents from any knowledge graph, with no API key required. These are pure template-based outputs derived from the graph data.
+
+### Markdown document types
+
+There are seven document types plus a CSV export, all generated via `planopticon export markdown`:
+
+| Type | File | Description |
+|------|------|-------------|
+| `summary` | `executive_summary.md` | High-level executive summary with entity counts, top relationships, and key themes. |
+| `meeting-notes` | `meeting_notes.md` | Structured meeting notes with attendees, topics discussed, decisions made, and action items. |
+| `glossary` | `glossary.md` | Alphabetical glossary of all entities with descriptions and types. |
+| `relationship-map` | `relationship_map.md` | Textual and Mermaid-based relationship map showing how entities connect. |
+| `status-report` | `status_report.md` | Status report format with progress indicators, risks, and next steps. |
+| `entity-index` | `entity_index.md` | Comprehensive index of all entities grouped by type, with links to individual briefs. |
+| `entity-brief` | `entities/<Name>.md` | One-pager brief for each entity, showing descriptions, relationships, and source references. |
+| `csv` | `entities.csv` | Tabular CSV export of entities and relationships. |
+
+```bash
+# Generate all document types
+planopticon export markdown knowledge_graph.db
+
+# Generate specific types
+planopticon export markdown kg.db -o ./docs --type summary --type glossary
+
+# Generate meeting notes and CSV
+planopticon export markdown kg.db --type meeting-notes --type csv
+```
+
+### Obsidian vault export
+
+Exports the knowledge graph as an Obsidian-compatible vault with YAML frontmatter, `[[wiki-links]]` between entities, and proper folder structure.
+
+```bash
+planopticon export obsidian knowledge_graph.db -o ./my-vault
+```
+
+The vault includes:
+
+- One note per entity with frontmatter (`type`, `aliases`, `tags`)
+- Wiki-links between related entities
+- A `_Index.md` file for navigation
+- Compatible with Obsidian graph view
+
+### Notion markdown export
+
+Exports as Notion-compatible Markdown with a CSV database file for import into Notion databases.
+
+```bash
+planopticon export notion knowledge_graph.db -o ./notion-export
+```
+
+### GitHub wiki export
+
+Generates a complete GitHub wiki with a sidebar, home page, and per-entity pages. Can be pushed directly to a GitHub wiki repository.
+
+```bash
+# Generate wiki pages
+planopticon wiki generate knowledge_graph.db -o ./wiki
+
+# Push to GitHub
+planopticon wiki push ./wiki ConflictHQ/PlanOpticon -m "Update wiki from KG"
+```
+
+---
+
+## Batch Outputs
+
+Batch processing produces additional files at the batch root directory, alongside per-video output folders.
+
+| Format | File | Description |
+|--------|------|-------------|
+| JSON | `batch_manifest.json` | Batch-level manifest with aggregate stats, per-video status (completed/failed), error details, and paths to all sub-outputs. |
+| Markdown | `batch_summary.md` | Aggregated summary report with combined key points, action items, entity counts, and a Mermaid diagram of the merged knowledge graph. |
+| SQLite | `knowledge_graph.db` | Merged knowledge graph combining entities and relationships across all successfully processed videos. Uses fuzzy matching and conflict resolution. |
+| JSON | `knowledge_graph.json` | JSON export of the merged knowledge graph. |
+
+---
+
+## Self-Contained HTML Viewer
+
+PlanOpticon ships with a self-contained interactive knowledge graph viewer at `knowledge-base/viewer.html` in the repository. This file:
+
+- Uses D3.js (bundled inline, no CDN dependency)
+- Renders an interactive force-directed graph visualization
+- Supports node filtering by entity type
+- Shows entity details and relationships on click
+- Can load any `knowledge_graph.json` file
+- Works offline with no server required -- just open in a browser
+- Covers approximately 80% of graph exploration needs with zero infrastructure
+
+---
+
+## Output Directory Structure
+
+A complete single-video analysis produces the following directory tree:
+
+```
+output/
+├── manifest.json              # Run manifest (source of truth)
+├── transcript/
+│   ├── transcript.json        # Full transcript with segments
+│   ├── transcript.txt         # Plain text
+│   └── transcript.srt         # Subtitles
+├── frames/
+│   ├── frame_0000.jpg         # Extracted video frames
+│   ├── frame_0001.jpg
+│   └── ...
+├── diagrams/
+│   ├── diagram_0.jpg          # Original frame
+│   ├── diagram_0.mermaid      # Mermaid source
+│   ├── diagram_0.svg          # Vector rendering
+│   ├── diagram_0.png          # Raster rendering
+│   ├── diagram_0.json         # Analysis data
+│   ├── chart_0.svg            # Reproduced chart (SVG)
+│   ├── chart_0.png            # Reproduced chart (PNG)
+│   └── ...
+├── captures/
+│   ├── capture_0.jpg          # Medium-confidence screenshots
+│   ├── capture_0.json         # Caption and metadata
+│   └── ...
+└── results/
+    ├── analysis.md            # Markdown report
+    ├── analysis.html          # HTML report
+    ├── analysis.pdf           # PDF report (if planopticon[pdf] installed)
+    ├── knowledge_graph.db     # Knowledge graph (SQLite, primary)
+    ├── knowledge_graph.json   # Knowledge graph (JSON export)
+    ├── key_points.json        # Extracted key points
+    └── action_items.json      # Action items
+```
+
+---
+
+## Controlling Output Format
+
+Use the `--output-format` flag with `planopticon analyze` to control how results are presented:
+
+| Value | Behavior |
+|-------|----------|
+| `default` | Writes all output files to disk and prints a usage summary to stdout. |
+| `json` | Writes all output files to disk and also emits the complete `VideoManifest` as structured JSON to stdout. Useful for piping into other tools or CI/CD pipelines. |
+
+```bash
+# Standard output (files + console summary)
+planopticon analyze -i video.mp4 -o ./output
 
-- SVG + PNG via matplotlib
-- Embedded in HTML/PDF reports
+# JSON manifest to stdout (for scripting)
+planopticon analyze -i video.mp4 -o ./output --output-format json
+```
 
 
ADDED docs/guide/planning-agent.md
--- docs/guide/output-formats.md
+++ docs/guide/output-formats.md
@@ -1,47 +1,329 @@
1 # Output Formats
2
3 PlanOpticon produces multiple output formats from each analysis run.
 
 
4
5 ## Transcripts
6
 
 
7 | Format | File | Description |
8 |--------|------|-------------|
9 | JSON | `transcript/transcript.json` | Full transcript with segments, timestamps, speakers |
10 | Text | `transcript/transcript.txt` | Plain text transcript |
11 | SRT | `transcript/transcript.srt` | Subtitle format with timestamps |
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
12
13 ## Reports
14
 
 
15 | Format | File | Description |
16 |--------|------|-------------|
17 | Markdown | `results/analysis.md` | Structured report with diagrams |
18 | HTML | `results/analysis.html` | Self-contained HTML with mermaid.js |
19 | PDF | `results/analysis.pdf` | Print-ready PDF (requires `planopticon[pdf]`) |
 
 
20
21 ## Diagrams
22
23 Each detected diagram produces:
24
25 | Format | File | Description |
26 |--------|------|-------------|
27 | JPEG | `diagrams/diagram_N.jpg` | Original frame |
28 | Mermaid | `diagrams/diagram_N.mermaid` | Mermaid source code |
29 | SVG | `diagrams/diagram_N.svg` | Vector rendering |
30 | PNG | `diagrams/diagram_N.png` | Raster rendering |
31 | JSON | `diagrams/diagram_N.json` | Structured analysis data |
 
 
 
 
 
 
32
33 ## Structured data
34
35 | Format | File | Description |
36 |--------|------|-------------|
37 | JSON | `results/knowledge_graph.json` | Entities and relationships |
38 | JSON | `results/key_points.json` | Extracted key points |
39 | JSON | `results/action_items.json` | Action items with assignees |
40 | JSON | `manifest.json` | Complete run manifest |
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
41
42 ## Charts
43
44 When chart data is extracted from diagrams (bar, line, pie, scatter), PlanOpticon reproduces them:
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
45
46 - SVG + PNG via matplotlib
47 - Embedded in HTML/PDF reports
 
48
49 DDED docs/guide/planning-agent.md
--- docs/guide/output-formats.md
+++ docs/guide/output-formats.md
@@ -1,47 +1,329 @@
1 # Output Formats
2
3 PlanOpticon produces a wide range of output formats from video analysis, document ingestion, batch processing, knowledge graph export, and agent skills. This page is the comprehensive reference for every format the tool can emit.
4
5 ---
6
7 ## Transcripts
8
9 Video analysis always produces transcripts in three formats, stored in the `transcript/` subdirectory of the output folder.
10
11 | Format | File | Description |
12 |--------|------|-------------|
13 | JSON | `transcript/transcript.json` | Full transcript with segments, timestamps, speaker labels, and confidence scores. Each segment includes `start`, `end`, `text`, and optional `speaker` fields. |
14 | Text | `transcript/transcript.txt` | Plain text transcript with no metadata. Suitable for feeding into other tools or reading directly. |
15 | SRT | `transcript/transcript.srt` | SubRip subtitle format with sequential numbering and `HH:MM:SS,mmm` timestamps. Can be loaded into video players or subtitle editors. |
16
17 ### Transcript JSON structure
18
19 ```json
20 {
21 "segments": [
22 {
23 "start": 0.0,
24 "end": 4.5,
25 "text": "Welcome to the sprint review.",
26 "speaker": "Alice"
27 }
28 ],
29 "text": "Welcome to the sprint review. ...",
30 "language": "en"
31 }
32 ```
33
34 When the `--speakers` flag is provided (e.g., `--speakers "Alice,Bob,Carol"`), speaker diarization hints are passed to the transcription provider and speaker labels appear in the JSON segments.
35
36 ---
37
38 ## Reports
39
40 Analysis reports are generated from the combined transcript, diagrams, key points, action items, and knowledge graph. They live in the `results/` subdirectory.
41
42 | Format | File | Description |
43 |--------|------|-------------|
44 | Markdown | `results/analysis.md` | Structured report with embedded Mermaid diagram blocks, tables, and cross-references. Works in any Markdown renderer. |
45 | HTML | `results/analysis.html` | Self-contained HTML page with inline CSS, embedded SVG diagrams, and a bundled mermaid.js script for rendering any unrendered Mermaid blocks. No external dependencies required to view. |
46 | PDF | `results/analysis.pdf` | Print-ready PDF. Requires the `planopticon[pdf]` extra (`pip install planopticon[pdf]`). Generated from the HTML report. |
47
48 ---
49
50 ## Diagrams
51
52 Each visual element detected during frame analysis produces up to five output files in the `diagrams/` subdirectory. The index `N` is zero-based.
53
54 | Format | File | Description |
55 |--------|------|-------------|
56 | JPEG | `diagrams/diagram_N.jpg` | Original video frame captured at the point of detection. |
57 | Mermaid | `diagrams/diagram_N.mermaid` | Mermaid source code reconstructed from the diagram by the vision model. Supports flowcharts, sequence diagrams, architecture diagrams, and more. |
58 | SVG | `diagrams/diagram_N.svg` | Vector rendering of the Mermaid source, produced by the Mermaid CLI or built-in renderer. |
59 | PNG | `diagrams/diagram_N.png` | Raster rendering of the Mermaid source at high resolution. |
60 | JSON | `diagrams/diagram_N.json` | Structured analysis data including diagram type, description, extracted text, chart data (if applicable), and confidence score. |
61
62 Frames that score as medium confidence are saved as captioned screenshots in the `captures/` subdirectory instead, with a `capture_N.jpg` and `capture_N.json` pair.
63
64 ---
65
66 ## Structured Data
67
68 Core analysis artifacts are stored as JSON files in the `results/` subdirectory.
69
70 | Format | File | Description |
71 |--------|------|-------------|
72 | SQLite | `results/knowledge_graph.db` | Primary knowledge graph database. SQLite-based, queryable with `planopticon query`. Contains entities, relationships, source provenance, and metadata. This is the preferred format for querying and merging. |
73 | JSON | `results/knowledge_graph.json` | JSON export of the knowledge graph. Contains `entities` and `relationships` arrays. Automatically kept in sync with the `.db` file. Used as a fallback when SQLite is not available. |
74 | JSON | `results/key_points.json` | Array of extracted key points, each with `text`, `category`, and `confidence` fields. |
75 | JSON | `results/action_items.json` | Array of action items, each with `text`, `assignee`, `due_date`, `priority`, and `status` fields. |
76 | JSON | `manifest.json` | Complete run manifest. The single source of truth for the analysis run. Contains video metadata, processing stats, file paths to all outputs, and inline key points, action items, diagram metadata, and screen captures. |
77
78 ### Knowledge graph JSON structure
79
80 ```json
81 {
82 "entities": [
83 {
84 "name": "Kubernetes",
85 "type": "technology",
86 "descriptions": ["Container orchestration platform discussed in architecture review"],
87 "occurrences": [
88 {"source": "video:recording.mp4", "timestamp": "00:05:23"}
89 ]
90 }
91 ],
92 "relationships": [
93 {
94 "source": "Kubernetes",
95 "target": "Docker",
96 "type": "DEPENDS_ON",
97 "descriptions": ["Kubernetes uses Docker as container runtime"]
98 }
99 ]
100 }
101 ```
102
103 ---
104
105 ## Charts
106
107 When chart data is extracted from diagrams (bar charts, line charts, pie charts, scatter plots), PlanOpticon reproduces them as standalone image files.
108
109 | Format | File | Description |
110 |--------|------|-------------|
111 | SVG | `diagrams/chart_N.svg` | Vector chart rendered via matplotlib. Suitable for embedding in documents or scaling to any size. |
112 | PNG | `diagrams/chart_N.png` | Raster chart rendered via matplotlib at 150 DPI. |
113
114 Reproduced charts are also embedded inline in the HTML and PDF reports.
115
116 ---
117
118 ## Knowledge Graph Exports
119
120 Beyond the default `knowledge_graph.db` and `knowledge_graph.json` produced during analysis, PlanOpticon supports exporting knowledge graphs to several additional formats via the `planopticon export` and `planopticon kg convert` commands.
121
122 | Format | Command / File | Description |
123 |--------|---------------|-------------|
124 | JSON | `knowledge_graph.json` | Default JSON export. Produced automatically alongside the `.db` file. |
125 | SQLite | `knowledge_graph.db` | Primary database format. Can be converted to/from JSON with `planopticon kg convert`. |
126 | GraphML | `output.graphml` | XML-based graph format via `planopticon kg convert kg.db output.graphml`. Compatible with Gephi, yEd, Cytoscape, and other graph visualization tools. |
127 | CSV | `export/entities.csv`, `export/relationships.csv` | Tabular export via `planopticon export markdown kg.db --type csv`. Produces separate CSV files for entities and relationships. |
128 | Mermaid | Inline in reports | Mermaid graph diagrams are embedded in Markdown and HTML reports. Also available programmatically via `GraphQueryEngine.to_mermaid()`. |
129
130 ### Converting between formats
131
132 ```bash
133 # SQLite to JSON
134 planopticon kg convert results/knowledge_graph.db output.json
135
136 # JSON to SQLite
137 planopticon kg convert knowledge_graph.json knowledge_graph.db
138
139 # Sync both directions (updates the stale file)
140 planopticon kg sync results/knowledge_graph.db
141 planopticon kg sync knowledge_graph.db knowledge_graph.json --direction db-to-json
142 ```
143
144 ---
145
146 ## PlanOpticonExchange Format
147
148 The PlanOpticonExchange format (`.json`) is a canonical interchange payload designed for sharing knowledge graphs between PlanOpticon instances, teams, or external systems.
149
150 ```bash
151 planopticon export exchange knowledge_graph.db
152 planopticon export exchange kg.db -o exchange.json --name "My Project"
153 ```
154
155 The exchange payload includes:
156
157 - **Schema version** for forward compatibility
158 - **Project metadata** (name, description)
159 - **Full entity and relationship data** with provenance
160 - **Source tracking** for multi-source graphs
161 - **Merge support** -- exchange files can be merged together, deduplicating entities by name
162
163 ### Exchange JSON structure
164
165 ```json
166 {
167 "schema_version": "1.0",
168 "project": {
169 "name": "Sprint Reviews Q4",
170 "description": "Knowledge extracted from Q4 sprint review recordings"
171 },
172 "entities": [...],
173 "relationships": [...],
174 "sources": [...]
175 }
176 ```
177
178 ---
179
180 ## Document Exports
181
182 PlanOpticon can generate structured Markdown documents from any knowledge graph, with no API key required. These are pure template-based outputs derived from the graph data.
183
184 ### Markdown document types
185
186 There are seven document types plus a CSV export, all generated via `planopticon export markdown`:
187
188 | Type | File | Description |
189 |------|------|-------------|
190 | `summary` | `executive_summary.md` | High-level executive summary with entity counts, top relationships, and key themes. |
191 | `meeting-notes` | `meeting_notes.md` | Structured meeting notes with attendees, topics discussed, decisions made, and action items. |
192 | `glossary` | `glossary.md` | Alphabetical glossary of all entities with descriptions and types. |
193 | `relationship-map` | `relationship_map.md` | Textual and Mermaid-based relationship map showing how entities connect. |
194 | `status-report` | `status_report.md` | Status report format with progress indicators, risks, and next steps. |
195 | `entity-index` | `entity_index.md` | Comprehensive index of all entities grouped by type, with links to individual briefs. |
196 | `entity-brief` | `entities/<Name>.md` | One-pager brief for each entity, showing descriptions, relationships, and source references. |
197 | `csv` | `entities.csv` | Tabular CSV export of entities and relationships. |
198
199 ```bash
200 # Generate all document types
201 planopticon export markdown knowledge_graph.db
202
203 # Generate specific types
204 planopticon export markdown kg.db -o ./docs --type summary --type glossary
205
206 # Generate meeting notes and CSV
207 planopticon export markdown kg.db --type meeting-notes --type csv
208 ```
209
210 ### Obsidian vault export
211
212 Exports the knowledge graph as an Obsidian-compatible vault with YAML frontmatter, `[[wiki-links]]` between entities, and proper folder structure.
213
214 ```bash
215 planopticon export obsidian knowledge_graph.db -o ./my-vault
216 ```
217
218 The vault includes:
219
220 - One note per entity with frontmatter (`type`, `aliases`, `tags`)
221 - Wiki-links between related entities
222 - A `_Index.md` file for navigation
223 - Compatible with Obsidian graph view
224
225 ### Notion markdown export
226
227 Exports as Notion-compatible Markdown with a CSV database file for import into Notion databases.
228
229 ```bash
230 planopticon export notion knowledge_graph.db -o ./notion-export
231 ```
232
233 ### GitHub wiki export
234
235 Generates a complete GitHub wiki with a sidebar, home page, and per-entity pages. Can be pushed directly to a GitHub wiki repository.
236
237 ```bash
238 # Generate wiki pages
239 planopticon wiki generate knowledge_graph.db -o ./wiki
240
241 # Push to GitHub
242 planopticon wiki push ./wiki ConflictHQ/PlanOpticon -m "Update wiki from KG"
243 ```
244
245 ---
246
247 ## Batch Outputs
248
249 Batch processing produces additional files at the batch root directory, alongside per-video output folders.
250
251 | Format | File | Description |
252 |--------|------|-------------|
253 | JSON | `batch_manifest.json` | Batch-level manifest with aggregate stats, per-video status (completed/failed), error details, and paths to all sub-outputs. |
254 | Markdown | `batch_summary.md` | Aggregated summary report with combined key points, action items, entity counts, and a Mermaid diagram of the merged knowledge graph. |
255 | SQLite | `knowledge_graph.db` | Merged knowledge graph combining entities and relationships across all successfully processed videos. Uses fuzzy matching and conflict resolution. |
256 | JSON | `knowledge_graph.json` | JSON export of the merged knowledge graph. |
257
---

## Self-Contained HTML Viewer

PlanOpticon ships with a self-contained interactive knowledge graph viewer at `knowledge-base/viewer.html` in the repository. This file:

- Uses D3.js (bundled inline, no CDN dependency)
- Renders an interactive force-directed graph visualization
- Supports node filtering by entity type
- Shows entity details and relationships on click
- Can load any `knowledge_graph.json` file
- Works offline with no server required -- just open it in a browser
- Covers approximately 80% of graph exploration needs with zero infrastructure

---

## Output Directory Structure

A complete single-video analysis produces the following directory tree:

```
output/
├── manifest.json              # Run manifest (source of truth)
├── transcript/
│   ├── transcript.json        # Full transcript with segments
│   ├── transcript.txt         # Plain text
│   └── transcript.srt         # Subtitles
├── frames/
│   ├── frame_0000.jpg         # Extracted video frames
│   ├── frame_0001.jpg
│   └── ...
├── diagrams/
│   ├── diagram_0.jpg          # Original frame
│   ├── diagram_0.mermaid      # Mermaid source
│   ├── diagram_0.svg          # Vector rendering
│   ├── diagram_0.png          # Raster rendering
│   ├── diagram_0.json         # Analysis data
│   ├── chart_0.svg            # Reproduced chart (SVG)
│   ├── chart_0.png            # Reproduced chart (PNG)
│   └── ...
├── captures/
│   ├── capture_0.jpg          # Medium-confidence screenshots
│   ├── capture_0.json         # Caption and metadata
│   └── ...
└── results/
    ├── analysis.md            # Markdown report
    ├── analysis.html          # HTML report
    ├── analysis.pdf           # PDF report (if planopticon[pdf] installed)
    ├── knowledge_graph.db     # Knowledge graph (SQLite, primary)
    ├── knowledge_graph.json   # Knowledge graph (JSON export)
    ├── key_points.json        # Extracted key points
    └── action_items.json      # Action items
```

---

## Controlling Output Format

Use the `--output-format` flag with `planopticon analyze` to control how results are presented:

| Value | Behavior |
|-------|----------|
| `default` | Writes all output files to disk and prints a usage summary to stdout. |
| `json` | Writes all output files to disk and also emits the complete `VideoManifest` as structured JSON to stdout. Useful for piping into other tools or CI/CD pipelines. |

```bash
# Standard output (files + console summary)
planopticon analyze -i video.mp4 -o ./output

# JSON manifest to stdout (for scripting)
planopticon analyze -i video.mp4 -o ./output --output-format json
```

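The `json` mode is built for piping. A minimal downstream consumer might parse the manifest from stdout and inspect its fields; the helper below shows the shape of such a consumer, using a stand-in manifest string whose keys are hypothetical and not the real `VideoManifest` schema.

```python
import json

def summarize_manifest(raw: str) -> list[str]:
    """Return the sorted top-level keys of a JSON manifest emitted on stdout."""
    return sorted(json.loads(raw).keys())

# In a real pipeline you would pass sys.stdin.read() here; for illustration,
# a stand-in manifest (keys are hypothetical, not the actual schema):
print(summarize_manifest('{"source": "video.mp4", "results": {}}'))  # -> ['results', 'source']
```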
ADDED docs/guide/planning-agent.md
--- a/docs/guide/planning-agent.md
+++ b/docs/guide/planning-agent.md
@@ -0,0 +1,425 @@
# Planning Agent

The Planning Agent is PlanOpticon's AI-powered system for synthesizing knowledge graph content into structured planning artifacts. It takes extracted entities and relationships from video analyses, document ingestions, and other sources, then uses LLM reasoning to produce project plans, PRDs, roadmaps, task breakdowns, GitHub issues, and more.

---

## How It Works

The Planning Agent operates through a three-stage pipeline:

### 1. Context Assembly

The agent gathers context from all available sources:

- **Knowledge graph** -- entity counts, types, relationships, and planning entities from the loaded KG
- **Query engine** -- used to pull stats, entity lists, and relationship data for prompt construction
- **Provider manager** -- the configured LLM provider used for generation
- **Prior artifacts** -- any artifacts already generated in the session (skills can chain off each other)
- **Conversation history** -- accumulated chat messages when running in interactive mode

This context is bundled into an `AgentContext` dataclass that is shared across all skills.

### 2. Skill Selection

When the agent receives a user request, it determines which skills to run:

**LLM-driven planning (with provider).** The agent constructs a prompt that includes the knowledge base summary, all available skill names and descriptions, and the user's request. The LLM returns a JSON array of skill names to execute in order, along with any parameters. For example, given "Create a project plan and break it into tasks," the LLM might select `["project_plan", "task_breakdown"]`.

**Keyword fallback (without provider).** If no LLM provider is available, the agent falls back to simple keyword matching. It splits each skill name on underscores and checks whether any of those words appear in the user's request. For example, the request "generate a roadmap" would match the `roadmap` skill because "roadmap" appears in both the request and the skill name.

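The fallback rule can be sketched in a few lines. This is an illustrative reimplementation of the matching behavior as described, not the agent's actual code:

```python
def keyword_match(request: str, skill_names: list[str]) -> list[str]:
    """Select skills whose underscore-separated name parts appear in the request."""
    words = set(request.lower().split())
    return [
        name for name in skill_names
        if any(part in words for part in name.split("_"))
    ]

skills = ["project_plan", "roadmap", "task_breakdown", "github_issues"]
print(keyword_match("generate a roadmap", skills))       # -> ['roadmap']
print(keyword_match("plan the next milestone", skills))  # -> ['project_plan']
```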
### 3. Execution

Selected skills are executed sequentially. Each skill:

1. Checks `can_execute()` to verify the required context is available (by default, both a knowledge graph and an LLM provider must be present)
2. Pulls relevant data from the knowledge graph via the query engine
3. Constructs a detailed prompt for the LLM with extracted context
4. Calls the LLM and parses the response
5. Returns an `Artifact` object containing the generated content

Each artifact is appended to `context.artifacts`, making it available to subsequent skills. This enables chaining -- for example, `task_breakdown` can feed into `github_issues`.

---

## AgentContext

The `AgentContext` dataclass is the shared state object that connects all components of the planning agent system.

```python
@dataclass
class AgentContext:
    knowledge_graph: Any = None        # KnowledgeGraph instance
    query_engine: Any = None           # GraphQueryEngine instance
    provider_manager: Any = None       # ProviderManager instance
    planning_entities: List[Any] = field(default_factory=list)
    user_requirements: Dict[str, Any] = field(default_factory=dict)
    conversation_history: List[Dict[str, str]] = field(default_factory=list)
    artifacts: List[Artifact] = field(default_factory=list)
    config: Dict[str, Any] = field(default_factory=dict)
```

| Field | Purpose |
|---|---|
| `knowledge_graph` | The loaded `KnowledgeGraph` instance; provides access to entities, relationships, and graph operations |
| `query_engine` | A `GraphQueryEngine` for running structured queries (stats, entities, neighbors, relationships) |
| `provider_manager` | The `ProviderManager` that handles LLM API calls across providers |
| `planning_entities` | Entities classified into the planning taxonomy (goals, requirements, risks, etc.) |
| `user_requirements` | Structured requirements gathered from the `requirements_chat` skill |
| `conversation_history` | Accumulated chat messages for interactive sessions |
| `artifacts` | All artifacts generated during the session, enabling skill chaining |
| `config` | Arbitrary configuration overrides |

---

## Artifacts

Every skill returns an `Artifact` dataclass:

```python
@dataclass
class Artifact:
    name: str                  # Human-readable name (e.g., "Project Plan")
    content: str               # The generated content (markdown, JSON, etc.)
    artifact_type: str         # Type identifier: "project_plan", "prd", "roadmap", etc.
    format: str = "markdown"   # Content format: "markdown", "json", "mermaid"
    metadata: Dict[str, Any] = field(default_factory=dict)
```

Artifacts are the currency of the agent system. They can be:

- Displayed directly in the Companion REPL
- Exported to disk via the `artifact_export` skill
- Pushed to external tools via the `cli_adapter` skill
- Chained into other skills (e.g., task breakdown feeds into GitHub issues)

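To make the chaining concrete, here is a runnable sketch using minimal stand-ins for the two dataclasses above (full field sets omitted). The `latest_artifact` helper is invented for illustration; the point is that a downstream skill can locate an upstream result by scanning `context.artifacts` for a type:

```python
from dataclasses import dataclass, field

# Minimal stand-ins for the dataclasses shown above.
@dataclass
class Artifact:
    name: str
    content: str
    artifact_type: str
    format: str = "markdown"

@dataclass
class AgentContext:
    artifacts: list = field(default_factory=list)

def latest_artifact(context: AgentContext, artifact_type: str):
    """How a downstream skill can find an upstream result by type."""
    for artifact in reversed(context.artifacts):
        if artifact.artifact_type == artifact_type:
            return artifact
    return None

ctx = AgentContext()
ctx.artifacts.append(Artifact("Task Breakdown", '[{"id": "T1"}]', "task_list", "json"))
tasks = latest_artifact(ctx, "task_list")
print(tasks.name)  # -> Task Breakdown
```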
---

## Skills Reference

The agent ships with 11 built-in skills. Each skill is a class that extends `Skill` and self-registers at import time via `register_skill()`.

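The self-registration pattern looks roughly like the following; the `Skill` base class and `register_skill()` here are simplified stand-ins, not PlanOpticon's actual definitions:

```python
# Simplified sketch of the import-time registration pattern.
SKILL_REGISTRY: dict[str, "Skill"] = {}

def register_skill(skill: "Skill") -> None:
    SKILL_REGISTRY[skill.name] = skill

class Skill:
    name = "base"
    description = ""

    def can_execute(self, context) -> bool:
        # Real skills require a knowledge graph and an LLM provider by default.
        return True

    def execute(self, context):
        raise NotImplementedError

class ProjectPlanSkill(Skill):
    name = "project_plan"
    description = "Generate a structured project plan from knowledge graph."

# Module-level call: merely importing the module registers the skill.
register_skill(ProjectPlanSkill())

print(sorted(SKILL_REGISTRY))  # -> ['project_plan']
```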
### project_plan

**Description:** Generate a structured project plan from knowledge graph.

Pulls the full knowledge graph context (stats, entities, relationships, and planning entities grouped by type) and asks the LLM to produce a comprehensive project plan with:

1. Executive Summary
2. Goals and Objectives
3. Scope
4. Phases and Milestones
5. Resource Requirements
6. Risks and Mitigations
7. Success Criteria

**Artifact type:** `project_plan` | **Format:** markdown

### prd

**Description:** Generate a product requirements document (PRD) / feature spec.

Filters planning entities to those of type `requirement`, `feature`, and `constraint`, then asks the LLM to generate a PRD with:

1. Problem Statement
2. User Stories
3. Functional Requirements
4. Non-Functional Requirements
5. Acceptance Criteria
6. Out of Scope

If no pre-filtered entities match, the LLM derives requirements from the full knowledge graph context.

**Artifact type:** `prd` | **Format:** markdown

### roadmap

**Description:** Generate a product/project roadmap.

Focuses on planning entities of type `milestone`, `feature`, and `dependency`. Asks the LLM to produce a roadmap with:

1. Vision and Strategy
2. Phases (with timeline estimates)
3. Key Dependencies
4. A Mermaid Gantt chart summarizing the timeline

**Artifact type:** `roadmap` | **Format:** markdown

### task_breakdown

**Description:** Break down goals into tasks with dependencies.

Focuses on planning entities of type `goal`, `feature`, and `milestone`. Returns a JSON array of task objects, each containing:

| Field | Type | Description |
|---|---|---|
| `id` | string | Task identifier (e.g., "T1", "T2") |
| `title` | string | Short task title |
| `description` | string | Detailed description |
| `depends_on` | list | IDs of prerequisite tasks |
| `priority` | string | `high`, `medium`, or `low` |
| `estimate` | string | Effort estimate (e.g., "2d", "1w") |
| `assignee_role` | string | Role needed to perform the task |

**Artifact type:** `task_list` | **Format:** json

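A task object matching that schema might look like the sample below; all values are invented for illustration. The snippet validates the sample against the documented field names:

```python
import json

# Invented sample; field names follow the task_breakdown schema above.
sample = """
[
  {
    "id": "T1",
    "title": "Set up authentication service",
    "description": "Provision the auth service and wire up OAuth flows.",
    "depends_on": [],
    "priority": "high",
    "estimate": "1w",
    "assignee_role": "backend engineer"
  }
]
"""

expected_fields = {"id", "title", "description", "depends_on",
                   "priority", "estimate", "assignee_role"}
tasks = json.loads(sample)
assert all(set(task) == expected_fields for task in tasks)
print(tasks[0]["id"], tasks[0]["priority"])  # -> T1 high
```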
### github_issues

**Description:** Generate GitHub issues from task breakdown.

Converts tasks into GitHub-ready issue objects. If a `task_list` artifact exists in the context, it is used as input. Otherwise, minimal issues are generated from the planning entities directly.

Each issue includes a formatted body with description, priority, estimate, and dependencies, plus labels derived from the task priority.

The skill also provides a `push_to_github(issues_json, repo)` function that shells out to the `gh` CLI to create actual issues. This is used by the `cli_adapter` skill.

**Artifact type:** `issues` | **Format:** json

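Shelling out to `gh` for each issue might look like the sketch below. This is an assumed implementation, not PlanOpticon's actual function; only standard `gh issue create` flags are used, and the command is built as an argument list so titles with spaces need no quoting:

```python
import json
import subprocess

def push_to_github(issues_json: str, repo: str, dry_run: bool = True) -> list[list[str]]:
    """Build (and optionally run) one `gh issue create` command per issue."""
    commands = []
    for issue in json.loads(issues_json):
        cmd = [
            "gh", "issue", "create",
            "--repo", repo,
            "--title", issue["title"],
            "--body", issue.get("body", ""),
        ]
        for label in issue.get("labels", []):
            cmd += ["--label", label]
        commands.append(cmd)
        if not dry_run:
            subprocess.run(cmd, check=True)  # requires an authenticated gh CLI
    return commands

cmds = push_to_github(
    '[{"title": "Set up auth", "body": "...", "labels": ["priority:high"]}]',
    "ConflictHQ/PlanOpticon",
)
print(cmds[0][:3])  # -> ['gh', 'issue', 'create']
```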
### requirements_chat

**Description:** Interactive requirements gathering via guided questions.

Generates a structured requirements questionnaire based on the knowledge graph context. The questionnaire contains 8-12 targeted questions, each with:

| Field | Type | Description |
|---|---|---|
| `id` | string | Question identifier (e.g., "Q1") |
| `category` | string | `goals`, `constraints`, `priorities`, or `scope` |
| `question` | string | The question text |
| `context` | string | Why this question matters |

The skill also provides a `gather_requirements(context, answers)` method that takes the completed Q&A and synthesizes structured requirements (goals, constraints, priorities, scope).

**Artifact type:** `requirements` | **Format:** json

### doc_generator

**Description:** Generate technical documentation, ADRs, or meeting notes.

Supports three document types, selected via the `doc_type` parameter:

| `doc_type` | Output Structure |
|---|---|
| `technical_doc` (default) | Overview, Architecture, Components and Interfaces, Data Flow, Deployment and Configuration, API Reference |
| `adr` | Title, Status (Proposed), Context, Decision, Consequences, Alternatives Considered |
| `meeting_notes` | Meeting Summary, Key Discussion Points, Decisions Made, Action Items (with owners), Open Questions, Next Steps |

**Artifact type:** `document` | **Format:** markdown

### artifact_export

**Description:** Export artifacts in agent-ready formats.

Writes all artifacts accumulated in the context to a directory structure. Accepts an `output_dir` parameter (defaults to `plan/`). Each artifact is written to a file based on its type:

| Artifact Type | Filename |
|---|---|
| `project_plan` | `project_plan.md` |
| `prd` | `prd.md` |
| `roadmap` | `roadmap.md` |
| `task_list` | `tasks.json` |
| `issues` | `issues.json` |
| `requirements` | `requirements.json` |
| `document` | `docs/<name>.md` |

A `manifest.json` is written alongside, listing all exported files with their names, types, and formats.

**Artifact type:** `export_manifest` | **Format:** json

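The type-to-filename mapping lends itself to a simple dispatch table. A hedged sketch of how such an exporter could work -- not the actual implementation; the fallback filename for unknown types is invented:

```python
import json
from pathlib import Path

# Mapping mirrors the table above.
FILENAMES = {
    "project_plan": "project_plan.md",
    "prd": "prd.md",
    "roadmap": "roadmap.md",
    "task_list": "tasks.json",
    "issues": "issues.json",
    "requirements": "requirements.json",
}

def export_artifacts(artifacts: list[dict], output_dir: str = "plan") -> dict:
    """Write each artifact to its mapped filename and emit a manifest."""
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    entries = []
    for a in artifacts:
        if a["artifact_type"] == "document":
            # `document` artifacts go under docs/<name>.md.
            path = out / "docs" / f"{a['name']}.md"
            path.parent.mkdir(parents=True, exist_ok=True)
        else:
            # Fallback name for unknown types is invented for this sketch.
            path = out / FILENAMES.get(a["artifact_type"], f"{a['artifact_type']}.txt")
        path.write_text(a["content"])
        entries.append({"name": a["name"], "type": a["artifact_type"], "file": str(path)})
    manifest = {"artifact_count": len(entries), "output_dir": output_dir, "files": entries}
    (out / "manifest.json").write_text(json.dumps(manifest, indent=2))
    return manifest
```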
### cli_adapter

**Description:** Push artifacts to external tools via their CLIs.

Converts artifacts into CLI commands for external project management tools. Supported tools:

| Tool | CLI | Example Command |
|---|---|---|
| `github` | `gh` | `gh issue create --title "..." --body "..." --label "..."` |
| `jira` | `jira` | `jira issue create --summary "..." --description "..."` |
| `linear` | `linear` | `linear issue create --title "..." --description "..."` |

The skill checks whether the target CLI is available on the system and includes that status in the output. Commands are generated in dry-run mode by default.

**Artifact type:** `cli_commands` | **Format:** json

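Checking whether a target CLI exists can be done with `shutil.which`. A small sketch of that availability probe -- illustrative, not the skill's actual code:

```python
import shutil

def cli_status(tools: dict[str, str]) -> dict[str, bool]:
    """Map each tool name to whether its CLI binary is on PATH."""
    return {tool: shutil.which(cli) is not None for tool, cli in tools.items()}

status = cli_status({"github": "gh", "jira": "jira", "linear": "linear"})
print(status)  # e.g. {'github': True, 'jira': False, 'linear': False}
```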
### notes_export

**Description:** Export knowledge graph as structured notes (Obsidian, Notion).

Exports the entire knowledge graph as a collection of markdown files optimized for a specific note-taking platform. Accepts a `format` parameter:

**Obsidian format** creates:

- One `.md` file per entity with YAML frontmatter, tags, and `[[wiki-links]]`
- An `_Index.md` Map of Content grouping entities by type
- Tag pages for each entity type
- Artifact notes for any generated artifacts

**Notion format** creates:

- One `.md` file per entity with Notion-style callout blocks and relationship tables
- An `entities_database.csv` for bulk import into a Notion database
- An `Overview.md` page with stats and entity listings
- Artifact pages

**Artifact type:** `notes_export` | **Format:** markdown

### wiki_generator

**Description:** Generate a GitHub wiki from knowledge graph and artifacts.

Generates a complete GitHub wiki structure as a dictionary of page names to markdown content. Creates:

- **Home** page with entity type counts and links
- **_Sidebar** navigation with entity types and artifacts
- **Type index pages** with tables of entities per type
- **Individual entity pages** with descriptions, outgoing/incoming relationships, and source occurrences
- **Artifact pages** for any generated planning artifacts

The skill also provides standalone functions `write_wiki(pages, output_dir)` to write pages to disk and `push_wiki(wiki_dir, repo)` to push directly to a GitHub wiki repository.

**Artifact type:** `wiki` | **Format:** markdown

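Since the wiki is just a dict of page names to markdown, writing it to disk is a short loop. A sketch of what `write_wiki` could look like -- assumed behavior, not the real function:

```python
from pathlib import Path

def write_wiki(pages: dict[str, str], output_dir: str) -> list[str]:
    """Write each page as <name>.md under output_dir; return the paths written."""
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for name, content in pages.items():
        path = out / f"{name}.md"
        path.write_text(content)
        written.append(str(path))
    return written
```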
---

## CLI Usage

### One-shot execution

Run the agent with a request string. The agent selects and executes appropriate skills automatically.

```bash
# Generate a project plan
planopticon agent "Create a project plan" --kb ./results

# Generate a PRD
planopticon agent "Write a PRD for the authentication system" --kb ./results

# Break down into tasks
planopticon agent "Break this into tasks and estimate effort" --kb ./results
```

### Export artifacts to disk

Use `--export` to write generated artifacts to a directory:

```bash
planopticon agent "Create a full project plan with tasks" --kb ./results --export ./output
```

### Interactive mode

Use `-I` for a multi-turn session where you can issue multiple requests:

```bash
planopticon agent -I --kb ./results
```

In interactive mode, the agent supports:

- Free-text requests (executed via LLM skill selection)
- `/plan` -- shortcut to generate a project plan
- `/skills` -- list available skills
- `quit`, `exit`, `q` -- end the session

### Provider and model options

```bash
# Use a specific provider
planopticon agent "Create a roadmap" --kb ./results -p anthropic

# Use a specific model
planopticon agent "Generate a PRD" --kb ./results --chat-model gpt-4o
```

### Auto-discovery

If `--kb` is not specified, the agent uses `KBContext.auto_discover()` to find knowledge graphs in the workspace.

---

## Using Skills from the Companion REPL

The Companion REPL provides direct access to agent skills through slash commands. See the [Companion guide](companion.md) for full details.

| Companion Command | Skill Executed |
|---|---|
| `/plan` | `project_plan` |
| `/prd` | `prd` |
| `/tasks` | `task_breakdown` |
| `/run SKILL_NAME` | Any registered skill by name |

When executed from the Companion, skills use the same `AgentContext` that powers the chat mode. This means:

- The knowledge graph loaded at startup is automatically available
- The active LLM provider (set via `/provider` or `/model`) is used for generation
- Generated artifacts accumulate across the session, enabling chaining

---

## Example Workflows

### From video to project plan

```bash
# 1. Analyze a video
planopticon analyze -i sprint-review.mp4 -o results/

# 2. Launch the agent with the results
planopticon agent "Create a comprehensive project plan with tasks and a roadmap" \
    --kb results/ --export plan/

# 3. Review the generated artifacts
ls plan/
# project_plan.md  roadmap.md  tasks.json  manifest.json
```

### Interactive planning session

```bash
$ planopticon companion --kb ./results

planopticon> /status
Workspace status:
  KG: knowledge_graph.db (58 entities, 124 relationships)
  ...

planopticon> What are the main goals discussed?
Based on the knowledge graph, the main goals are...

planopticon> /plan
--- Project Plan (project_plan) ---
...

planopticon> /tasks
--- Task Breakdown (task_list) ---
...

planopticon> /run github_issues
--- GitHub Issues (issues) ---
[
  {"title": "Set up authentication service", ...},
  ...
]

planopticon> /run artifact_export
--- Export Manifest (export_manifest) ---
{
  "artifact_count": 3,
  "output_dir": "plan",
  "files": [...]
}
```

### Skill chaining

Skills that produce artifacts make them available to subsequent skills automatically:

1. `/tasks` generates a `task_list` artifact
2. `/run github_issues` detects the existing `task_list` artifact and converts its tasks into GitHub issues
3. `/run cli_adapter` takes the most recent artifact and generates `gh issue create` commands
4. `/run artifact_export` writes all accumulated artifacts to disk with a manifest

This chaining works both in the Companion REPL and in one-shot agent execution, since the `AgentContext.artifacts` list persists for the duration of the session.
--- a/docs/guide/planning-agent.md
+++ b/docs/guide/planning-agent.md
@@ -0,0 +1,425 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
--- a/docs/guide/planning-agent.md
+++ b/docs/guide/planning-agent.md
@@ -0,0 +1,425 @@
1 # Planning Agent
2
3 The Planning Agent is PlanOpticon's AI-powered system for synthesizing knowledge graph content into structured planning artifacts. It takes extracted entities and relationships from video analyses, document ingestions, and other sources, then uses LLM reasoning to produce project plans, PRDs, roadmaps, task breakdowns, GitHub issues, and more.
4
5 ---
6
7 ## How It Works
8
9 The Planning Agent operates through a three-stage pipeline:
10
11 ### 1. Context Assembly
12
13 The agent gathers context from all available sources:
14
15 - **Knowledge graph** -- entity counts, types, relationships, and planning entities from the loaded KG
16 - **Query engine** -- used to pull stats, entity lists, and relationship data for prompt construction
17 - **Provider manager** -- the configured LLM provider used for generation
18 - **Prior artifacts** -- any artifacts already generated in the session (skills can chain off each other)
19 - **Conversation history** -- accumulated chat messages when running in interactive mode
20
21 This context is bundled into an `AgentContext` dataclass that is shared across all skills.
22
23 ### 2. Skill Selection
24
25 When the agent receives a user request, it determines which skills to run:
26
27 **LLM-driven planning (with provider).** The agent constructs a prompt that includes the knowledge base summary, all available skill names and descriptions, and the user's request. The LLM returns a JSON array of skill names to execute in order, along with any parameters. For example, given "Create a project plan and break it into tasks," the LLM might select `["project_plan", "task_breakdown"]`.
28
29 **Keyword fallback (without provider).** If no LLM provider is available, the agent falls back to simple keyword matching. It splits each skill name on underscores and checks whether any of those words appear in the user's request. For example, the request "generate a roadmap" would match the `roadmap` skill because "roadmap" appears in both the request and the skill name.
30
31 ### 3. Execution
32
33 Selected skills are executed sequentially. Each skill:
34
35 1. Checks `can_execute()` to verify the required context is available (by default, both a knowledge graph and an LLM provider must be present)
36 2. Pulls relevant data from the knowledge graph via the query engine
37 3. Constructs a detailed prompt for the LLM with extracted context
38 4. Calls the LLM and parses the response
39 5. Returns an `Artifact` object containing the generated content
40
41 Each artifact is appended to `context.artifacts`, making it available to subsequent skills. This enables chaining -- for example, `task_breakdown` can feed into `github_issues`.
42
43 ---
44
45 ## AgentContext
46
47 The `AgentContext` dataclass is the shared state object that connects all components of the planning agent system.
48
49 ```python
50 @dataclass
51 class AgentContext:
52 knowledge_graph: Any = None # KnowledgeGraph instance
53 query_engine: Any = None # GraphQueryEngine instance
54 provider_manager: Any = None # ProviderManager instance
55 planning_entities: List[Any] = field(default_factory=list)
56 user_requirements: Dict[str, Any] = field(default_factory=dict)
57 conversation_history: List[Dict[str, str]] = field(default_factory=list)
58 artifacts: List[Artifact] = field(default_factory=list)
59 config: Dict[str, Any] = field(default_factory=dict)
60 ```
61
62 | Field | Purpose |
63 |---|---|
64 | `knowledge_graph` | The loaded `KnowledgeGraph` instance; provides access to entities, relationships, and graph operations |
65 | `query_engine` | A `GraphQueryEngine` for running structured queries (stats, entities, neighbors, relationships) |
66 | `provider_manager` | The `ProviderManager` that handles LLM API calls across providers |
67 | `planning_entities` | Entities classified into the planning taxonomy (goals, requirements, risks, etc.) |
68 | `user_requirements` | Structured requirements gathered from the `requirements_chat` skill |
69 | `conversation_history` | Accumulated chat messages for interactive sessions |
70 | `artifacts` | All artifacts generated during the session, enabling skill chaining |
71 | `config` | Arbitrary configuration overrides |
72
73 ---
74
75 ## Artifacts
76
77 Every skill returns an `Artifact` dataclass:
78
79 ```python
80 @dataclass
81 class Artifact:
82 name: str # Human-readable name (e.g., "Project Plan")
83 content: str # The generated content (markdown, JSON, etc.)
84 artifact_type: str # Type identifier: "project_plan", "prd", "roadmap", etc.
85 format: str = "markdown" # Content format: "markdown", "json", "mermaid"
86 metadata: Dict[str, Any] = field(default_factory=dict)
87 ```
88
89 Artifacts are the currency of the agent system. They can be:
90
91 - Displayed directly in the Companion REPL
92 - Exported to disk via the `artifact_export` skill
93 - Pushed to external tools via the `cli_adapter` skill
94 - Chained into other skills (e.g., task breakdown feeds into GitHub issues)
95
96 ---
97
98 ## Skills Reference
99
100 The agent ships with 11 built-in skills. Each skill is a class that extends `Skill` and self-registers at import time via `register_skill()`.
101
102 ### project_plan
103
104 **Description:** Generate a structured project plan from knowledge graph.
105
106 Pulls the full knowledge graph context (stats, entities, relationships, and planning entities grouped by type) and asks the LLM to produce a comprehensive project plan with:
107
108 1. Executive Summary
109 2. Goals and Objectives
110 3. Scope
111 4. Phases and Milestones
112 5. Resource Requirements
113 6. Risks and Mitigations
114 7. Success Criteria
115
116 **Artifact type:** `project_plan` | **Format:** markdown
117
118 ### prd
119
120 **Description:** Generate a product requirements document (PRD) / feature spec.
121
122 Filters planning entities to those of type `requirement`, `feature`, and `constraint`, then asks the LLM to generate a PRD with:
123
124 1. Problem Statement
125 2. User Stories
126 3. Functional Requirements
127 4. Non-Functional Requirements
128 5. Acceptance Criteria
129 6. Out of Scope
130
131 If no pre-filtered entities match, the LLM derives requirements from the full knowledge graph context.
132
133 **Artifact type:** `prd` | **Format:** markdown
134
135 ### roadmap
136
137 **Description:** Generate a product/project roadmap.
138
139 Focuses on planning entities of type `milestone`, `feature`, and `dependency`. Asks the LLM to produce a roadmap with:
140
141 1. Vision and Strategy
142 2. Phases (with timeline estimates)
143 3. Key Dependencies
144 4. A Mermaid Gantt chart summarizing the timeline
145
146 **Artifact type:** `roadmap` | **Format:** markdown
147
148 ### task_breakdown
149
150 **Description:** Break down goals into tasks with dependencies.
151
152 Focuses on planning entities of type `goal`, `feature`, and `milestone`. Returns a JSON array of task objects, each containing:
153
154 | Field | Type | Description |
155 |---|---|---|
156 | `id` | string | Task identifier (e.g., "T1", "T2") |
157 | `title` | string | Short task title |
158 | `description` | string | Detailed description |
159 | `depends_on` | list | IDs of prerequisite tasks |
160 | `priority` | string | `high`, `medium`, or `low` |
161 | `estimate` | string | Effort estimate (e.g., "2d", "1w") |
162 | `assignee_role` | string | Role needed to perform the task |
163
164 **Artifact type:** `task_list` | **Format:** json
165
166 ### github_issues
167
168 **Description:** Generate GitHub issues from task breakdown.
169
170 Converts tasks into GitHub-ready issue objects. If a `task_list` artifact exists in the context, it is used as input. Otherwise, minimal issues are generated from the planning entities directly.
171
172 Each issue includes a formatted body with description, priority, estimate, and dependencies, plus labels derived from the task priority.
173
174 The skill also provides a `push_to_github(issues_json, repo)` function that shells out to the `gh` CLI to create actual issues. This is used by the `cli_adapter` skill.
175
176 **Artifact type:** `issues` | **Format:** json
177
178 ### requirements_chat
179
180 **Description:** Interactive requirements gathering via guided questions.
181
182 Generates a structured requirements questionnaire based on the knowledge graph context. The questionnaire contains 8-12 targeted questions, each with:
183
184 | Field | Type | Description |
185 |---|---|---|
186 | `id` | string | Question identifier (e.g., "Q1") |
187 | `category` | string | `goals`, `constraints`, `priorities`, or `scope` |
188 | `question` | string | The question text |
189 | `context` | string | Why this question matters |
190
191 The skill also provides a `gather_requirements(context, answers)` method that takes the completed Q&A and synthesizes structured requirements (goals, constraints, priorities, scope).
192
193 **Artifact type:** `requirements` | **Format:** json
194
195 ### doc_generator
196
197 **Description:** Generate technical documentation, ADRs, or meeting notes.
198
199 Supports three document types, selected via the `doc_type` parameter:
200
201 | `doc_type` | Output Structure |
202 |---|---|
203 | `technical_doc` (default) | Overview, Architecture, Components and Interfaces, Data Flow, Deployment and Configuration, API Reference |
204 | `adr` | Title, Status (Proposed), Context, Decision, Consequences, Alternatives Considered |
205 | `meeting_notes` | Meeting Summary, Key Discussion Points, Decisions Made, Action Items (with owners), Open Questions, Next Steps |
206
207 **Artifact type:** `document` | **Format:** markdown
208
209 ### artifact_export
210
211 **Description:** Export artifacts in agent-ready formats.
212
213 Writes all artifacts accumulated in the context to a directory structure. Each artifact is written to a file based on its type:
214
215 | Artifact Type | Filename |
216 |---|---|
217 | `project_plan` | `project_plan.md` |
218 | `prd` | `prd.md` |
219 | `roadmap` | `roadmap.md` |
220 | `task_list` | `tasks.json` |
221 | `issues` | `issues.json` |
222 | `requirements` | `requirements.json` |
223 | `document` | `docs/<name>.md` |
224
225 A `manifest.json` is written alongside, listing all exported files with their names, types, and formats.
226
227 **Artifact type:** `export_manifest` | **Format:** json
228
229 Accepts an `output_dir` parameter (defaults to `plan/`).
230
231 ### cli_adapter
232
233 **Description:** Push artifacts to external tools via their CLIs.
234
235 Converts artifacts into CLI commands for external project management tools. Supported tools:
236
237 | Tool | CLI | Example Command |
238 |---|---|---|
239 | `github` | `gh` | `gh issue create --title "..." --body "..." --label "..."` |
240 | `jira` | `jira` | `jira issue create --summary "..." --description "..."` |
241 | `linear` | `linear` | `linear issue create --title "..." --description "..."` |
242
243 The skill checks whether the target CLI is available on the system and includes that status in the output. Commands are generated in dry-run mode by default.
244
245 **Artifact type:** `cli_commands` | **Format:** json
246
247 ### notes_export
248
249 **Description:** Export knowledge graph as structured notes (Obsidian, Notion).
250
251 Exports the entire knowledge graph as a collection of markdown files optimized for a specific note-taking platform. Accepts a `format` parameter:
252
253 **Obsidian format** creates:
254
255 - One `.md` file per entity with YAML frontmatter, tags, and `[[wiki-links]]`
256 - An `_Index.md` Map of Content grouping entities by type
257 - Tag pages for each entity type
258 - Artifact notes for any generated artifacts
259
260 **Notion format** creates:
261
262 - One `.md` file per entity with Notion-style callout blocks and relationship tables
263 - An `entities_database.csv` for bulk import into a Notion database
264 - An `Overview.md` page with stats and entity listings
265 - Artifact pages
266
267 **Artifact type:** `notes_export` | **Format:** markdown
268
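For a concrete picture, an entity note in the Obsidian export might look roughly like this (illustrative only; the frontmatter keys and tag names are assumptions):

```markdown
---
tags: [technology]
---

# Kubernetes

Container orchestration platform discussed in the source material.

## Related
- [[AuthenticationService]]
- [[PaymentSystem]]
```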
269 ### wiki_generator
270
271 **Description:** Generate a GitHub wiki from knowledge graph and artifacts.
272
273 Generates a complete GitHub wiki structure as a dictionary of page names to markdown content. Creates:
274
275 - **Home** page with entity type counts and links
276 - **_Sidebar** navigation with entity types and artifacts
277 - **Type index pages** with tables of entities per type
278 - **Individual entity pages** with descriptions, outgoing/incoming relationships, and source occurrences
279 - **Artifact pages** for any generated planning artifacts
280
281 The skill also provides standalone functions `write_wiki(pages, output_dir)` to write pages to disk and `push_wiki(wiki_dir, repo)` to push directly to a GitHub wiki repository.
282
283 **Artifact type:** `wiki` | **Format:** markdown
284
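`write_wiki` is described above only by its signature. As a rough mental model (not the actual implementation), it plausibly behaves like this self-contained sketch:

```python
from pathlib import Path

def write_wiki(pages: dict, output_dir: str) -> list:
    """Write each wiki page (name -> markdown content) to <output_dir>/<name>.md."""
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for name, markdown in pages.items():
        page_path = out / f"{name}.md"
        page_path.write_text(markdown, encoding="utf-8")
        written.append(str(page_path))
    return written

# A two-page wiki: Home plus the _Sidebar navigation page
pages = {"Home": "# Home\n", "_Sidebar": "- [[Home]]\n"}
written_files = write_wiki(pages, "wiki")
```

`push_wiki(wiki_dir, repo)` would then publish that directory, presumably by committing it to the GitHub wiki repository; consult the module source for the real behavior.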
285 ---
286
287 ## CLI Usage
288
289 ### One-shot execution
290
291 Run the agent with a request string. The agent selects and executes appropriate skills automatically.
292
293 ```bash
294 # Generate a project plan
295 planopticon agent "Create a project plan" --kb ./results
296
297 # Generate a PRD
298 planopticon agent "Write a PRD for the authentication system" --kb ./results
299
300 # Break down into tasks
301 planopticon agent "Break this into tasks and estimate effort" --kb ./results
302 ```
303
304 ### Export artifacts to disk
305
306 Use `--export` to write generated artifacts to a directory:
307
308 ```bash
309 planopticon agent "Create a full project plan with tasks" --kb ./results --export ./output
310 ```
311
312 ### Interactive mode
313
314 Use `-I` for a multi-turn session where you can issue multiple requests:
315
316 ```bash
317 planopticon agent -I --kb ./results
318 ```
319
320 In interactive mode, the agent supports:
321
322 - Free-text requests (executed via LLM skill selection)
323 - `/plan` -- shortcut to generate a project plan
324 - `/skills` -- list available skills
325 - `quit`, `exit`, `q` -- end the session
326
327 ### Provider and model options
328
329 ```bash
330 # Use a specific provider
331 planopticon agent "Create a roadmap" --kb ./results -p anthropic
332
333 # Use a specific model
334 planopticon agent "Generate a PRD" --kb ./results --chat-model gpt-4o
335 ```
336
337 ### Auto-discovery
338
339 If `--kb` is not specified, the agent uses `KBContext.auto_discover()` to find knowledge graphs in the workspace.
340
341 ---
342
343 ## Using Skills from the Companion REPL
344
345 The Companion REPL provides direct access to agent skills through slash commands. See the [Companion guide](companion.md) for full details.
346
347 | Companion Command | Skill Executed |
348 |---|---|
349 | `/plan` | `project_plan` |
350 | `/prd` | `prd` |
351 | `/tasks` | `task_breakdown` |
352 | `/run SKILL_NAME` | Any registered skill by name |
353
354 When executed from the Companion, skills use the same `AgentContext` that powers the chat mode. This means:
355
356 - The knowledge graph loaded at startup is automatically available
357 - The active LLM provider (set via `/provider` or `/model`) is used for generation
358 - Generated artifacts accumulate across the session, enabling chaining
359
360 ---
361
362 ## Example Workflows
363
364 ### From video to project plan
365
366 ```bash
367 # 1. Analyze a video
368 planopticon analyze -i sprint-review.mp4 -o results/
369
370 # 2. Launch the agent with the results
371 planopticon agent "Create a comprehensive project plan with tasks and a roadmap" \
372 --kb results/ --export plan/
373
374 # 3. Review the generated artifacts
375 ls plan/
376 # project_plan.md roadmap.md tasks.json manifest.json
377 ```
378
379 ### Interactive planning session
380
381 ```bash
382 $ planopticon companion --kb ./results
383
384 planopticon> /status
385 Workspace status:
386 KG: knowledge_graph.db (58 entities, 124 relationships)
387 ...
388
389 planopticon> What are the main goals discussed?
390 Based on the knowledge graph, the main goals are...
391
392 planopticon> /plan
393 --- Project Plan (project_plan) ---
394 ...
395
396 planopticon> /tasks
397 --- Task Breakdown (task_list) ---
398 ...
399
400 planopticon> /run github_issues
401 --- GitHub Issues (issues) ---
402 [
403 {"title": "Set up authentication service", ...},
404 ...
405 ]
406
407 planopticon> /run artifact_export
408 --- Export Manifest (export_manifest) ---
409 {
410 "artifact_count": 3,
411 "output_dir": "plan",
412 "files": [...]
413 }
414 ```
415
416 ### Skill chaining
417
418 Skills that produce artifacts make them available to subsequent skills automatically:
419
420 1. `/tasks` generates a `task_list` artifact
421 2. `/run github_issues` detects the existing `task_list` artifact and converts its tasks into GitHub issues
422 3. `/run cli_adapter` takes the most recent artifact and generates `gh issue create` commands
423 4. `/run artifact_export` writes all accumulated artifacts to disk with a manifest
424
425 This chaining works both in the Companion REPL and in one-shot agent execution, since the `AgentContext.artifacts` list persists for the duration of the session.
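As a toy model of that persistence (only `AgentContext.artifacts` comes from the docs above; every other name here is a hypothetical sketch, not the real API):

```python
from dataclasses import dataclass, field

@dataclass
class Artifact:
    type: str     # e.g. "task_list", "issues", "cli_commands"
    format: str   # e.g. "json", "markdown"
    content: str

@dataclass
class AgentContext:
    artifacts: list = field(default_factory=list)

    def latest(self, artifact_type=None):
        """Most recent artifact, optionally filtered by type -- the lookup a
        downstream skill would use to find an existing task_list."""
        pool = [a for a in self.artifacts
                if artifact_type is None or a.type == artifact_type]
        return pool[-1] if pool else None

ctx = AgentContext()
ctx.artifacts.append(Artifact("task_list", "json", "[]"))   # step 1: /tasks
ctx.artifacts.append(Artifact("issues", "json", "[]"))      # step 2: /run github_issues
```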
--- docs/guide/single-video.md
+++ docs/guide/single-video.md
@@ -8,22 +8,28 @@
88
99
## What happens
1010
1111
The pipeline runs these steps in order:
1212
13
-1. **Frame extraction** — Samples frames using change detection for transitions plus periodic capture (every 30s) for slow-evolving content like document scrolling
14
-2. **People frame filtering** — OpenCV face detection automatically removes webcam/video conference frames, keeping only shared content (slides, documents, screen shares)
15
-3. **Audio extraction** — Extracts audio track to WAV
16
-4. **Transcription** — Sends audio to speech-to-text (Whisper or Gemini)
17
-5. **Diagram detection** — Vision model classifies each frame as diagram/chart/whiteboard/screenshot/none
18
-6. **Diagram analysis** — High-confidence diagrams get full extraction (description, text, mermaid, chart data)
19
-7. **Screengrab fallback** — Medium-confidence frames are saved as captioned screenshots
20
-8. **Knowledge graph** — Extracts entities and relationships from transcript + diagrams
21
-9. **Key points** — LLM extracts main points and topics
22
-10. **Action items** — LLM finds tasks, commitments, and follow-ups
23
-11. **Reports** — Generates markdown, HTML, and PDF
24
-12. **Export** — Renders mermaid diagrams to SVG/PNG, reproduces charts
13
+1. **Frame extraction** -- Samples frames using change detection for transitions plus periodic capture (every 30s) for slow-evolving content like document scrolling
14
+2. **People frame filtering** -- OpenCV face detection automatically removes webcam/video conference frames, keeping only shared content (slides, documents, screen shares)
15
+3. **Audio extraction** -- Extracts audio track to WAV
16
+4. **Transcription** -- Sends audio to speech-to-text (Whisper or Gemini). If `--speakers` is provided, speaker diarization hints are passed to the provider.
17
+5. **Diagram detection** -- Vision model classifies each frame as diagram/chart/whiteboard/screenshot/none
18
+6. **Diagram analysis** -- High-confidence diagrams get full extraction (description, text, mermaid, chart data)
19
+7. **Screengrab fallback** -- Medium-confidence frames are saved as captioned screenshots
20
+8. **Knowledge graph** -- Extracts entities and relationships from transcript + diagrams, stored in both `knowledge_graph.db` (SQLite, primary) and `knowledge_graph.json` (export)
21
+9. **Key points** -- LLM extracts main points and topics
22
+10. **Action items** -- LLM finds tasks, commitments, and follow-ups
23
+11. **Reports** -- Generates markdown, HTML, and PDF
24
+12. **Export** -- Renders mermaid diagrams to SVG/PNG, reproduces charts
25
+
26
+After analysis, you can optionally run planning taxonomy classification on the knowledge graph to categorize entities for use with the planning agent:
27
+
28
+```bash
29
+planopticon kg classify results/knowledge_graph.db
30
+```
2531
2632
## Processing depth
2733
2834
### `basic`
2935
- Transcription only
@@ -30,18 +36,111 @@
3036
- Key points and action items
3137
- No diagram extraction
3238
3339
### `standard` (default)
3440
- Everything in basic
35
-- Diagram extraction (up to 10 frames)
41
+- Diagram extraction (up to 10 frames, evenly sampled)
3642
- Knowledge graph
3743
- Full report generation
3844
3945
### `comprehensive`
4046
- Everything in standard
4147
- More frames analyzed (up to 20)
4248
- Deeper analysis
49
+
50
+## Command-line options
51
+
52
+### Provider and model selection
53
+
54
+```bash
55
+# Use a specific provider
56
+planopticon analyze -i video.mp4 -o ./output --provider anthropic
57
+
58
+# Override vision and chat models separately
59
+planopticon analyze -i video.mp4 -o ./output --vision-model gpt-4o --chat-model claude-sonnet-4-20250514
60
+```
61
+
62
+### Speaker diarization hints
63
+
64
+Use `--speakers` to provide speaker names as comma-separated hints. These are passed to the transcription provider to improve speaker identification in the transcript segments.
65
+
66
+```bash
67
+planopticon analyze -i video.mp4 -o ./output --speakers "Alice,Bob,Carol"
68
+```
69
+
70
+### Custom prompt templates
71
+
72
+Use `--templates-dir` to point to a directory of custom `.txt` prompt template files. These override the built-in prompts used for diagram analysis, key point extraction, action item extraction, and other LLM-driven steps.
73
+
74
+```bash
75
+planopticon analyze -i video.mp4 -o ./output --templates-dir ./my-prompts
76
+```
77
+
78
+Template files should be named to match the built-in template names (e.g., `key_points.txt`, `action_items.txt`). See the `video_processor/utils/prompt_templates.py` module for the full list of template names.
79
+
80
+### Output format
81
+
82
+Use `--output-format json` to emit the complete `VideoManifest` as structured JSON to stdout, in addition to writing all output files to disk. This is useful for scripting, CI/CD integration, or piping results into other tools.
83
+
84
+```bash
85
+# Standard output (files + console summary)
86
+planopticon analyze -i video.mp4 -o ./output
87
+
88
+# JSON manifest to stdout
89
+planopticon analyze -i video.mp4 -o ./output --output-format json
90
+```
91
+
92
+### Frame extraction tuning
93
+
94
+```bash
95
+# Adjust sampling rate (frames per second to consider)
96
+planopticon analyze -i video.mp4 -o ./output --sampling-rate 1.0
97
+
98
+# Adjust change detection threshold (lower = more sensitive)
99
+planopticon analyze -i video.mp4 -o ./output --change-threshold 0.10
100
+
101
+# Adjust periodic capture interval
102
+planopticon analyze -i video.mp4 -o ./output --periodic-capture 60
103
+
104
+# Enable GPU acceleration for frame extraction
105
+planopticon analyze -i video.mp4 -o ./output --use-gpu
106
+```
107
+
108
+## Output structure
109
+
110
+Every run produces a standardized directory structure:
111
+
112
+```
113
+output/
114
+├── manifest.json # Run manifest (source of truth)
115
+├── transcript/
116
+│ ├── transcript.json # Full transcript with segments + speakers
117
+│ ├── transcript.txt # Plain text
118
+│ └── transcript.srt # Subtitles
119
+├── frames/
120
+│ ├── frame_0000.jpg
121
+│ └── ...
122
+├── diagrams/
123
+│ ├── diagram_0.jpg # Original frame
124
+│ ├── diagram_0.mermaid # Mermaid source
125
+│ ├── diagram_0.svg # Vector rendering
126
+│ ├── diagram_0.png # Raster rendering
127
+│ ├── diagram_0.json # Analysis data
128
+│ └── ...
129
+├── captures/
130
+│ ├── capture_0.jpg # Medium-confidence screenshots
131
+│ ├── capture_0.json
132
+│ └── ...
133
+└── results/
134
+ ├── analysis.md # Markdown report
135
+ ├── analysis.html # HTML report
136
+ ├── analysis.pdf # PDF (if planopticon[pdf] installed)
137
+ ├── knowledge_graph.db # Knowledge graph (SQLite, primary)
138
+ ├── knowledge_graph.json # Knowledge graph (JSON export)
139
+ ├── key_points.json # Extracted key points
140
+ └── action_items.json # Action items
141
+```
43142
44143
## Output manifest
45144
46145
Every run produces a `manifest.json` that is the single source of truth:
47146
@@ -56,13 +155,90 @@
56155
"stats": {
57156
"duration_seconds": 45.2,
58157
"frames_extracted": 42,
59158
"people_frames_filtered": 11,
60159
"diagrams_detected": 3,
61
- "screen_captures": 5
160
+ "screen_captures": 5,
161
+ "models_used": {
162
+ "vision": "gpt-4o",
163
+ "chat": "gpt-4o"
164
+ }
62165
},
166
+ "transcript_json": "transcript/transcript.json",
167
+ "transcript_txt": "transcript/transcript.txt",
168
+ "transcript_srt": "transcript/transcript.srt",
169
+ "analysis_md": "results/analysis.md",
170
+ "knowledge_graph_json": "results/knowledge_graph.json",
171
+ "knowledge_graph_db": "results/knowledge_graph.db",
172
+ "key_points_json": "results/key_points.json",
173
+ "action_items_json": "results/action_items.json",
63174
"key_points": [...],
64175
"action_items": [...],
65176
"diagrams": [...],
66177
"screen_captures": [...]
67178
}
68179
```
180
+
181
+## Checkpoint and resume
182
+
183
+The pipeline supports checkpoint/resume. If a step's output files already exist on disk, that step is skipped on re-run. This means you can safely re-run an interrupted analysis and it will pick up where it left off:
184
+
185
+```bash
186
+# First run (interrupted at step 6)
187
+planopticon analyze -i video.mp4 -o ./output
188
+
189
+# Second run (resumes from step 6)
190
+planopticon analyze -i video.mp4 -o ./output
191
+```
192
+
193
+## Using results after analysis
194
+
195
+### Query the knowledge graph
196
+
197
+After analysis completes, you can query the knowledge graph directly:
198
+
199
+```bash
200
+# Show graph stats
201
+planopticon query --db results/knowledge_graph.db
202
+
203
+# List entities by type
204
+planopticon query --db results/knowledge_graph.db "entities --type technology"
205
+
206
+# Find neighbors of an entity
207
+planopticon query --db results/knowledge_graph.db "neighbors Kubernetes"
208
+
209
+# Ask natural language questions (requires API key)
210
+planopticon query --db results/knowledge_graph.db "What technologies were discussed?"
211
+```
212
+
213
+### Classify entities for planning
214
+
215
+Run taxonomy classification to categorize entities into planning types (goal, milestone, risk, dependency, etc.):
216
+
217
+```bash
218
+planopticon kg classify results/knowledge_graph.db
219
+planopticon kg classify results/knowledge_graph.db --format json
220
+```
221
+
222
+### Export to other formats
223
+
224
+```bash
225
+# Generate markdown documents
226
+planopticon export markdown results/knowledge_graph.db -o ./docs
227
+
228
+# Export as Obsidian vault
229
+planopticon export obsidian results/knowledge_graph.db -o ./vault
230
+
231
+# Export as PlanOpticonExchange
232
+planopticon export exchange results/knowledge_graph.db -o exchange.json
233
+
234
+# Generate GitHub wiki
235
+planopticon wiki generate results/knowledge_graph.db -o ./wiki
236
+```
237
+
238
+### Use with the planning agent
239
+
240
+The planning agent can consume the knowledge graph to generate project plans, PRDs, roadmaps, and other planning artifacts:
241
+
242
+```bash
243
+planopticon agent --db results/knowledge_graph.db
244
+```
69245
70246
ADDED docs/use-cases.md
--- a/docs/use-cases.md
+++ b/docs/use-cases.md
@@ -0,0 +1,342 @@
1
+# Use Cases
2
+
3
+PlanOpticon is built for anyone who needs to turn unstructured content -- recordings, documents, notes, web pages -- into structured, searchable, actionable knowledge. Here are the most common ways people use it.
4
+
5
+---
6
+
7
+## Meeting notes and follow-ups
8
+
9
+**Problem:** You have hours of meeting recordings but no time to rewatch them. Action items get lost, decisions are forgotten, and new team members have no way to catch up.
10
+
11
+**Solution:** Point PlanOpticon at your recordings and get structured transcripts, action items with assignees and deadlines, key decisions, and a knowledge graph linking people to topics.
12
+
13
+```bash
14
+# Analyze a single meeting recording
15
+planopticon analyze -i standup-2026-03-07.mp4 -o ./meetings/march-7
16
+
17
+# Process a month of recordings at once
18
+planopticon batch -i ./recordings/march -o ./meetings --title "March 2026 Meetings"
19
+
20
+# Query what was decided
21
+planopticon query "What decisions were made about the API redesign?"
22
+
23
+# Find all action items assigned to Alice
24
+planopticon query "relationships --source Alice"
25
+```
26
+
27
+**What you get:**
28
+
29
+- Full transcript with timestamps and speaker segments
30
+- Action items extracted with assignees, deadlines, and context
31
+- Key points and decisions highlighted
32
+- Knowledge graph connecting people, topics, technologies, and decisions
33
+- Markdown report you can share with the team
34
+
35
+**Next steps:** Export to your team's wiki or note system:
36
+
37
+```bash
38
+# Push to GitHub wiki
39
+planopticon wiki generate --input ./meetings --output ./wiki
40
+planopticon wiki push --input ./wiki --target "github://your-org/your-repo"
41
+
42
+# Export to Obsidian for personal knowledge management
43
+planopticon export obsidian --input ./meetings --output ~/Obsidian/Meetings
44
+```
45
+
46
+---
47
+
48
+## Research processing
49
+
50
+**Problem:** You're researching a topic across YouTube talks, arXiv papers, blog posts, and podcasts. Information is scattered and hard to cross-reference.
51
+
52
+**Solution:** Ingest everything into a single knowledge graph, then query across all sources.
53
+
54
+```bash
55
+# Ingest a YouTube conference talk
56
+planopticon ingest "https://youtube.com/watch?v=..." --output ./research
57
+
58
+# Ingest arXiv papers
59
+planopticon ingest "https://arxiv.org/abs/2401.12345" --output ./research
60
+
61
+# Ingest blog posts and documentation
62
+planopticon ingest "https://example.com/blog/post" --output ./research
63
+
64
+# Ingest local PDF papers
65
+planopticon ingest ./papers/ --output ./research --recursive
66
+
67
+# Now query across everything
68
+planopticon query "What approaches to vector search were discussed?"
69
+planopticon query "entities --type technology"
70
+planopticon query "neighbors TransformerArchitecture"
71
+```
72
+
73
+**What you get:**
74
+
75
+- A unified knowledge graph merging entities across all sources
76
+- Cross-references showing where the same concept appears in different sources
77
+- Searchable entity index by type (people, technologies, concepts, papers)
78
+- Relationship maps showing how ideas connect
79
+
**Go deeper with the companion:**

```bash
planopticon companion --kb ./research
```

```
planopticon> What are the main approaches to retrieval-augmented generation?
planopticon> /entities --type technology
planopticon> /neighbors RAG
planopticon> /export obsidian
```

---

## Knowledge gathering across platforms

**Problem:** Your team's knowledge is spread across Google Docs, Notion, Obsidian, GitHub wikis, and Apple Notes. There's no single place to search everything.

**Solution:** Pull from all sources into one knowledge graph.

```bash
# Authenticate with your platforms
planopticon auth google
planopticon auth notion
planopticon auth github

# Ingest from Google Workspace
planopticon gws ingest --folder-id abc123 --output ./kb --recursive

# Ingest from Notion
planopticon ingest --source notion --output ./kb

# Ingest from an Obsidian vault
planopticon ingest ~/Obsidian/WorkVault --output ./kb --recursive

# Ingest from GitHub wikis and READMEs
planopticon ingest "github://your-org/project-a" --output ./kb
planopticon ingest "github://your-org/project-b" --output ./kb

# Query the unified knowledge base
planopticon query stats
planopticon query "entities --type person"
planopticon query "What do we know about the authentication system?"
```

**What you get:**

- Merged knowledge graph with provenance tracking (you can see which source each entity came from)
- Deduplicated entities across platforms (same concept mentioned in Notion and Google Docs gets merged)
- Full-text search across all ingested content
- Relationship maps showing how concepts connect across your organization's documents
---

## Team onboarding

**Problem:** New team members spend weeks reading docs, watching recorded meetings, and asking questions to get up to speed.

**Solution:** Build a knowledge base from existing content and let new people explore it conversationally.

```bash
# Build the knowledge base from everything
planopticon batch -i ./recordings/onboarding -o ./kb --title "Team Onboarding"
planopticon ingest ./docs/ --output ./kb --recursive
planopticon ingest ./architecture-decisions/ --output ./kb --recursive

# New team member launches the companion
planopticon companion --kb ./kb
```

```
planopticon> What is the overall architecture of the system?
planopticon> Who are the key people on the team?
planopticon> /entities --type technology
planopticon> What was the rationale for choosing PostgreSQL over MongoDB?
planopticon> /neighbors AuthenticationService
planopticon> What are the main open issues or risks?
```

**What you get:**

- Interactive Q&A over the entire team knowledge base
- Entity exploration — browse people, technologies, services, decisions
- Relationship navigation — "show me everything connected to the payment system"
- No need to rewatch hours of recordings

---

## Data collection and synthesis

**Problem:** You need to collect and synthesize information from many sources — customer interviews, competitor analysis, market research — into a coherent picture.

**Solution:** Batch process recordings and documents, then use the planning agent to generate synthesis artifacts.

```bash
# Process customer interview recordings
planopticon batch -i ./interviews -o ./research --title "Customer Interviews Q1"

# Ingest competitor documentation
planopticon ingest ./competitor-analysis/ --output ./research --recursive

# Ingest market research PDFs
planopticon ingest ./market-reports/ --output ./research --recursive

# Use the planning agent to synthesize
planopticon agent --kb ./research --interactive
```

```
planopticon> Generate a summary of common customer pain points
planopticon> /plan
planopticon> /tasks
planopticon> /export markdown
```

**What you get:**

- Merged knowledge graph across all interviews and documents
- Cross-referenced entities showing which customers mentioned which features
- Agent-generated project plans, PRDs, and task breakdowns based on the data
- Exportable artifacts for sharing with stakeholders

---

## Content creation from video

**Problem:** You have video content (lectures, tutorials, webinars) that you want to turn into written documentation, blog posts, or course materials.

**Solution:** Extract structured knowledge and export it in your preferred format.

```bash
# Analyze the video
planopticon analyze -i webinar-recording.mp4 -o ./content

# Generate multiple document types (no LLM needed for these)
planopticon export markdown --input ./content --output ./docs

# Export to Obsidian for further editing
planopticon export obsidian --input ./content --output ~/Obsidian/Content
```

**What you get for each video:**

- Full transcript (JSON, plain text, SRT subtitles)
- Extracted diagrams reproduced as Mermaid/SVG/PNG
- Charts reproduced with data tables
- Knowledge graph of concepts and relationships
- 7 types of markdown documents: summary, meeting notes, glossary, relationship map, status report, entity index, CSV data
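The SRT output listed above is simply the transcript segments reformatted with SubRip timestamps. A minimal sketch of that conversion, assuming a hypothetical `(start_sec, end_sec, text)` segment shape rather than PlanOpticon's actual transcript model:

```python
def srt_timestamp(seconds: float) -> str:
    """Format seconds as the SubRip HH:MM:SS,mmm timestamp."""
    ms = round(seconds * 1000)
    h, rem = divmod(ms, 3_600_000)
    m, rem = divmod(rem, 60_000)
    s, ms = divmod(rem, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"

def to_srt(segments) -> str:
    """Render (start_sec, end_sec, text) tuples as an SRT document."""
    blocks = []
    for i, (start, end, text) in enumerate(segments, 1):
        blocks.append(f"{i}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n")
    return "\n".join(blocks)
```

For example, `srt_timestamp(3661.5)` yields `01:01:01,500`, and each block carries its sequence number, time range, and caption text.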
---

## Decision tracking over time

**Problem:** Important decisions are made in meetings but never formally recorded. Months later, nobody remembers why a choice was made.

**Solution:** Process meeting recordings continuously and query the growing knowledge graph for decisions and their context.

```bash
# Process each week's recordings
planopticon batch -i ./recordings/week-12 -o ./decisions --title "Week 12"

# The knowledge graph grows over time — entities merge across weeks
planopticon query "entities --type goal"
planopticon query "entities --type risk"
planopticon query "entities --type milestone"

# Find decisions about a specific topic
planopticon query "What was decided about the database migration?"

# Track risks over time
planopticon query "relationships --type risk"
```

The planning taxonomy automatically classifies entities as goals, requirements, risks, tasks, and milestones — giving you a structured view of how the project evolves.
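One simple way to picture that classification step is keyword matching over entity descriptions. The sketch below is purely illustrative (the rules and fallback type are invented for the example; the real taxonomy is described in the Knowledge Graphs guide):

```python
# Hypothetical keyword rules mapping text cues to taxonomy types
TAXONOMY_RULES = {
    "goal": ("objective", "aim", "goal"),
    "risk": ("risk", "concern", "blocker", "threat"),
    "milestone": ("milestone", "deadline", "release", "launch"),
    "requirement": ("must", "required", "shall"),
    "task": ("todo", "task", "implement", "fix"),
}

def classify(description: str) -> str:
    """Assign a planning-taxonomy type based on keywords in the text."""
    text = description.lower()
    for entity_type, keywords in TAXONOMY_RULES.items():
        if any(kw in text for kw in keywords):
            return entity_type
    return "concept"  # fallback when no rule matches
```

So "Ship the v2 release by the end of Q2" would classify as a milestone, while a description with no planning cues falls back to a plain concept.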
---

## Zoom / Teams / Meet integration

**Problem:** Meeting recordings are sitting in Zoom/Teams/Meet cloud storage. You want to process them without manually downloading each one.

**Solution:** Authenticate once, list recordings, and process them directly.

```bash
# Authenticate with your meeting platform
planopticon auth zoom
# or: planopticon auth microsoft
# or: planopticon auth google

# List recent recordings
planopticon recordings zoom-list
planopticon recordings teams-list --from 2026-01-01
planopticon recordings meet-list --limit 20

# Process recordings (download + analyze)
planopticon analyze -i "zoom://recording-id" -o ./output
```

**Setup requirements:**

| Platform | What you need |
|----------|--------------|
| Zoom | `ZOOM_CLIENT_ID` + `ZOOM_CLIENT_SECRET` (create an OAuth app at marketplace.zoom.us) |
| Teams | `MICROSOFT_CLIENT_ID` + `MICROSOFT_CLIENT_SECRET` (register an Azure AD app) |
| Meet | `GOOGLE_CLIENT_ID` + `GOOGLE_CLIENT_SECRET` (create OAuth credentials in Google Cloud Console) |

See the [Authentication guide](guide/authentication.md) for detailed setup instructions.

---

## Fully offline processing

**Problem:** You're working with sensitive content that can't leave your network, or you simply don't want to pay for API calls.

**Solution:** Use Ollama for local AI processing with no external API calls.

```bash
# Install Ollama and pull models
ollama pull llama3.2   # Chat/analysis
ollama pull llava      # Vision (diagram detection)

# Install local Whisper for transcription (quotes keep zsh from expanding the brackets)
pip install "planopticon[gpu]"

# Process entirely offline
planopticon analyze -i sensitive-meeting.mp4 -o ./output --provider ollama
```

PlanOpticon auto-detects Ollama when it's running. If no cloud API keys are configured, it uses Ollama automatically. Pair with local Whisper transcription for a fully air-gapped pipeline.
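That fallback amounts to: prefer an explicitly configured cloud key, otherwise probe the local Ollama endpoint. A hedged sketch of the idea (the environment-variable names and return values are assumptions for illustration; only the port 11434 default comes from Ollama's own documentation, not PlanOpticon internals):

```python
import urllib.request

def ollama_running(base_url: str = "http://localhost:11434") -> bool:
    """Probe the default Ollama endpoint; True if it answers at all."""
    try:
        with urllib.request.urlopen(base_url, timeout=1):
            return True
    except OSError:  # connection refused, timeout, HTTP error, etc.
        return False

def pick_provider(env: dict) -> str:
    """Choose a provider: explicit cloud keys win, then local Ollama."""
    if env.get("ANTHROPIC_API_KEY") or env.get("OPENAI_API_KEY"):
        return "cloud"
    if ollama_running():
        return "ollama"
    raise RuntimeError("No provider available: set an API key or start Ollama")
```

With a cloud key present the probe is skipped entirely, so air-gapped machines only ever touch localhost.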
---

## Competitive research

**Problem:** You want to systematically analyze competitor content — conference talks, documentation, blog posts — and identify patterns.

**Solution:** Ingest competitor content from multiple sources and query for patterns.

```bash
# Ingest competitor conference talks from YouTube
planopticon ingest "https://youtube.com/watch?v=competitor-talk-1" --output ./competitive
planopticon ingest "https://youtube.com/watch?v=competitor-talk-2" --output ./competitive

# Ingest their documentation
planopticon ingest "https://competitor.com/docs" --output ./competitive

# Ingest their GitHub repos
planopticon auth github
planopticon ingest "github://competitor/main-product" --output ./competitive

# Analyze patterns
planopticon query "entities --type technology"
planopticon query "What technologies are competitors investing in?"
planopticon companion --kb ./competitive
```

```
planopticon> What are the common architectural patterns across competitors?
planopticon> /entities --type technology
planopticon> Which technologies appear most frequently?
planopticon> /export markdown
```
--- mkdocs.yml
+++ mkdocs.yml
@@ -79,21 +79,32 @@
     - Quick Start: getting-started/quickstart.md
     - Configuration: getting-started/configuration.md
   - User Guide:
     - Single Video Analysis: guide/single-video.md
     - Batch Processing: guide/batch.md
+    - Document Ingestion: guide/document-ingestion.md
     - Cloud Sources: guide/cloud-sources.md
+    - Knowledge Graphs: guide/knowledge-graphs.md
+    - Interactive Companion: guide/companion.md
+    - Planning Agent: guide/planning-agent.md
+    - Authentication: guide/authentication.md
+    - Export & Documents: guide/export.md
     - Output Formats: guide/output-formats.md
+  - Use Cases: use-cases.md
   - CLI Reference: cli-reference.md
   - Architecture:
     - Overview: architecture/overview.md
     - Provider System: architecture/providers.md
     - Processing Pipeline: architecture/pipeline.md
   - API Reference:
     - Models: api/models.md
     - Providers: api/providers.md
     - Analyzers: api/analyzers.md
+    - Agent & Skills: api/agent.md
+    - Sources: api/sources.md
+    - Authentication: api/auth.md
+  - FAQ & Troubleshooting: faq.md
   - Contributing: contributing.md

 extra:
   social:
     - icon: fontawesome/brands/github