**AI-powered video analysis, knowledge extraction, and planning.**

PlanOpticon processes video recordings, documents, and 20+ online sources into structured knowledge graphs, then helps you plan with an AI agent and interactive companion. It auto-discovers models across 15+ AI providers, runs fully offline with Ollama, and produces rich multi-format output.
## Features

- **20+ source connectors** — YouTube, web pages, GitHub, Reddit, HackerNews, RSS, podcasts, Twitter/X, arXiv, S3, Google Workspace, Microsoft 365, Obsidian, Notion, Apple Notes, Zoom, Teams, Google Meet, and more.
- **Planning agent** — 11 skills including project plans, PRDs, roadmaps, task breakdowns, and GitHub integration.
- **Interactive companion** — Chat REPL with 15 slash commands, auto-discovery of workspace knowledge, and runtime provider/model switching.
- **Knowledge graphs** — SQLite-backed (zero external dependencies), entity extraction with a planning taxonomy (goals, requirements, risks, tasks, milestones), merge and dedup across sources.
- **Smart video analysis** — Change-detection frame extraction, face filtering, diagram classification, action item detection, checkpoint/resume.
- **Document ingestion** — PDF, Markdown, and plaintext pipelines feed the same knowledge graph.
- **Export everywhere** — Markdown docs (7 types, no LLM required), Obsidian vaults, Notion markdown, GitHub wiki with push, PlanOpticonExchange JSON interchange, HTML/PDF reports, Mermaid diagrams.
- **OAuth-first auth** — Unified OAuth manager for Google, Dropbox, Zoom, Notion, GitHub, and Microsoft with a saved-token / PKCE / API-key fallback chain.
- **Batch processing** — Process entire folders with merged knowledge graphs and cross-referencing.
- **PlanOpticonExchange** — Canonical JSON interchange format with merge/dedup.
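The merge-and-dedup behavior described above can be sketched in a few lines. This is an illustrative toy, not PlanOpticon's actual schema: the `Entity` dataclass and the normalized `(name, kind)` key are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    name: str
    kind: str  # e.g. "goal", "requirement", "risk", "task", "milestone"


def merge_graphs(*graphs: list) -> list:
    """Merge entity lists from multiple sources, deduplicating on a
    normalized (name, kind) key so 'Launch Plan' and 'launch plan' collapse."""
    seen = {}
    for graph in graphs:
        for entity in graph:
            key = (entity.name.strip().lower(), entity.kind)
            seen.setdefault(key, entity)  # keep the first occurrence
    return list(seen.values())


video = [Entity("Launch Plan", "goal"), Entity("Ship beta", "milestone")]
doc = [Entity("launch plan", "goal"), Entity("Hire SRE", "task")]
merged = merge_graphs(video, doc)
print(len(merged))  # 3 unique entities across both sources
```

A real implementation would persist this into the SQLite graph and also dedup relationships, but the key idea is the same: a normalization function defines entity identity across sources.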
## Local Run

PlanOpticon runs entirely offline with Ollama — no API keys, no cloud, no cost.

> **13.2 hours of video content analyzed, knowledge-graphed, and summarized in ~25 hours of processing time, entirely on local hardware, for free.**

18 meeting recordings processed on a single machine using `llava` (vision), `qwen3:30b` (chat), and `whisper-large` (transcription via Apple Silicon GPU):
| Metric | Value |
|--------|-------|
| API calls (local) | 1,841 |
| Tokens processed | 4.87M |
| Total cost | **$0.00** |
```bash
# Fully local analysis — no API keys needed, just Ollama running
planopticon analyze -i meeting.mp4 -o ./output \
  --provider ollama \
  --vision-model llava:latest \
  --chat-model qwen3:30b
```
## Installation

### From PyPI

```bash
pip install planopticon
```
```
├── captures/                # Screengrab fallbacks
└── results/
    ├── analysis.md          # Markdown report
    ├── analysis.html        # HTML report
    ├── analysis.pdf         # PDF report
    ├── knowledge_graph.db   # SQLite knowledge graph
    ├── knowledge_graph.json # JSON export
    ├── key_points.json      # Extracted key points
    └── action_items.json    # Tasks and follow-ups
```
## Processing Depth

Full documentation at [planopticon.dev](https://planopticon.dev)
- **Pydantic everywhere** — All structured data uses pydantic models for validation and serialization
- **Manifest-driven** — Every run produces `manifest.json` as the single source of truth
- **Provider abstraction** — Single `ProviderManager` wraps OpenAI, Anthropic, Gemini, Ollama, and additional providers behind a common interface
- **No hardcoded models** — Model lists come from API discovery
- **Screengrab fallback** — When extraction fails, save the frame as a captioned screenshot
- **OAuth-first auth** — All cloud service integrations use OAuth via `planopticon auth`, with credentials stored locally. Service account keys are supported as a fallback for server-side automation
- **Skill registry** — The planning agent discovers and invokes skills dynamically. Skills are self-describing and can be composed by the agent to accomplish complex tasks
- **Exchange format** — A portable JSON format (`exchange.py`) for importing and exporting knowledge graphs between PlanOpticon instances and external tools
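The exchange format principle above can be illustrated with a minimal round-trip. The field names and the `version` envelope here are illustrative assumptions, not the actual `exchange.py` schema:

```python
import json


def export_graph(entities, relations):
    """Serialize a knowledge graph to a portable JSON string.
    sort_keys makes the output stable for diffing between instances."""
    payload = {"version": 1, "entities": entities, "relations": relations}
    return json.dumps(payload, sort_keys=True)


def import_graph(payload):
    """Parse an exchange payload back into (entities, relations)."""
    data = json.loads(payload)
    if data.get("version") != 1:
        raise ValueError("unsupported exchange version")
    return data["entities"], data["relations"]


entities = [{"id": "e1", "name": "Roadmap", "type": "goal"}]
relations = [{"src": "e1", "dst": "e1", "type": "self"}]
payload = export_graph(entities, relations)
assert import_graph(payload) == (entities, relations)  # lossless round-trip
```

Versioning the envelope up front is what makes the format safe to pass between PlanOpticon instances and external tools: a consumer can reject payloads it does not understand instead of misreading them.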
# Provider System

## Overview

PlanOpticon supports multiple AI providers through a unified abstraction layer. Default models favor cost-effective options (Haiku, GPT-4o-mini, Gemini Flash) for routine tasks, with more capable models available when needed.

## Supported providers

| Provider | Chat | Vision | Transcription | Env var |
|----------|------|--------|---------------|---------|
| OpenAI | GPT-4o, GPT-4 | GPT-4o | Whisper-1 | `OPENAI_API_KEY` |
| Anthropic | Claude Sonnet/Opus | Claude Sonnet/Opus | — | `ANTHROPIC_API_KEY` |
| Ollama (local) | Any installed model | llava, moondream, etc. | — (use local Whisper) | `OLLAMA_HOST` |

## Default models

PlanOpticon defaults to cheap, fast models for cost efficiency:

| Task | Default model |
|------|---------------|
| Vision (diagrams) | Gemini Flash |
| Chat (analysis) | Claude Haiku |
| Transcription | Local Whisper (fallback: Whisper-1) |

Use `--vision-model` and `--chat-model` to override with more capable models when needed (e.g., `--chat-model claude-sonnet-4-20250514` for complex analysis).

## Ollama (offline mode)

[Ollama](https://ollama.com) enables fully offline operation with no API keys required. PlanOpticon connects via Ollama's OpenAI-compatible API.
```bash
# Automatically discovers models from all configured providers + Ollama
```
## Routing preferences

Each task type has a default preference order (cheapest first). Ollama acts as the last-resort fallback — if no cloud API keys are set but Ollama is running, it is used automatically.
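The routing logic can be sketched as a cheapest-first scan with Ollama as the last resort. The preference tuples and model names below are illustrative, not PlanOpticon's actual routing table:

```python
def pick_model(preferences, available, ollama_running=True):
    """Return the first (provider, model) pair whose provider has
    credentials configured; otherwise fall back to a local Ollama model."""
    for provider, model in preferences:
        if provider in available:
            return provider, model
    if ollama_running:
        return "ollama", "qwen3:30b"  # any installed local model would do
    raise RuntimeError("no provider available for this task")


# Hypothetical cheapest-first chat preferences
chat_prefs = [
    ("gemini", "gemini-flash"),
    ("anthropic", "claude-haiku"),
    ("openai", "gpt-4o-mini"),
]

print(pick_model(chat_prefs, available={"anthropic"}))  # ('anthropic', 'claude-haiku')
print(pick_model(chat_prefs, available=set()))          # ('ollama', 'qwen3:30b')
```

Keeping the preference list as plain data (rather than branching code) is what lets the same resolver serve every task type: vision, chat, and transcription each just supply their own ordered list.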
## Manual override
```python
pm = ProviderManager(
    vision_model="gpt-4o",
    chat_model="claude-sonnet-4-5-20250929",
    provider="openai",  # Force a specific provider
)

# Or use Ollama for fully offline processing
pm = ProviderManager(provider="ollama")
```

## BaseProvider interface

All providers implement:
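The actual interface definition is not shown here, so the following is a hypothetical sketch of what a common provider base class could look like; the method names and signatures are assumptions, not PlanOpticon's real `BaseProvider`:

```python
from abc import ABC, abstractmethod


class BaseProvider(ABC):
    """Hypothetical common interface -- every concrete provider
    (OpenAI, Anthropic, Gemini, Ollama, ...) would implement these."""

    @abstractmethod
    def chat(self, prompt: str) -> str:
        """Run a chat completion and return the text response."""

    @abstractmethod
    def describe_image(self, image_bytes: bytes, prompt: str) -> str:
        """Run a vision model over a frame and return its description."""

    @abstractmethod
    def list_models(self) -> list:
        """Discover available models (no hardcoded model lists)."""


class EchoProvider(BaseProvider):
    """Trivial stand-in, useful for tests and offline dry runs."""

    def chat(self, prompt):
        return f"echo: {prompt}"

    def describe_image(self, image_bytes, prompt):
        return "an image"

    def list_models(self):
        return ["echo-1"]


assert EchoProvider().chat("hi") == "echo: hi"
```

Because callers only see the abstract interface, `ProviderManager` can swap providers at runtime without touching analysis code.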
# Configuration

## Environment variables

| Variable | Description |
|----------|-------------|
| `OPENAI_API_KEY` | OpenAI API key |
| `ANTHROPIC_API_KEY` | Anthropic API key |
| `GEMINI_API_KEY` | Google Gemini API key |
| `OLLAMA_HOST` | Ollama server URL (default: `http://localhost:11434`) |
| `GOOGLE_APPLICATION_CREDENTIALS` | Path to Google service account JSON (for Drive) |
| `CACHE_DIR` | Directory for API response caching |

## Provider routing

PlanOpticon auto-discovers available models and routes each task to the best option. Default models prioritize cost efficiency; for complex or high-stakes analysis, override with more capable models using `--chat-model` or `--vision-model`.

If no cloud API keys are configured, PlanOpticon automatically falls back to Ollama when a local server is running. This enables fully offline operation when paired with local Whisper for transcription.

Override with `--provider`, `--vision-model`, or `--chat-model` flags.
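The table above maps directly onto environment lookups. A minimal sketch of how a tool might read this configuration, including the documented `OLLAMA_HOST` default and the offline-fallback check:

```python
import os

# Ollama's documented default endpoint is used when OLLAMA_HOST is unset.
ollama_host = os.environ.get("OLLAMA_HOST", "http://localhost:11434")

# Caching is simply skipped when CACHE_DIR is not provided.
cache_dir = os.environ.get("CACHE_DIR")

# Offline mode: no cloud key configured means routing falls back to Ollama.
cloud_keys = ("OPENAI_API_KEY", "ANTHROPIC_API_KEY", "GEMINI_API_KEY")
offline = not any(os.environ.get(k) for k in cloud_keys)

print(ollama_host)
```

Reading configuration exclusively from the environment keeps the CLI stateless: the same flags-plus-env contract works identically in a terminal, a container, or CI.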
**AI-powered video analysis, knowledge extraction, and planning.**

PlanOpticon processes video recordings and documents into structured knowledge — transcripts, diagrams, action items, key points, and knowledge graphs. It connects to 20+ source platforms, auto-discovers available models across multiple AI providers, and produces rich multi-format output with an interactive companion REPL and planning agent.

---

## Features

- **Multi-provider AI** — Automatically discovers and routes to the best available model across OpenAI, Anthropic, Google Gemini, and more
- **Planning agent** — Agentic analysis that adaptively adjusts depth, focus, and strategy based on content
- **Companion REPL** — Interactive chat interface for exploring your knowledge base conversationally
- **20+ source connectors** — Google Workspace, Microsoft 365, Zoom, Teams, Meet, Notion, GitHub, YouTube, Obsidian, Apple Notes, and more
- **Document export** — Export knowledge to Markdown, Obsidian, Notion, and exchange formats
- **OAuth authentication** — Built-in `planopticon auth` for Google, Dropbox, Zoom, Notion, GitHub, and Microsoft
- **Smart frame extraction** — Change detection for transitions + periodic capture (every 30s) for slow-evolving content like document scrolling
- **People frame filtering** — OpenCV face detection removes webcam/video conference frames, keeping only shared content (slides, documents, screen shares)
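Change-detection frame extraction, as described above, reduces to comparing consecutive frames and keeping those that differ sharply. This toy uses flat lists of pixel values in place of decoded video frames; the threshold value is an illustrative assumption:

```python
def changed_frames(frames, threshold=10.0):
    """Return indices of frames whose mean absolute pixel difference from
    the previous frame exceeds the threshold (e.g. a slide transition).
    A real pipeline would also add periodic captures every 30s so that
    slow-evolving content like document scrolling is not missed."""
    changes = []
    for i in range(1, len(frames)):
        prev, cur = frames[i - 1], frames[i]
        diff = sum(abs(a - b) for a, b in zip(prev, cur)) / len(cur)
        if diff > threshold:
            changes.append(i)
    return changes


slide_a = [10] * 64   # dark 8x8 "frame"
slide_b = [200] * 64  # bright 8x8 "frame"
print(changed_frames([slide_a, slide_a, slide_b, slide_b]))  # [2]
```

In practice the comparison runs on downscaled grayscale frames for speed, and the flagged frames are then passed to face filtering and diagram classification.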