Update docs, README, and add interactive CLI menu
- Replace README with accurate feature list, install instructions, and output structure
- Delete stale README_new.md
- Add agent-analyze and auth commands to CLI reference docs
- Add periodic capture and face detection to configuration docs
- Remove "Coming soon" from cloud sources (already implemented)
- Update index features and copyright year
- Add interactive numbered menu when running planopticon with no args
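The numbered no-arg menu could look roughly like the sketch below. The command names, descriptions, and wording are assumptions for illustration, not the actual implementation:

```python
# Hypothetical sketch of the interactive menu shown when planopticon
# is run with no arguments. COMMANDS is an assumed list, not the real CLI table.
COMMANDS = [
    ("analyze", "Analyze a single video"),
    ("batch", "Process a folder of videos"),
    ("auth", "Configure API keys"),
]

def build_menu(commands):
    """Render a numbered menu, one '<n>. <name> - <desc>' line per command."""
    return "\n".join(
        f"{i}. {name} - {desc}" for i, (name, desc) in enumerate(commands, 1)
    )

def pick(commands, choice):
    """Map a 1-based numeric choice to a command name; None if out of range."""
    if 1 <= choice <= len(commands):
        return commands[choice - 1][0]
    return None
```

In the real CLI this would wrap `input()` in a loop and dispatch to the existing subcommands; the sketch only shows the menu/selection logic.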
--- README.md
+++ README.md
@@ -1,176 +1,101 @@
-PlanOpticon
-Comprehensive Video Analysis & Knowledge Extraction CLI
-PlanOpticon is an advanced AI-powered CLI tool that conducts thorough analysis of video content, extracting structured knowledge, diagrams, and actionable insights. Using state-of-the-art computer vision and natural language processing techniques, PlanOpticon transforms video assets into valuable, structured information.
-
-Core Features
-
-Complete Transcription: Full speech-to-text with speaker attribution and semantic segmentation
-Visual Element Extraction: Automated recognition and digitization of diagrams, charts, whiteboards, and visual aids
-Action Item Detection: Intelligent identification and prioritization of tasks, commitments, and follow-ups
-Knowledge Structure: Organization of extracted content into searchable, related concepts
-Plan Generation: Synthesis of extracted elements into cohesive action plans and summaries
-
-Technical Implementation
-PlanOpticon leverages cloud APIs and efficient processing pipelines to achieve comprehensive video analysis:
-Architecture Overview
-```
-Video Input → Frame Extraction → Cloud API Integration → Knowledge Integration → Structured Output
-                    ↓                       ↓                        ↓
-             Frame Selection       API Request Management     Result Processing
-             • Key Frame           • Vision API Calls         • Content Organization
-             • Scene Detection     • Speech-to-Text API       • Relationship Mapping
-             • Content Changes     • LLM Analysis API         • Mermaid Generation
-```
-Key Components
-
-Cloud API integration for speech-to-text transcription
-Vision API utilization for diagram and visual content detection
-LLM-powered content analysis and summarization
-Efficient prompt engineering for specialized content extraction
-Knowledge integration system for relationship mapping and organization
-├── transcript.json        # Full transcription with timestamps and speakers
-├── key_points.md          # Extracted main concepts and ideas
-├── diagrams/              # Extracted and digitized visual elements
-│   ├── diagram_001.svg
-│   └── whiteboard_001.svg
-├── action_items.json      # Prioritized tasks and commitments
-└── knowledge_graph.json   # Relationship map of concepts
-
-Development Guidelines
-When contributing to PlanOpticon, please adhere to these principles:
-Code Standards
-
-Follow PEP 8 style guidelines for all Python code
-Write comprehensive docstrings using NumPy/Google style
-Maintain test coverage above 80%
-Use type hints consistently throughout the codebase
-
-Architecture Considerations
-
-Optimize for cross-platform compatibility (macOS, Linux, Windows)
-Ensure ARM architecture support for cloud deployment and Apple Silicon
-Implement graceful degradation when GPU is unavailable
-Design modular components with clear interfaces
-
-System Requirements
-
-Python 3.9+
-8GB RAM minimum (16GB recommended)
-2GB disk space for models and dependencies
-CUDA-compatible GPU (optional, for accelerated processing)
-ARM64 or x86_64 architecture
-
-Implementation Strategy
-The core processing pipeline requires thoughtful implementation of several key systems:
-
-Frame extraction and analysis
-
-Implement selective sampling based on visual change detection
-Utilize region proposal networks for element identification
-
-Speech processing
-
-Apply time-domain speaker diarization
-Implement context-aware transcription with domain adaptation
-
-Visual element extraction
-
-Develop whiteboard/diagram detection with boundary recognition
-Implement reconstruction of visual elements into vector formats
-
-Knowledge integration
-
-Create hierarchical structure of extracted concepts
-Generate relationship mappings between identified elements
-
-Action item synthesis
-
-Apply intent recognition for commitment identification
-Implement priority scoring based on contextual importance
-
-Each component should be implemented as a separate module with clear interfaces, allowing for independent testing and optimization.
-
-Development Approach
-When implementing PlanOpticon, consider these architectural principles:
-
-Pipeline Architecture
-
-Design processing stages that can operate independently
-Implement data passing between stages using standardized formats
-Enable parallelization where appropriate
-Consider using Python's asyncio for I/O-bound operations
-
-Performance Optimization
-
-Implement batched processing for GPU acceleration
-Use memory mapping for large video files
-Consider JIT compilation for performance-critical sections
-Profile and optimize bottlenecks systematically
-
-Error Handling
-
-Implement comprehensive exception handling
-Design graceful degradation paths for each component
-Provide detailed logging for troubleshooting
-Consider retry mechanisms for transient failures
-
-Testing Strategy
-
-Create comprehensive unit tests for each module
-Implement integration tests for end-to-end pipeline
-Develop benchmark tests for performance evaluation
-Use property-based testing for complex components
-
-The implementation should maintain separation of concerns while ensuring efficient data flow between components. Consider using dependency injection patterns to improve testability and component isolation.
-
-License
-MIT License
-
-Contact
-For questions or contributions, please open an issue on GitHub or contact the maintainers at [email protected].
+# PlanOpticon
+
+**AI-powered video analysis and knowledge extraction.**
+
+PlanOpticon processes video recordings into structured knowledge — transcripts, diagrams, action items, key points, and knowledge graphs. It auto-discovers available models across OpenAI, Anthropic, and Gemini, and produces rich multi-format output.
+
+## Features
+
+- **Multi-provider AI** — Auto-discovers and routes to the best available model across OpenAI, Anthropic, and Google Gemini
+- **Smart frame extraction** — Change detection for transitions + periodic capture for slow-evolving content (document scrolling, screen shares)
+- **People frame filtering** — OpenCV face detection automatically removes webcam/video conference frames, keeping only shared content
+- **Diagram extraction** — Vision model classification detects flowcharts, architecture diagrams, charts, and whiteboards
+- **Knowledge graphs** — Extracts entities and relationships, builds and merges knowledge graphs across videos
+- **Action item detection** — Finds commitments, tasks, and follow-ups with assignees and deadlines
+- **Batch processing** — Process entire folders of videos with merged knowledge graphs and cross-referencing
+Full documentation at [planopticon.dev](https://planopticon.dev)
+
+## License
+
+MIT License — Copyright (c) 2026 CONFLICT LLC
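The smart frame extraction described in the new README (change detection for transitions plus periodic capture for slow-evolving content) could be sketched roughly as below. The function names, defaults, and frame representation are illustrative assumptions, not PlanOpticon's actual code:

```python
def frame_diff(a, b):
    """Mean absolute pixel difference between two equal-length frames
    (frames here are flat lists of grayscale pixel values for simplicity)."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

def select_frames(frames, change_threshold=12.0, periodic_every=5):
    """Keep a frame when it differs enough from the last kept frame
    (change detection), or when `periodic_every` frames have passed
    since the last keep (periodic capture for slow-evolving content).
    Returns the indices of kept frames."""
    kept, last, since = [], None, 0
    for i, frame in enumerate(frames):
        changed = last is None or frame_diff(frame, last) >= change_threshold
        if changed or since >= periodic_every:
            kept.append(i)
            last, since = frame, 0
        else:
            since += 1
    return kept
```

On a perfectly static screen share this still keeps one frame every `periodic_every + 1` candidates, which is the point of periodic capture: content that drifts below the change threshold (e.g., slow document scrolling) is not lost entirely.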
DELETED README_new.md (-149 lines)
--- a/README_new.md
+++ b/README_new.md
@@ -1,149 +0,0 @@
-# PlanOpticon
-
-Comprehensive Video Analysis & Knowledge Extraction CLI
-
-## Overview
-
-PlanOpticon is an advanced AI-powered CLI tool that conducts thorough analysis of video content, extracting structured knowledge, diagrams, and actionable insights. Using state-of-the-art computer vision and natural language processing techniques, PlanOpticon transforms video assets into valuable, structured information.
-
-## Core Features
-
-- **Complete Transcription**: Full speech-to-text with speaker attribution and semantic segmentation
-- **Visual Element Extraction**: Automated recognition and digitization of diagrams, charts, whiteboards, and visual aids
-- **Action Item Detection**: Intelligent identification and prioritization of tasks, commitments, and follow-ups
-- **Knowledge Structure**: Organization of extracted content into searchable, related concepts
-- **Plan Generation**: Synthesis of extracted elements into cohesive action plans and summaries
-
-## Installation
-
-### Prerequisites
-
-- Python 3.9+
-- FFmpeg (for audio/video processing)
-- API keys for cloud services (OpenAI, Google Cloud, etc.)
-│   ├── video_name.json    # Full transcription with timestamps and speakers
-│   ├── video_name.txt     # Plain text transcription
-│   └── video_name.srt     # Subtitle format
-├── frames/                # Extracted key frames
-│   ├── frame_0001.jpg
-│   └── frame_0002.jpg
-├── audio/                 # Extracted audio
-│   └── video_name.wav
-├── diagrams/              # Extracted and digitized visual elements
-│   ├── diagram_001.svg
-│   └── whiteboard_001.svg
-└── cache/                 # API response cache
-```
-
-## Development
-
-### Architecture
-
-PlanOpticon follows a modular pipeline architecture:
-
-```
-video_processor/
-├── extractors/    # Video and audio extraction
-├── api/           # Cloud API integrations
-├── analyzers/     # Content analysis components
-├── integrators/   # Knowledge integration
-├── utils/         # Common utilities
-└── cli/           # Command-line interface
-```
-
-### Code Standards
-
-- Follow PEP 8 style guidelines for all Python code
-- Write comprehensive docstrings using NumPy style
-- Include type hints consistently throughout the codebase
-- Maintain test coverage for key components
-
-### Testing
-
-Run tests with pytest:
-
-```bash
-pytest
-```
-
-## System Requirements
-
-- Python 3.9+
-- 8GB RAM minimum (16GB recommended)
-- 2GB disk space for models and dependencies
-- CUDA-compatible GPU (optional, for accelerated processing)
-- ARM64 or x86_64 architecture
-
-## License
-
-MIT License
-
-## Roadmap
-
-See [work_plan.md](work_plan.md) for detailed development roadmap and milestones.
-
-## Contributing
-
-Contributions are welcome! Please feel free to submit a Pull Request.
-
-## Contact
-
-For questions or contributions, please open an issue on GitHub or contact the maintainers at [email protected].
-Lower `change-threshold` = more frames kept. Higher `sampling-rate` = more candidates.
+Lower `change-threshold` = more frames kept. Higher `sampling-rate` = more candidates. Periodic capture catches content that changes too slowly for change detection (e.g., scrolling through a document during a screen share).
+
+People/webcam frames are automatically filtered out using face detection — no configuration needed.
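The interplay the configuration docs describe (sampling rate controls how many candidate frames exist; the change threshold then decides which survive) might be sketched as follows; the function is illustrative, not PlanOpticon's API:

```python
def candidate_timestamps(duration_s, sampling_rate_hz):
    """Timestamps (in seconds) at which candidate frames are grabbed.
    A higher sampling rate yields more candidates for the change
    detector to score against `change-threshold`."""
    step = 1.0 / sampling_rate_hz
    count = int(duration_s * sampling_rate_hz)
    return [round(i * step, 3) for i in range(count)]
```

For a 10-second clip, doubling the sampling rate doubles the candidate pool; the change threshold and periodic capture then operate on these candidates rather than on every decoded frame.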