# Document Ingestion

Document ingestion lets you process files -- PDFs, Markdown, and plaintext -- into a knowledge graph. PlanOpticon extracts text from documents, chunks it into manageable pieces, runs LLM-powered entity and relationship extraction, and stores the results in a FalkorDB knowledge graph. This is the same knowledge graph format produced by video analysis, so you can combine video and document insights in a single graph.

## Supported formats

| Extension | Processor | Description |
|-----------|-----------|-------------|
| `.pdf` | `PdfProcessor` | Extracts text page by page using pymupdf or pdfplumber |
| `.md`, `.markdown` | `MarkdownProcessor` | Splits on headings into sections |
| `.txt`, `.text`, `.log`, `.csv` | `PlaintextProcessor` | Splits on paragraph boundaries |

Additional formats can be added by implementing the `DocumentProcessor` base class and registering it (see [Extending with custom processors](#extending-with-custom-processors) below).

## CLI usage

### `planopticon ingest`

```
planopticon ingest INPUT_PATH [OPTIONS]
```

**Arguments:**

| Argument | Description |
|----------|-------------|
| `INPUT_PATH` | Path to a file or directory to ingest (must exist) |

**Options:**

| Option | Short | Default | Description |
|--------|-------|---------|-------------|
| `--output` | `-o` | Current directory | Output directory for the knowledge graph |
| `--db-path` | | None | Path to an existing `knowledge_graph.db` to merge into |
| `--recursive / --no-recursive` | `-r` | `--recursive` | Recurse into subdirectories (directory ingestion only) |
| `--provider` | `-p` | `auto` | LLM provider for entity extraction (`openai`, `anthropic`, `gemini`, `ollama`, `azure`, `together`, `fireworks`, `cerebras`, `xai`) |
| `--chat-model` | | None | Override the model used for LLM entity extraction |

### Single file ingestion

Process a single document and create a new knowledge graph:

```bash
planopticon ingest spec.md
```

This creates `knowledge_graph.db` and `knowledge_graph.json` in the current directory.

Specify an output directory:

```bash
planopticon ingest report.pdf -o ./results
```

This creates `./results/knowledge_graph.db` and `./results/knowledge_graph.json`.

### Directory ingestion

Process all supported files in a directory:

```bash
planopticon ingest ./docs/
```

By default, this recurses into subdirectories. To process only the top-level directory:

```bash
planopticon ingest ./docs/ --no-recursive
```

PlanOpticon automatically filters for supported file extensions. Unsupported files are silently skipped.
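
The filtering described above can be sketched as follows. This is an illustration only, not PlanOpticon's implementation: `collect_supported_files` and `SUPPORTED_EXTENSIONS` are hypothetical names, the extension set is copied from the Supported formats table, and case-insensitive matching is an assumption.

```python
from pathlib import Path

# Assumption: mirrors the extensions listed under "Supported formats".
SUPPORTED_EXTENSIONS = {".pdf", ".md", ".markdown", ".txt", ".text", ".log", ".csv"}


def collect_supported_files(directory, recursive=True):
    """Return supported files under `directory`, sorted for stable ordering."""
    root = Path(directory)
    # rglob walks subdirectories; glob stays at the top level.
    candidates = root.rglob("*") if recursive else root.glob("*")
    return sorted(
        p for p in candidates
        # Assumption: extensions are compared case-insensitively.
        if p.is_file() and p.suffix.lower() in SUPPORTED_EXTENSIONS
    )
```

Calling `collect_supported_files("./docs", recursive=False)` corresponds to the `--no-recursive` behavior: files in subdirectories are never visited, and anything with an unlisted extension is dropped without an error.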

### Merging into an existing knowledge graph

To add document content to an existing knowledge graph (e.g., one created from video analysis), use `--db-path`:

```bash
# First, analyze a video
planopticon analyze meeting.mp4 -o ./results

# Then, ingest supplementary documents into the same graph
planopticon ingest ./meeting-notes/ --db-path ./results/knowledge_graph.db
```

The ingested entities and relationships are merged with the existing graph. Duplicate entities are consolidated automatically by the knowledge graph engine.

### Choosing an LLM provider

Entity and relationship extraction requires an LLM. By default, PlanOpticon auto-detects available providers based on your environment variables. You can override this:

```bash
# Use Anthropic for extraction
planopticon ingest docs/ -p anthropic

# Use a specific model
planopticon ingest docs/ -p openai --chat-model gpt-4o

# Use a local Ollama model
planopticon ingest docs/ -p ollama --chat-model llama3
```

### Output

After ingestion, PlanOpticon prints a summary:

```
Knowledge graph: ./knowledge_graph.db
spec.md: 12 chunks
architecture.md: 8 chunks
requirements.txt: 3 chunks

Ingestion complete:
Files processed: 3
Total chunks: 23
Entities extracted: 47
Relationships: 31
Knowledge graph: ./knowledge_graph.db
```

Both `.db` (SQLite/FalkorDB) and `.json` formats are saved automatically.

## How each processor works

### PDF processor

The `PdfProcessor` extracts text from PDF files on a per-page basis. It tries two extraction libraries in order:

1. **pymupdf** (preferred) -- Fast, reliable text extraction. Install with `pip install pymupdf`.
2. **pdfplumber** (fallback) -- Alternative extractor. Install with `pip install pdfplumber`.

If neither library is installed, the processor raises an `ImportError` with installation instructions.

Each page becomes a separate `DocumentChunk` with:

- `text`: The extracted text content of the page
- `page`: The 1-based page number
- `metadata.extraction_method`: Which library was used (`pymupdf` or `pdfplumber`)

To install PDF support:

```bash
pip install 'planopticon[pdf]'
# or
pip install pymupdf
# or
pip install pdfplumber
```
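
The preferred-then-fallback behavior can be sketched roughly as below. This is not the `PdfProcessor` source: `choose_pdf_backend` and `extract_pages` are hypothetical helpers, and the sketch assumes a recent PyMuPDF release that is importable as `pymupdf` (older releases are imported as `fitz`).

```python
import importlib.util


def choose_pdf_backend(is_installed=None):
    """Return "pymupdf" if available, else "pdfplumber"; raise ImportError if neither."""
    if is_installed is None:
        # Default probe: look the module up without importing it.
        is_installed = lambda name: importlib.util.find_spec(name) is not None
    for backend in ("pymupdf", "pdfplumber"):  # preference order from the text
        if is_installed(backend):
            return backend
    raise ImportError(
        "No PDF library found. Install one with "
        "'pip install pymupdf' or 'pip install pdfplumber'."
    )


def extract_pages(path):
    """Yield (page_number, text, extraction_method) with 1-based page numbers."""
    backend = choose_pdf_backend()
    if backend == "pymupdf":
        import pymupdf

        with pymupdf.open(path) as doc:
            for number, page in enumerate(doc, start=1):
                yield number, page.get_text(), backend
    else:
        import pdfplumber

        with pdfplumber.open(path) as pdf:
            for number, page in enumerate(pdf.pages, start=1):
                # extract_text() may return None for pages without a text layer.
                yield number, page.extract_text() or "", backend
```

The `is_installed` parameter exists only to make the selection logic easy to exercise without either library present.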

### Markdown processor

The `MarkdownProcessor` splits Markdown files on heading boundaries (lines starting with `#` through `######`). Each heading and its content until the next heading becomes a separate chunk.

**Splitting behavior:**

- If the file contains headings, each heading section becomes a chunk. The `section` field records the heading text.
- Content before the first heading is captured as a `(preamble)` chunk.
- If the file contains no headings, it falls back to paragraph-based chunking (same as plaintext).

For example, consider a file with this structure:

```markdown
Some intro text.

# Architecture

The system uses a microservices architecture...

## Components

There are three main components...

# Deployment

Deployment is handled via...
```

It produces four chunks: `(preamble)`, `Architecture`, `Components`, and `Deployment`.
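
A simplified sketch of heading-based splitting is shown below. It is an assumption-laden illustration rather than the `MarkdownProcessor` code: `split_markdown_sections` is a hypothetical name, headings inside fenced code blocks are not treated specially, and the fallback to paragraph chunking for files without headings is left out.

```python
import re

# ATX headings: one to six '#' characters, whitespace, then the heading text.
HEADING_RE = re.compile(r"^(#{1,6})\s+(.*)$")


def split_markdown_sections(text):
    """Return a list of (section_title, section_text) pairs."""
    sections = []
    title, lines = "(preamble)", []
    for line in text.splitlines():
        match = HEADING_RE.match(line)
        if match:
            # Flush the section collected so far, skipping empty ones.
            if "".join(lines).strip():
                sections.append((title, "\n".join(lines).strip()))
            title, lines = match.group(2).strip(), [line]
        else:
            lines.append(line)
    # Flush the final section.
    if "".join(lines).strip():
        sections.append((title, "\n".join(lines).strip()))
    return sections
```

Applied to the example file above, the titles come out as `(preamble)`, `Architecture`, `Components`, and `Deployment`, in that order.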

### Plaintext processor

The `PlaintextProcessor` handles `.txt`, `.text`, `.log`, and `.csv` files. It splits text on paragraph boundaries (double newlines) and groups paragraphs into chunks with a configurable maximum size.

**Chunking parameters:**

| Parameter | Default | Description |
|-----------|---------|-------------|
| `max_chunk_size` | 2000 characters | Maximum size of each chunk |
| `overlap` | 200 characters | Number of characters from the end of one chunk to repeat at the start of the next |

The overlap ensures that entities or context spanning a chunk boundary are not lost. Chunks are built by accumulating paragraphs until the next paragraph would push the chunk past `max_chunk_size`, at which point the current chunk is flushed and a new one begins.
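
The accumulate-and-flush behavior can be approximated with the sketch below. `chunk_paragraphs` is a hypothetical function written for illustration; details such as how the overlap tail is joined to the next paragraph, and how a single oversized paragraph is handled, are assumptions and may differ from the real processor.

```python
def chunk_paragraphs(text, max_chunk_size=2000, overlap=200):
    """Group paragraphs into chunks of roughly max_chunk_size characters."""
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    chunks, current = [], ""
    for paragraph in paragraphs:
        candidate = f"{current}\n\n{paragraph}" if current else paragraph
        if current and len(candidate) > max_chunk_size:
            chunks.append(current)  # flush the full chunk
            # Seed the next chunk with the tail of the previous one.
            tail = current[-overlap:] if overlap > 0 else ""
            current = f"{tail}\n\n{paragraph}" if tail else paragraph
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks
```

In this sketch a paragraph longer than `max_chunk_size` is kept whole, and the overlap tail is added on top of the next paragraph, so a chunk can exceed the limit slightly.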

## The ingestion pipeline

Document ingestion follows this pipeline:

```
File on disk
|
v
Processor selection (by file extension)
|
v
Text extraction (PDF pages / Markdown sections / plaintext paragraphs)
|
v
DocumentChunk objects (text + metadata)
|
v
Source registration (provenance tracking in the KG)
|
v
KG content addition (LLM entity/relationship extraction per chunk)
|
v
Knowledge graph storage (.db + .json)
```

### Step 1: Processor selection

PlanOpticon maintains a registry of processors keyed by file extension. When you call `ingest_file()`, it looks up the appropriate processor using `get_processor(path)`. If no processor is registered for the file extension, a `ValueError` is raised.
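
An extension-keyed registry of this kind might look like the sketch below. Only the name `get_processor` comes from the text; `_PROCESSORS`, `register_processor`, the stand-in `PlaintextProcessor` class, and the choice to return an instance rather than a class are assumptions made for illustration.

```python
from pathlib import Path

# Hypothetical registry: maps a lowercase file extension to a processor class.
_PROCESSORS = {}


def register_processor(extensions, processor_cls):
    """Associate each extension (for example ".txt") with a processor class."""
    for ext in extensions:
        _PROCESSORS[ext.lower()] = processor_cls


def get_processor(path):
    """Return a processor instance for `path`, or raise ValueError."""
    ext = Path(path).suffix.lower()
    processor_cls = _PROCESSORS.get(ext)
    if processor_cls is None:
        raise ValueError(f"No processor registered for extension {ext!r}")
    return processor_cls()


class PlaintextProcessor:  # stand-in for the real class
    pass


register_processor([".txt", ".text", ".log", ".csv"], PlaintextProcessor)
```

With this layout, supporting a new format only requires one more `register_processor` call; the lookup code does not change.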

### Step 2: Text extraction

The selected processor reads the file and produces a list of `DocumentChunk` objects. Each chunk contains:

| Field | Type | Description |
|-------|------|-------------|
| `text` | `str` | The extracted text content |
| `source_file` | `str` | Path to the source file |
| `chunk_index` | `int` | Sequential index of this chunk within the file |
| `page` | `Optional[int]` | Page number (PDF only, 1-based) |
| `section` | `Optional[str]` | Section heading (Markdown only) |
| `metadata` | `Dict[str, Any]` | Additional metadata (e.g., extraction method) |
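
For orientation, the fields in the table map onto a structure like the one below. The field names and types are taken from the table; modelling it as a dataclass and the default values shown are assumptions, and the real class may be defined differently.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class DocumentChunk:
    text: str
    source_file: str
    chunk_index: int
    page: Optional[int] = None       # set for PDF chunks (1-based)
    section: Optional[str] = None    # set for Markdown chunks
    metadata: Dict[str, Any] = field(default_factory=dict)


# Example: the first page of a PDF extracted with pymupdf.
chunk = DocumentChunk(
    text="Page text",
    source_file="report.pdf",
    chunk_index=0,
    page=1,
    metadata={"extraction_method": "pymupdf"},
)
```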

### Step 3: Source registration

Each ingested file is registered as a source in the knowledge graph with provenance metadata:

- `source_id`: The first 12 characters of a SHA-256 hash of the absolute file path, unless you provide a custom ID
- `source_type`: Always `"document"`
- `title`: The file stem (filename without extension)
- `path`: The file path
- `mime_type`: Detected MIME type
- `ingested_at`: ISO-8601 timestamp
- `metadata`: Chunk count and file extension
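
A default `source_id` of this shape can be derived as in the sketch below. `default_source_id` is a hypothetical helper; using the hexadecimal digest, UTF-8 encoding, and `Path.resolve()` to obtain the absolute path are assumptions about details the text does not specify.

```python
import hashlib
from pathlib import Path


def default_source_id(path):
    """Return the first 12 hex characters of the SHA-256 of the absolute path."""
    absolute = str(Path(path).resolve())
    return hashlib.sha256(absolute.encode("utf-8")).hexdigest()[:12]
```

Because the ID depends only on the absolute path, ingesting the same file twice yields the same ID, while moving or renaming the file yields a new one.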

### Step 4: Entity and relationship extraction

Each chunk's text is passed to `knowledge_graph.add_content()`, which uses the configured LLM provider to extract entities and relationships. The content source is tagged with the document name and either the page number or section name:

- `document:report.pdf:page:3`
- `document:spec.md:section:Architecture`
- `document:notes.txt` (no page or section)
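
The three tag shapes listed above can be produced by a small helper like the following. `content_source` is a hypothetical name, and two details are assumptions: that the tag uses the file name rather than the full path, and that a page number takes precedence over a section if both are present.

```python
from pathlib import Path


def content_source(source_file, page=None, section=None):
    """Build a content-source tag for one chunk."""
    tag = f"document:{Path(source_file).name}"
    if page is not None:
        return f"{tag}:page:{page}"
    if section is not None:
        return f"{tag}:section:{section}"
    return tag
```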

### Step 5: Storage

The knowledge graph is saved in both `.db` (SQLite-backed FalkorDB) and `.json` formats.
258
{ copied = false; pop = false }, 1000)" :class="copied && 'copied'">
Copy link Copied!
259
{ copied = false; pop = false }, 1000)" :class="copied && 'copied'">
Copy link Copied!
## Combining with video analysis
260
{ copied = false; pop = false }, 1000)" :class="copied && 'copied'">
Copy link Copied!
261
{ copied = false; pop = false }, 1000)" :class="copied && 'copied'">
Copy link Copied!
A common workflow is to analyze a video recording and then ingest related documents into the same knowledge graph:

```bash
# Step 1: Analyze the meeting recording
planopticon analyze meeting-recording.mp4 -o ./project-kg

# Step 2: Ingest the meeting agenda
planopticon ingest agenda.md --db-path ./project-kg/knowledge_graph.db

# Step 3: Ingest the project spec
planopticon ingest project-spec.pdf --db-path ./project-kg/knowledge_graph.db

# Step 4: Ingest a whole docs folder
planopticon ingest ./reference-docs/ --db-path ./project-kg/knowledge_graph.db

# Step 5: Query the combined graph
planopticon query --db-path ./project-kg/knowledge_graph.db
```

The resulting knowledge graph contains entities and relationships from all sources -- video transcripts, meeting agendas, specs, and reference documents -- with full provenance tracking so you can trace any entity back to its source.
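
As a rough illustration of tracing provenance for document sources, the sketch below splits a content-source tag of the form described in Step 4 back into its parts. `parse_source_tag` is a hypothetical helper, not a PlanOpticon function; how tags are attached to entities in the graph is not covered here, and the sketch assumes file names contain no colons.

```python
from typing import Dict, Optional


def parse_source_tag(tag: str) -> Dict[str, Optional[str]]:
    """Split 'document:<file>[:page:<n> | :section:<name>]' into its parts."""
    # maxsplit=3 keeps any colons inside a section name intact
    parts = tag.split(":", 3)
    if len(parts) < 2 or parts[0] != "document":
        raise ValueError(f"not a document source tag: {tag!r}")
    result: Dict[str, Optional[str]] = {
        "file": parts[1],
        "page": None,
        "section": None,
    }
    if len(parts) == 4 and parts[2] in ("page", "section"):
        result[parts[2]] = parts[3]
    return result


print(parse_source_tag("document:report.pdf:page:3"))
print(parse_source_tag("document:spec.md:section:Architecture"))
print(parse_source_tag("document:notes.txt"))
```

Page numbers come back as strings; convert them with `int()` if you need to sort or compare them.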

## Python API

### Ingesting a single file

```python
from pathlib import Path
from video_processor.integrators.knowledge_graph import KnowledgeGraph
from video_processor.processors.ingest import ingest_file

kg = KnowledgeGraph(db_path=Path("knowledge_graph.db"))
chunk_count = ingest_file(Path("document.pdf"), kg)
print(f"Processed {chunk_count} chunks")

kg.save(Path("knowledge_graph.db"))
```

### Ingesting a directory

```python
from pathlib import Path
from video_processor.integrators.knowledge_graph import KnowledgeGraph
from video_processor.processors.ingest import ingest_directory

kg = KnowledgeGraph(db_path=Path("knowledge_graph.db"))
results = ingest_directory(
    Path("./docs"),
    kg,
    recursive=True,
    extensions=[".md", ".pdf"],  # Optional: filter by extension
)

for filepath, chunks in results.items():
    print(f" {filepath}: {chunks} chunks")

kg.save(Path("knowledge_graph.db"))
```
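
The loop above suggests that `ingest_directory` returns a mapping from file path to chunk count. Assuming that shape, a summary per file extension could be computed as sketched below. `summarize_ingest` is a hypothetical helper, and the sample dictionary stands in for real results so the snippet runs without PlanOpticon installed.

```python
from collections import defaultdict
from pathlib import Path
from typing import Dict


def summarize_ingest(results: Dict[str, int]) -> Dict[str, int]:
    """Total the chunk counts per file extension (hypothetical helper)."""
    totals: Dict[str, int] = defaultdict(int)
    for filepath, chunks in results.items():
        # Path() accepts both str and Path keys
        totals[Path(filepath).suffix.lower()] += chunks
    return dict(totals)


# Stand-in data with the assumed shape of the `results` mapping
sample = {"docs/spec.md": 4, "docs/report.pdf": 12, "docs/notes.md": 2}
print(summarize_ingest(sample))
```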

### Listing supported extensions

```python
from video_processor.processors.base import list_supported_extensions

extensions = list_supported_extensions()
print(extensions)
# ['.csv', '.log', '.markdown', '.md', '.pdf', '.text', '.txt']
```
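
One use for this list is checking in advance which files in a folder would be picked up by ingestion. The sketch below hard-codes the extensions from the example output so that it runs standalone; in real code you would pass in the result of `list_supported_extensions()`. `find_ingestable` is a hypothetical helper, and the case-insensitive match is an assumption.

```python
from pathlib import Path
from typing import Iterable, List

# Copied from the example output above
SUPPORTED = (".csv", ".log", ".markdown", ".md", ".pdf", ".text", ".txt")


def find_ingestable(
    paths: Iterable[Path],
    extensions: Iterable[str] = SUPPORTED,
) -> List[Path]:
    """Return the paths whose suffix (case-insensitive) is in `extensions`."""
    allowed = {ext.lower() for ext in extensions}
    return [p for p in paths if p.suffix.lower() in allowed]


candidates = [Path("README.md"), Path("diagram.png"), Path("Report.PDF")]
print(find_ingestable(candidates))
```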

### Working with processors directly

```python
from pathlib import Path
from video_processor.processors.base import get_processor

processor = get_processor(Path("report.pdf"))
if processor:
    chunks = processor.process(Path("report.pdf"))
    for chunk in chunks:
        print(f"Page {chunk.page}: {chunk.text[:100]}...")
```

## Extending with custom processors

To add support for a new file format, implement the `DocumentProcessor` abstract class and register it:

```python
from pathlib import Path
from typing import List
from video_processor.processors.base import (
    DocumentChunk,
    DocumentProcessor,
    register_processor,
)


class HtmlProcessor(DocumentProcessor):
    supported_extensions = [".html", ".htm"]

    def can_process(self, path: Path) -> bool:
        return path.suffix.lower() in self.supported_extensions

    def process(self, path: Path) -> List[DocumentChunk]:
        # Imported lazily so beautifulsoup4 is only needed when HTML is ingested
        from bs4 import BeautifulSoup

        soup = BeautifulSoup(path.read_text(), "html.parser")
        text = soup.get_text(separator="\n")
        # Return the whole document as a single chunk
        return [
            DocumentChunk(
                text=text,
                source_file=str(path),
                chunk_index=0,
            )
        ]


register_processor(HtmlProcessor.supported_extensions, HtmlProcessor)
```

After registration, `planopticon ingest` will automatically handle `.html` and `.htm` files.
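
To make the registration step less opaque, here is a minimal, self-contained sketch of how an extension-to-processor registry of this kind could work. It is not PlanOpticon's implementation: `Processor`, `UpperCaseProcessor`, `_REGISTRY`, `register`, and `lookup` are illustrative names, and the real `register_processor` and `get_processor` may behave differently.

```python
from pathlib import Path
from typing import Dict, List, Optional, Type


class Processor:
    """Stand-in for a document processor base class (illustrative only)."""

    def process(self, path: Path) -> List[str]:
        raise NotImplementedError


class UpperCaseProcessor(Processor):
    """Toy processor that returns the upper-cased file name as its only chunk."""

    def process(self, path: Path) -> List[str]:
        return [path.name.upper()]


# Maps a lowercase extension (with leading dot) to a processor class
_REGISTRY: Dict[str, Type[Processor]] = {}


def register(extensions: List[str], processor_cls: Type[Processor]) -> None:
    """Associate each extension with a processor class; later calls override."""
    for ext in extensions:
        _REGISTRY[ext.lower()] = processor_cls


def lookup(path: Path) -> Optional[Processor]:
    """Return a processor instance for the path's extension, or None."""
    processor_cls = _REGISTRY.get(path.suffix.lower())
    return processor_cls() if processor_cls else None


register([".html", ".htm"], UpperCaseProcessor)
print(type(lookup(Path("index.HTML"))).__name__)
print(lookup(Path("photo.png")))
```

In a design like this, registration only takes effect in the process where `register` has run, so the module defining a custom processor has to be imported before any lookup happens.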

## Companion REPL

Inside the interactive companion REPL, you can ingest files using the `/ingest` command:

```
> /ingest ./meeting-notes.md
Ingested meeting-notes.md: 5 chunks
```

This adds content to the currently loaded knowledge graph.

## Common workflows

### Build a project knowledge base from scratch

```bash
# Ingest all project docs
planopticon ingest ./project-docs/ -o ./knowledge-base

# Query what was captured
planopticon query --db-path ./knowledge-base/knowledge_graph.db

# Export as an Obsidian vault
planopticon export obsidian ./knowledge-base/knowledge_graph.db -o ./vault
```

### Incrementally build a knowledge graph

```bash
# Start with initial docs
planopticon ingest ./sprint-1-docs/ -o ./kg

# Add more docs over time
planopticon ingest ./sprint-2-docs/ --db-path ./kg/knowledge_graph.db
planopticon ingest ./sprint-3-docs/ --db-path ./kg/knowledge_graph.db

# The graph grows with each ingestion
planopticon query --db-path ./kg/knowledge_graph.db stats
```

### Ingest from Google Workspace or Microsoft 365

PlanOpticon provides integrated commands that fetch cloud documents and ingest them in one step:

```bash
# Google Workspace
planopticon gws ingest --folder-id FOLDER_ID -o ./results

# Microsoft 365 / SharePoint
planopticon m365 ingest --web-url https://contoso.sharepoint.com/sites/proj \
  --folder-url /sites/proj/Shared\ Documents
```

These commands handle authentication, document download, text extraction, and knowledge graph creation automatically.