# Ingestion API

All ingesters accept a `GraphStore` instance and return an `IngestionResult` dataclass.

```python
from navegador.graph import GraphStore
from navegador.ingest import RepoIngester, KnowledgeIngester, WikiIngester, PlanopticonIngester
```

---

## IngestionResult

```python
from dataclasses import dataclass


@dataclass
class IngestionResult:
    nodes_created: int
    nodes_updated: int
    edges_created: int
    files_processed: int
    errors: list[str]
    duration_seconds: float
```
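
A caller will usually want to check `errors` and report throughput after a run. The sketch below redeclares the dataclass locally so that it runs without navegador installed. The `summarize` helper is hypothetical and not part of the library.

```python
from dataclasses import dataclass


@dataclass
class IngestionResult:
    # Local stand-in mirroring the fields documented above.
    nodes_created: int
    nodes_updated: int
    edges_created: int
    files_processed: int
    errors: list[str]
    duration_seconds: float


def summarize(result: IngestionResult) -> str:
    """Build a one-line report (hypothetical helper, not a navegador API)."""
    status = "ok" if not result.errors else f"{len(result.errors)} error(s)"
    # Guard against division by zero for empty or very fast runs.
    if result.duration_seconds > 0:
        rate = result.files_processed / result.duration_seconds
    else:
        rate = 0.0
    return (
        f"{result.files_processed} files, {result.nodes_created} nodes created, "
        f"{result.nodes_updated} updated, {result.edges_created} edges "
        f"({rate:.1f} files/s, {status})"
    )


example = IngestionResult(
    nodes_created=120,
    nodes_updated=4,
    edges_created=310,
    files_processed=20,
    errors=["src/bad.py: parse error"],  # made-up error message
    duration_seconds=2.5,
)
print(summarize(example))
```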

---

## RepoIngester

Parses a source tree and writes code layer nodes and edges.

```python
class RepoIngester:
    def __init__(self, store: GraphStore) -> None: ...

    def ingest(
        self,
        path: str | Path,
        *,
        clear: bool = False,
        incremental: bool = False,
        redact: bool = False,
        monorepo: bool = False,
    ) -> IngestionResult: ...

    def ingest_file(
        self,
        path: str | Path,
        *,
        redact: bool = False,
    ) -> IngestionResult: ...
```

### Usage

```python
from navegador.graph import GraphStore
from navegador.ingest import RepoIngester

store = GraphStore.sqlite(".navegador/navegador.db")
ingester = RepoIngester(store)

# full repo ingest
result = ingester.ingest("./src")
print(f"{result.nodes_created} nodes, {result.edges_created} edges")

# incremental ingest: only reprocesses files whose content hash has changed
result = ingester.ingest("./src", incremental=True)

# incremental ingest of a single file
result = ingester.ingest_file("./src/auth/service.py")

# wipe + rebuild
result = ingester.ingest("./src", clear=True)

# redact sensitive content (strips tokens, passwords, keys from string literals)
result = ingester.ingest("./src", redact=True)

# monorepo: traverse workspace sub-packages
result = ingester.ingest("./monorepo", monorepo=True)
```
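
The incremental mode above depends on comparing content hashes. How navegador stores and compares those hashes is not documented in this section, so the following is only a minimal sketch of the general technique, assuming SHA-256 digests kept in a plain dict. `content_hash`, `changed_files`, and the `seen` mapping are hypothetical names.

```python
import hashlib
import tempfile
from pathlib import Path


def content_hash(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def changed_files(root: Path, seen: dict[str, str], pattern: str = "*.py") -> list[Path]:
    """Return files under root whose digest differs from the one in seen.

    seen maps a path string to the last known digest and is updated in place.
    """
    changed = []
    for path in sorted(root.rglob(pattern)):
        digest = content_hash(path)
        if seen.get(str(path)) != digest:
            changed.append(path)
            seen[str(path)] = digest
    return changed


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "a.py").write_text("x = 1\n")
    (root / "b.py").write_text("y = 2\n")

    seen: dict[str, str] = {}
    first = changed_files(root, seen)    # both files are new
    second = changed_files(root, seen)   # nothing changed since the first pass
    (root / "b.py").write_text("y = 3\n")
    third = changed_files(root, seen)    # only b.py has new content

print([p.name for p in first], [p.name for p in second], [p.name for p in third])
```

Hashing content rather than comparing modification times means that touching a file or checking out a fresh clone does not by itself trigger reprocessing.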

### Supported languages

| Language | File extensions | Parser | Extra required |
|---|---|---|---|
| Python | `.py` | tree-sitter-python | None (included) |
| TypeScript | `.ts`, `.tsx` | tree-sitter-typescript | None (included) |
| JavaScript | `.js`, `.jsx` | tree-sitter-javascript | None (included) |
| Go | `.go` | tree-sitter-go | None (included) |
| Rust | `.rs` | tree-sitter-rust | None (included) |
| Java | `.java` | tree-sitter-java | None (included) |
| Kotlin | `.kt`, `.kts` | tree-sitter-kotlin | `navegador[languages]` |
| C# | `.cs` | tree-sitter-c-sharp | `navegador[languages]` |
| PHP | `.php` | tree-sitter-php | `navegador[languages]` |
| Ruby | `.rb` | tree-sitter-ruby | `navegador[languages]` |
| Swift | `.swift` | tree-sitter-swift | `navegador[languages]` |
| C | `.c`, `.h` | tree-sitter-c | `navegador[languages]` |
| C++ | `.cpp`, `.cc`, `.cxx`, `.hpp` | tree-sitter-cpp | `navegador[languages]` |

### Adding a new language parser

1. Install the tree-sitter grammar: `pip install tree-sitter-<lang>`
2. Subclass `navegador.ingest.base.LanguageParser`:

```python
from navegador.ingest.base import LanguageParser, ParseResult

class RubyParser(LanguageParser):
    language = "ruby"
    extensions = [".rb"]

    def parse(self, source: str, file_path: str) -> ParseResult:
        # use self.tree_sitter_language to build the tree
        # return ParseResult with nodes and edges
        ...
```

3. Register in `navegador/ingest/registry.py`:

```python
from .ruby import RubyParser
PARSERS["ruby"] = RubyParser
```

`RepoIngester` dispatches to registered parsers by file extension.
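
Dispatch by extension can be pictured as a lookup from file suffix to parser class. The sketch below is self-contained and uses stand-in classes. The real `LanguageParser` base class and `PARSERS` registry may be structured differently, and `BY_EXTENSION` and `parser_for` are hypothetical names.

```python
from pathlib import Path
from typing import Optional


class LanguageParser:
    """Stand-in for navegador.ingest.base.LanguageParser."""
    language = ""
    extensions: list[str] = []


class PythonParser(LanguageParser):
    language = "python"
    extensions = [".py"]


class TypeScriptParser(LanguageParser):
    language = "typescript"
    extensions = [".ts", ".tsx"]


# Registry keyed by language name, as in the registration step above.
PARSERS: dict[str, type[LanguageParser]] = {
    "python": PythonParser,
    "typescript": TypeScriptParser,
}

# Invert the registry once so lookup by extension is a single dict access.
BY_EXTENSION: dict[str, type[LanguageParser]] = {
    ext: parser for parser in PARSERS.values() for ext in parser.extensions
}


def parser_for(path: str) -> Optional[type[LanguageParser]]:
    """Return the parser class registered for the file's suffix, or None."""
    return BY_EXTENSION.get(Path(path).suffix.lower())


print(parser_for("src/app.tsx").language)  # typescript
print(parser_for("README.md"))             # None
```

Returning `None` for an unregistered suffix lets the caller skip files it cannot parse instead of failing the whole run.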

### Framework enrichers

After parsing, `RepoIngester` runs framework-specific enrichers that annotate nodes with framework context. Enrichers are selected automatically based on the frameworks detected in the repo.

| Framework | What gets enriched |
|---|---|
| Django | Models, views, URL patterns, admin registrations |
| FastAPI | Route handlers, dependency injections, Pydantic schemas |
| React | Components, hooks, prop types |
| Express | Route handlers, middleware chains |
| React Native | Screens, navigators |
| Rails | Controllers, models, routes |
| Spring Boot | Beans, controllers, repositories |
| Laravel | Controllers, models, routes |

---

## KnowledgeIngester

Writes knowledge layer nodes. It is the programmatic equivalent of the `navegador add` commands.

```python
class KnowledgeIngester:
    def __init__(self, store: GraphStore) -> None: ...

    def add_concept(
        self,
        name: str,
        *,
        description: str = "",
        domain: str = "",
        status: str = "",
    ) -> str: ...  # returns node ID

    def add_rule(
        self,
        name: str,
        *,
        description: str = "",
        domain: str = "",
        severity: str = "info",
        rationale: str = "",
    ) -> str: ...

    def add_decision(
        self,
        name: str,
        *,
        description: str = "",
        domain: str = "",
        rationale: str = "",
        alternatives: str = "",
        date: str = "",
        status: str = "proposed",
    ) -> str: ...

    def add_person(
        self,
        name: str,
        *,
        email: str = "",
        role: str = "",
        team: str = "",
    ) -> str: ...

    def add_domain(
        self,
        name: str,
        *,
        description: str = "",
    ) -> str: ...

    def annotate(
        self,
        code_name: str,
        *,
        node_type: str = "Function",
        concept: str = "",
        rule: str = "",
    ) -> None: ...
```

### Usage

```python
from navegador.graph import GraphStore
from navegador.ingest import KnowledgeIngester

store = GraphStore.sqlite(".navegador/navegador.db")
ingester = KnowledgeIngester(store)

ingester.add_domain("Payments", description="Payment processing and billing")
ingester.add_concept("Idempotency", domain="Payments",
                     description="Operations safe to retry without side effects")
ingester.add_rule("RequireIdempotencyKey",
                  domain="Payments", severity="critical",
                  rationale="Card networks retry on timeout")
ingester.annotate("process_payment", node_type="Function",
                  concept="Idempotency", rule="RequireIdempotencyKey")
```

---

## WikiIngester

Fetches GitHub wiki pages and writes `WikiPage` nodes.

```python
class WikiIngester:
    def __init__(self, store: GraphStore) -> None: ...

    def ingest_repo(
        self,
        repo: str,
        *,
        token: str = "",
        use_api: bool = False,
    ) -> IngestionResult: ...

    def ingest_dir(
        self,
        path: str | Path,
    ) -> IngestionResult: ...
```

### Usage

```python
import os

from navegador.graph import GraphStore
from navegador.ingest import WikiIngester

store = GraphStore.sqlite(".navegador/navegador.db")
ingester = WikiIngester(store)

# from GitHub API
result = ingester.ingest_repo("myorg/myrepo", token=os.environ["GITHUB_TOKEN"])

# from local clone
result = ingester.ingest_dir("./myrepo.wiki")
```
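
GitHub exposes a wiki as a separate git repository (`<repo>.wiki.git`), so a local clone is a directory of Markdown files. How `ingest_dir` maps those files to `WikiPage` nodes is not specified in this section. The sketch below shows one plausible reading, assuming each `.md` file becomes a page and that hyphens in the file name stand for spaces in the title. `list_wiki_pages` and the record shape are hypothetical.

```python
import tempfile
from pathlib import Path


def list_wiki_pages(wiki_dir: Path) -> list[dict[str, str]]:
    """Return one record per Markdown page in a wiki clone (hypothetical shape)."""
    pages = []
    for path in sorted(wiki_dir.glob("*.md")):
        # Skip special files such as _Sidebar.md and _Footer.md.
        if path.name.startswith("_"):
            continue
        pages.append({
            "title": path.stem.replace("-", " "),
            "content": path.read_text(encoding="utf-8"),
        })
    return pages


with tempfile.TemporaryDirectory() as tmp:
    wiki = Path(tmp)
    (wiki / "Home.md").write_text("# Welcome\n", encoding="utf-8")
    (wiki / "Getting-Started.md").write_text("Install first.\n", encoding="utf-8")
    (wiki / "_Sidebar.md").write_text("* [[Home]]\n", encoding="utf-8")
    pages = list_wiki_pages(wiki)

print([p["title"] for p in pages])
```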

---

## PlanopticonIngester

Ingests Planopticon knowledge graph output into the knowledge layer.

```python
class PlanopticonIngester:
    def __init__(self, store: GraphStore) -> None: ...

    def ingest(
        self,
        path: str | Path,
        *,
        input_type: str = "auto",
        source: str = "",
    ) -> IngestionResult: ...

    def ingest_manifest(
        self,
        path: str | Path,
        *,
        source: str = "",
    ) -> IngestionResult: ...

    def ingest_kg(
        self,
        path: str | Path,
        *,
        source: str = "",
    ) -> IngestionResult: ...

    def ingest_interchange(
        self,
        path: str | Path,
        *,
        source: str = "",
    ) -> IngestionResult: ...

    def ingest_batch(
        self,
        path: str | Path,
        *,
        source: str = "",
    ) -> IngestionResult: ...
```

`input_type` values: `"auto"`, `"manifest"`, `"kg"`, `"interchange"`, `"batch"`.
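
Each explicit `input_type` value has a same-named `ingest_*` method, which suggests that `ingest` acts as a dispatcher. The sketch below illustrates that reading with a stand-in class. It is an assumption, not the library's implementation, and it leaves `"auto"` unresolved because the detection rule is not described in this section.

```python
class FakePlanopticonIngester:
    """Stand-in that reports which method was called; not the real class."""

    def ingest_manifest(self, path: str) -> str:
        return f"manifest:{path}"

    def ingest_kg(self, path: str) -> str:
        return f"kg:{path}"

    def ingest_interchange(self, path: str) -> str:
        return f"interchange:{path}"

    def ingest_batch(self, path: str) -> str:
        return f"batch:{path}"

    def ingest(self, path: str, *, input_type: str = "auto") -> str:
        explicit = ("manifest", "kg", "interchange", "batch")
        if input_type == "auto":
            # Format detection is not documented here, so the sketch stops short.
            raise NotImplementedError("auto-detection is outside this sketch")
        if input_type not in explicit:
            raise ValueError(f"unknown input_type: {input_type!r}")
        # Look up the same-named method, e.g. "kg" -> ingest_kg.
        return getattr(self, f"ingest_{input_type}")(path)


ingester = FakePlanopticonIngester()
# "graph.json" is a placeholder path used only for this example.
print(ingester.ingest("graph.json", input_type="kg"))
```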

See [Planopticon guide](../guide/planopticon.md) for format details and entity mapping.

---

## Export and import

Navegador can export the full graph (or a subset) to JSONL for backup, migration, or sharing. The JSONL format is one JSON object per line, where each object is either a node or an edge.

```bash
navegador export > graph.jsonl
navegador export --nodes-only > nodes.jsonl
navegador import graph.jsonl
```

Python API:

```python
from navegador.graph import GraphStore

store = GraphStore.sqlite(".navegador/navegador.db")

# export
with open("graph.jsonl", "w") as f:
    store.export_jsonl(f)

# import into a new store
new_store = GraphStore.sqlite(".navegador/new.db")
with open("graph.jsonl") as f:
    new_store.import_jsonl(f)
```
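
The one-object-per-line layout can be illustrated with the standard library alone. The field names in this sketch (`kind`, `id`, `label`, `name`, `source`, `target`, `type`) are assumptions chosen for illustration. The actual schema written by `export_jsonl` is not specified in this section.

```python
import io
import json

# Made-up records: two nodes and one edge connecting them.
records = [
    {"kind": "node", "id": "n1", "label": "Function", "name": "process_payment"},
    {"kind": "node", "id": "n2", "label": "Concept", "name": "Idempotency"},
    {"kind": "edge", "source": "n1", "target": "n2", "type": "ANNOTATED_WITH"},
]


def write_jsonl(items, stream) -> None:
    """Write one JSON object per line."""
    for item in items:
        stream.write(json.dumps(item) + "\n")


def read_jsonl(stream):
    """Yield one parsed object per non-empty line."""
    for line in stream:
        if line.strip():
            yield json.loads(line)


buffer = io.StringIO()
write_jsonl(records, buffer)
buffer.seek(0)

loaded = list(read_jsonl(buffer))
nodes = [r for r in loaded if r["kind"] == "node"]
edges = [r for r in loaded if r["kind"] == "edge"]
print(len(nodes), "nodes,", len(edges), "edges")
```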

---

## Schema migrations

When upgrading navegador, run `navegador migrate` before re-ingesting to apply schema changes (new node properties, new edge types, index updates):

```bash
navegador migrate
```

Migrations are idempotent, so they are safe to run multiple times. The migration state is stored in the graph itself under a `_MigrationState` node.

Python API:

```python
from navegador.graph import GraphStore, migrate

store = GraphStore.sqlite(".navegador/navegador.db")
migrate(store)  # applies any pending migrations
```
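
Idempotency of this kind is commonly achieved by recording which migrations have already run and skipping them on later invocations. The sketch below shows that pattern with an in-memory dict standing in for the graph and a `_MigrationState` entry holding the applied migration names. The migration names and functions are hypothetical, and navegador's actual bookkeeping may differ.

```python
from typing import Callable

Graph = dict  # in-memory stand-in for a graph store


def add_status_property(graph: Graph) -> None:
    graph.setdefault("node_properties", []).append("status")


def add_owns_edge_type(graph: Graph) -> None:
    graph.setdefault("edge_types", []).append("OWNS")


# Ordered list of (name, function); the names are made up for this example.
MIGRATIONS: list[tuple[str, Callable[[Graph], None]]] = [
    ("0001_add_status_property", add_status_property),
    ("0002_add_owns_edge_type", add_owns_edge_type),
]


def migrate(graph: Graph) -> list[str]:
    """Apply pending migrations in order and return the names that ran."""
    applied = graph.setdefault("_MigrationState", [])
    ran = []
    for name, func in MIGRATIONS:
        if name in applied:
            continue  # already applied on an earlier run
        func(graph)
        applied.append(name)
        ran.append(name)
    return ran


graph: Graph = {}
print(migrate(graph))  # first run applies both migrations
print(migrate(graph))  # second run finds nothing pending
```

Because each name is recorded only after its function returns, a run that fails partway can be retried and resumes at the first unapplied migration.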