# Providers API Reference

::: video_processor.providers.base

::: video_processor.providers.manager

::: video_processor.providers.discovery

---

## Overview

The provider system abstracts LLM API calls behind a unified interface. It supports multiple providers (OpenAI, Anthropic, Gemini, Ollama, and OpenAI-compatible services), automatic model discovery, capability-based routing, and usage tracking.

**Key components:**

- **`BaseProvider`** -- abstract interface that all providers implement
- **`ProviderRegistry`** -- global registry mapping provider names to classes
- **`ProviderManager`** -- high-level router that picks the best provider for each task
- **`discover_available_models()`** -- scans all configured providers for available models

---

## BaseProvider (ABC)

```python
from video_processor.providers.base import BaseProvider
```

Abstract base class that all provider implementations must subclass. Defines the four core capabilities: chat, vision, audio transcription, and model listing.

**Class attribute:**

| Attribute | Type | Description |
|---|---|---|
| `provider_name` | `str` | Identifier for this provider (e.g., `"openai"`, `"anthropic"`) |

### chat()

```python
def chat(
    self,
    messages: list[dict],
    max_tokens: int = 4096,
    temperature: float = 0.7,
    model: Optional[str] = None,
) -> str
```

Send a chat completion request.

**Parameters:**

| Parameter | Type | Default | Description |
|---|---|---|---|
| `messages` | `list[dict]` | *required* | OpenAI-format message list (`role`, `content`) |
| `max_tokens` | `int` | `4096` | Maximum tokens in the response |
| `temperature` | `float` | `0.7` | Sampling temperature |
| `model` | `Optional[str]` | `None` | Override model ID |

**Returns:** `str` -- the assistant's text response.

### analyze_image()

```python
def analyze_image(
    self,
    image_bytes: bytes,
    prompt: str,
    max_tokens: int = 4096,
    model: Optional[str] = None,
) -> str
```

Analyze an image with a text prompt using a vision-capable model.

**Parameters:**

| Parameter | Type | Default | Description |
|---|---|---|---|
| `image_bytes` | `bytes` | *required* | Raw image data (JPEG, PNG, etc.) |
| `prompt` | `str` | *required* | Analysis instructions |
| `max_tokens` | `int` | `4096` | Maximum tokens in the response |
| `model` | `Optional[str]` | `None` | Override model ID |

**Returns:** `str` -- the assistant's analysis text.

### transcribe_audio()

```python
def transcribe_audio(
    self,
    audio_path: str | Path,
    language: Optional[str] = None,
    model: Optional[str] = None,
) -> dict
```

Transcribe an audio file.

**Parameters:**

| Parameter | Type | Default | Description |
|---|---|---|---|
| `audio_path` | `str \| Path` | *required* | Path to the audio file |
| `language` | `Optional[str]` | `None` | Language hint (ISO 639-1 code) |
| `model` | `Optional[str]` | `None` | Override model ID |

**Returns:** `dict` -- transcription result with keys `text`, `segments`, `duration`, etc.

### list_models()

```python
def list_models(self) -> list[ModelInfo]
```

Discover available models from this provider's API.

**Returns:** `list[ModelInfo]` -- available models with capability metadata.

---

## ModelInfo

```python
from video_processor.providers.base import ModelInfo
```

Pydantic model describing an available model from a provider.

| Field | Type | Default | Description |
|---|---|---|---|
| `id` | `str` | *required* | Model identifier (e.g., `"gpt-4o"`, `"claude-haiku-4-5-20251001"`) |
| `provider` | `str` | *required* | Provider name (e.g., `"openai"`, `"anthropic"`, `"gemini"`) |
| `display_name` | `str` | `""` | Human-readable display name |
| `capabilities` | `List[str]` | `[]` | Model capabilities: `"chat"`, `"vision"`, `"audio"`, `"embedding"` |

```json
{
  "id": "gpt-4o",
  "provider": "openai",
  "display_name": "GPT-4o",
  "capabilities": ["chat", "vision"]
}
```

---

## ProviderRegistry

```python
from video_processor.providers.base import ProviderRegistry
```

Class-level registry for provider classes. Providers register themselves with metadata on import. This registry is used internally by `ProviderManager` but can also be used directly for introspection.

### register()

```python
@classmethod
def register(
    cls,
    name: str,
    provider_class: type,
    env_var: str = "",
    model_prefixes: Optional[List[str]] = None,
    default_models: Optional[Dict[str, str]] = None,
) -> None
```

Register a provider class with its metadata. Called by each provider module at import time.

**Parameters:**

| Parameter | Type | Default | Description |
|---|---|---|---|
| `name` | `str` | *required* | Provider name (e.g., `"openai"`) |
| `provider_class` | `type` | *required* | The provider class |
| `env_var` | `str` | `""` | Environment variable for API key |
| `model_prefixes` | `Optional[List[str]]` | `None` | Model ID prefixes for auto-detection (e.g., `["gpt-", "o1-"]`) |
| `default_models` | `Optional[Dict[str, str]]` | `None` | Default models per capability (e.g., `{"chat": "gpt-4o", "vision": "gpt-4o"}`) |

### get()

```python
@classmethod
def get(cls, name: str) -> type
```

Return the provider class for a given name. Raises `ValueError` if the provider is not registered.

### get_by_model()

```python
@classmethod
def get_by_model(cls, model_id: str) -> Optional[str]
```

Return the provider name for a model ID based on prefix matching. Returns `None` if no match is found.
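The prefix-matching rule can be sketched as follows. This is an illustrative standalone sketch, not the library's code; the prefix table is an assumption modeled on the `model_prefixes` examples above:

```python
# Minimal sketch of the idea behind get_by_model() -- not the library's
# implementation. The table mirrors model_prefixes metadata from register().
_MODEL_PREFIXES = {
    "openai": ["gpt-", "o1-", "whisper-"],
    "anthropic": ["claude-"],
    "gemini": ["gemini-"],
}

def get_by_model(model_id: str):
    # Return the first provider whose registered prefix matches the model ID.
    for provider, prefixes in _MODEL_PREFIXES.items():
        if any(model_id.startswith(p) for p in prefixes):
            return provider
    return None  # no registered prefix matched

print(get_by_model("gpt-4o"))      # openai
print(get_by_model("mistral-7b"))  # None
```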

### get_default_models()

```python
@classmethod
def get_default_models(cls, name: str) -> Dict[str, str]
```

Return the default models dict for a provider, mapping capability names to model IDs.

### available()

```python
@classmethod
def available(cls) -> List[str]
```

Return names of providers whose required environment variable is set (or providers with no env var requirement, like Ollama).
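The availability rule can be sketched in a few lines. This is a standalone illustration of the behavior described above, not the library's code; the registry contents are assumptions:

```python
import os

# A provider counts as "available" when its API-key env var is set, or
# when it declares no env var at all (like Ollama).
_REGISTRY = {
    "openai": {"env_var": "OPENAI_API_KEY"},
    "anthropic": {"env_var": "ANTHROPIC_API_KEY"},
    "ollama": {"env_var": ""},  # no API key required
}

def available():
    return [
        name
        for name, meta in _REGISTRY.items()
        if not meta["env_var"] or os.environ.get(meta["env_var"])
    ]
```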

### all_registered()

```python
@classmethod
def all_registered(cls) -> Dict[str, Dict]
```

Return all registered providers and their metadata dictionaries.

---

## OpenAICompatibleProvider

```python
from video_processor.providers.base import OpenAICompatibleProvider
```

Base class for providers using OpenAI-compatible APIs (Together, Fireworks, Cerebras, xAI, Azure). Implements `chat()`, `analyze_image()`, and `list_models()` using the OpenAI client library. `transcribe_audio()` raises `NotImplementedError` by default.

**Constructor:**

```python
def __init__(self, api_key: Optional[str] = None, base_url: Optional[str] = None)
```

| Parameter | Type | Default | Description |
|---|---|---|---|
| `api_key` | `Optional[str]` | `None` | API key (falls back to `self.env_var` environment variable) |
| `base_url` | `Optional[str]` | `None` | API base URL (falls back to `self.base_url` class attribute) |

**Subclass attributes to override:**

| Attribute | Description |
|---|---|
| `provider_name` | Provider identifier string |
| `base_url` | Default API base URL |
| `env_var` | Environment variable name for the API key |

**Usage tracking:** After each `chat()` or `analyze_image()` call, the provider stores token counts in `self._last_usage` as `{"input_tokens": int, "output_tokens": int}`. This is consumed by `ProviderManager._track()`.
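The bookkeeping pattern looks roughly like this. It is a hypothetical sketch of the flow described above: only the `_last_usage` dict shape comes from the docs, and the `UsageTracker` class here (with its `add()` method) is illustrative, not the library's API:

```python
# After each call, the provider records token counts in a dict shaped like
# _last_usage; a tracker accumulates them across calls.
class UsageTracker:
    def __init__(self):
        self.input_tokens = 0
        self.output_tokens = 0

    def add(self, usage: dict):
        self.input_tokens += usage.get("input_tokens", 0)
        self.output_tokens += usage.get("output_tokens", 0)

tracker = UsageTracker()
tracker.add({"input_tokens": 120, "output_tokens": 40})  # after chat()
tracker.add({"input_tokens": 300, "output_tokens": 95})  # after analyze_image()
print(tracker.input_tokens, tracker.output_tokens)  # 420 135
```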

---

## ProviderManager

```python
from video_processor.providers.manager import ProviderManager
```

High-level router that selects the best available provider and model for each API call. Supports explicit model selection, forced provider, or automatic selection based on discovered capabilities.

### Constructor

```python
def __init__(
    self,
    vision_model: Optional[str] = None,
    chat_model: Optional[str] = None,
    transcription_model: Optional[str] = None,
    provider: Optional[str] = None,
    auto: bool = True,
)
```

| Parameter | Type | Default | Description |
|---|---|---|---|
| `vision_model` | `Optional[str]` | `None` | Override model for vision tasks (e.g., `"gpt-4o"`) |
| `chat_model` | `Optional[str]` | `None` | Override model for chat/LLM tasks |
| `transcription_model` | `Optional[str]` | `None` | Override model for transcription |
| `provider` | `Optional[str]` | `None` | Force all tasks to a single provider |
| `auto` | `bool` | `True` | If `True` and no model specified, pick the best available |

**Attributes:**

| Attribute | Type | Description |
|---|---|---|
| `usage` | `UsageTracker` | Tracks token counts and API costs across all calls |

### Auto-selection preferences

When `auto=True` and no explicit model is set, providers are tried in this order:

**Vision:** Gemini (`gemini-2.5-flash`) > OpenAI (`gpt-4o-mini`) > Anthropic (`claude-haiku-4-5-20251001`)

**Chat:** Anthropic (`claude-haiku-4-5-20251001`) > OpenAI (`gpt-4o-mini`) > Gemini (`gemini-2.5-flash`)

**Transcription:** OpenAI (`whisper-1`) > Gemini (`gemini-2.5-flash`)

If no API-key-based provider is available, Ollama is tried as a fallback.
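The selection logic amounts to a first-match walk over a preference list. A standalone sketch of that idea, using the vision order documented above (not the library's code):

```python
# Walk the preference list and return the first available provider;
# fall back to Ollama when no API-key provider is configured.
VISION_PREFS = [
    ("gemini", "gemini-2.5-flash"),
    ("openai", "gpt-4o-mini"),
    ("anthropic", "claude-haiku-4-5-20251001"),
]

def pick(prefs, available_providers):
    for provider, model in prefs:
        if provider in available_providers:
            return provider, model
    return ("ollama", None)  # last-resort fallback

print(pick(VISION_PREFS, {"openai", "anthropic"}))  # ('openai', 'gpt-4o-mini')
```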

### chat()

```python
def chat(
    self,
    messages: list[dict],
    max_tokens: int = 4096,
    temperature: float = 0.7,
) -> str
```

Send a chat completion to the best available provider. Automatically resolves which provider and model to use.

**Parameters:**

| Parameter | Type | Default | Description |
|---|---|---|---|
| `messages` | `list[dict]` | *required* | OpenAI-format messages |
| `max_tokens` | `int` | `4096` | Maximum response tokens |
| `temperature` | `float` | `0.7` | Sampling temperature |

**Returns:** `str` -- assistant response text.

**Raises:** `RuntimeError` if no provider is available for the `chat` capability.

### analyze_image()

```python
def analyze_image(
    self,
    image_bytes: bytes,
    prompt: str,
    max_tokens: int = 4096,
) -> str
```

Analyze an image using the best available vision provider.

**Returns:** `str` -- analysis text.

**Raises:** `RuntimeError` if no provider is available for the `vision` capability.

### transcribe_audio()

```python
def transcribe_audio(
    self,
    audio_path: str | Path,
    language: Optional[str] = None,
    speaker_hints: Optional[list[str]] = None,
) -> dict
```

Transcribe audio. Prefers local Whisper (no file size limits, no API costs) when available, falling back to API-based transcription.

**Parameters:**

| Parameter | Type | Default | Description |
|---|---|---|---|
| `audio_path` | `str \| Path` | *required* | Path to the audio file |
| `language` | `Optional[str]` | `None` | Language hint |
| `speaker_hints` | `Optional[list[str]]` | `None` | Speaker names for better recognition |

**Returns:** `dict` -- transcription result with `text`, `segments`, `duration`.

**Local Whisper:** If `transcription_model` is unset or starts with `"whisper-local"`, the manager tries local Whisper first. Use `"whisper-local:large"` to specify a model size.
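One way the `"whisper-local[:size]"` spec could be interpreted, sketched standalone. The function name and the `"medium"` default size are assumptions for illustration, not the library's API; only the `whisper-local` prefix rule comes from the docs:

```python
# Interpret a transcription_model spec per the rule above: unset or
# "whisper-local"-prefixed specs route to local Whisper, anything else
# routes to an API model.
def resolve_transcription(spec):
    if spec is None or spec.startswith("whisper-local"):
        _, _, size = (spec or "").partition(":")
        return ("local-whisper", size or "medium")  # default size is assumed
    return ("api", spec)

print(resolve_transcription("whisper-local:large"))  # ('local-whisper', 'large')
print(resolve_transcription("whisper-1"))            # ('api', 'whisper-1')
```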

### get_models_used()

```python
def get_models_used(self) -> dict[str, str]
```

Return a dict mapping capability to `"provider/model"` string for tracking purposes.

```python
pm = ProviderManager()
print(pm.get_models_used())
# {"vision": "gemini/gemini-2.5-flash", "chat": "anthropic/claude-haiku-4-5-20251001", ...}
```

### Usage examples

```python
from video_processor.providers.manager import ProviderManager

# Auto-select best providers
pm = ProviderManager()

# Force everything through one provider
pm = ProviderManager(provider="openai")

# Explicit model selection
pm = ProviderManager(
    vision_model="gpt-4o",
    chat_model="claude-haiku-4-5-20251001",
    transcription_model="whisper-local:large",
)

# Chat completion
response = pm.chat([
    {"role": "user", "content": "Summarize this meeting transcript..."}
])

# Image analysis
with open("diagram.png", "rb") as f:
    analysis = pm.analyze_image(f.read(), "Describe this architecture diagram")

# Transcription with speaker hints
result = pm.transcribe_audio(
    "meeting.mp3",
    language="en",
    speaker_hints=["Alice", "Bob", "Charlie"],
)

# Check usage
print(pm.usage.summary())
```

---

## discover_available_models()

```python
from video_processor.providers.discovery import discover_available_models
```

```python
def discover_available_models(
    api_keys: Optional[dict[str, str]] = None,
    force_refresh: bool = False,
) -> list[ModelInfo]
```

Discover available models from all configured providers. For each provider with a valid API key, calls `list_models()` and returns a unified, sorted list.

**Parameters:**

| Parameter | Type | Default | Description |
|---|---|---|---|
| `api_keys` | `Optional[dict[str, str]]` | `None` | Override API keys (defaults to environment variables) |
| `force_refresh` | `bool` | `False` | Force re-discovery, ignoring the session cache |

**Returns:** `list[ModelInfo]` -- all discovered models, sorted by provider then model ID.

**Caching:** Results are cached for the session. Use `force_refresh=True` or `clear_discovery_cache()` to refresh.

```python
from video_processor.providers.discovery import (
    discover_available_models,
    clear_discovery_cache,
)

# Discover models using environment variables
models = discover_available_models()
for m in models:
    print(f"{m.provider}/{m.id} - {m.capabilities}")

# Force refresh
models = discover_available_models(force_refresh=True)

# Override API keys
models = discover_available_models(api_keys={
    "openai": "sk-...",
    "anthropic": "sk-ant-...",
})

# Clear cache
clear_discovery_cache()
```

### clear_discovery_cache()

```python
def clear_discovery_cache() -> None
```

Clear the cached model list, forcing the next `discover_available_models()` call to re-query providers.

---

## Built-in Providers

The following providers are registered automatically when the provider system initializes:

| Provider | Environment Variable | Capabilities | Default Chat Model |
|---|---|---|---|
| `openai` | `OPENAI_API_KEY` | chat, vision, audio | `gpt-4o-mini` |
| `anthropic` | `ANTHROPIC_API_KEY` | chat, vision | `claude-haiku-4-5-20251001` |
| `gemini` | `GEMINI_API_KEY` | chat, vision, audio | `gemini-2.5-flash` |
| `ollama` | *(none -- checks server)* | chat, vision | *(depends on installed models)* |
| `together` | `TOGETHER_API_KEY` | chat | *(varies)* |
| `fireworks` | `FIREWORKS_API_KEY` | chat | *(varies)* |
| `cerebras` | `CEREBRAS_API_KEY` | chat | *(varies)* |
| `xai` | `XAI_API_KEY` | chat | *(varies)* |
| `azure` | `AZURE_OPENAI_API_KEY` | chat, vision | *(varies)* |