MetaVision API Documentation
Introduction
MetaVision is an advanced Multi-Modal AI analysis engine with a built-in content optimization pipeline and optional audio transcription. All media is automatically processed through the pipeline before VLM analysis, ensuring optimal quality and token efficiency.
MetaVision uses an asynchronous queue-based architecture. Submit, then poll for results.
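1. Submit your media with a POST request to /api/v1/analyze. You receive a 202 Accepted with a job_id.
2. The media is processed by the optimization pipeline (status: optimizing). Metadata including codec info, face detection, and thumbnails is extracted.
3. If transcribe: true, audio is transcribed with optional diarization (status: transcribing).
4. The optimized content is analyzed by VLM agents (status: analyzing).
5. When the status is completed, the response includes the full analysis results, transcript, pipeline metadata, and asset URLs.
Authentication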
All API requests require a Bearer token:
Authorization: Bearer YOUR_API_KEY
Submit Analysis
POST /api/v1/analyze
Media Source (one required)
| Parameter | Type | Description |
|---|---|---|
| media | string | A publicly accessible HTTP URL pointing to media. |
| media_base64 | string | Raw base64-encoded media data. Max 2MB. Requires media_type. |
| media_type | string | Required. The type of media: image, video, or audio. |
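As a minimal sketch of the inline option, the helper below encodes raw bytes and builds the media_base64 body. The function name build_base64_payload is hypothetical, and the check assumes the 2MB limit applies to the encoded string and means 2 × 1024 × 1024 characters; the source does not say which size is measured.

```python
import base64

# Assumption: "Max 2MB" is measured on the encoded string, as 2 MiB.
MAX_BASE64_CHARS = 2 * 1024 * 1024
VALID_MEDIA_TYPES = {"image", "video", "audio"}


def build_base64_payload(raw: bytes, media_type: str) -> dict:
    """Return a request body that carries the media inline as base64."""
    if media_type not in VALID_MEDIA_TYPES:
        raise ValueError(f"media_type must be one of {sorted(VALID_MEDIA_TYPES)}")
    encoded = base64.b64encode(raw).decode("ascii")
    if len(encoded) > MAX_BASE64_CHARS:
        raise ValueError("encoded media exceeds 2 MB; submit a URL via 'media' instead")
    return {"media_base64": encoded, "media_type": media_type}


# The 8-byte PNG signature stands in for real image data.
payload = build_base64_payload(b"\x89PNG\r\n\x1a\n", "image")
print(payload["media_type"], len(payload["media_base64"]))
```

Larger files are better submitted by URL through the media parameter.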
Optional Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| categories | array[string] | (All) | List of Category IDs to execute. |
| custom_category | string | null | Custom analysis instruction (max 5000 chars). |
| system_prompt | string | null | Override system prompt. Use {{CATEGORY_PROMPT}} placeholder. |
| model | string | null | Force a specific Agent UUID. |
| detail | string | "balanced" | Quality profile ID controlling optimization and billing. |
| transcribe | boolean | false | Enable audio transcription (audio/video only). Billed as flat +1× price. |
| transcribe_diarize | boolean | false | Enable speaker diarization in transcription. |
| transcribe_align | boolean | false | Enable word-level timestamp alignment in transcription. |
| auth | object | null | Credentials for fetching protected media URLs. Only valid with media (URL) and when the optimization pipeline is enabled. See Authenticated Sources. |
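The constraints in this table can be checked on the client before a request is sent. The sketch below is illustrative only: build_options is a hypothetical helper, and treating a system_prompt without the {{CATEGORY_PROMPT}} placeholder as an error is an assumption, since the table only says to use the placeholder.

```python
PLACEHOLDER = "{{CATEGORY_PROMPT}}"
MAX_CUSTOM_CATEGORY_CHARS = 5000


def build_options(media_type, categories=None, custom_category=None,
                  system_prompt=None, detail="balanced", transcribe=False):
    """Assemble the optional fields of an /api/v1/analyze request body."""
    if custom_category is not None and len(custom_category) > MAX_CUSTOM_CATEGORY_CHARS:
        raise ValueError("custom_category is limited to 5000 characters")
    # Assumption: an override prompt without the placeholder is a mistake.
    if system_prompt is not None and PLACEHOLDER not in system_prompt:
        raise ValueError(f"system_prompt should contain {PLACEHOLDER}")
    # Transcription is documented as audio/video only.
    if transcribe and media_type == "image":
        raise ValueError("transcribe is only available for audio and video")

    options = {"media_type": media_type, "detail": detail, "transcribe": transcribe}
    if categories is not None:
        options["categories"] = list(categories)
    if custom_category is not None:
        options["custom_category"] = custom_category
    if system_prompt is not None:
        options["system_prompt"] = system_prompt
    return options


options = build_options(
    "video",
    categories=["title"],
    system_prompt="Answer in one sentence. {{CATEGORY_PROMPT}}",
    transcribe=True,
)
print(options)
```

Merge the returned dictionary with a media or media_base64 field to form the full request body.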
Submit Response (202 Accepted)
{
  "job_id": "a1b2c3d4-...",
  "status": "queued",
  "created_at": "2024-03-20T10:00:00.000Z",
  "poll_url": "/api/v1/jobs/a1b2c3d4-..."
}
Poll for Results
GET /api/v1/jobs/{job_id}
Response Statuses
| Status | Description |
|---|---|
| queued | Job is waiting in the queue. Keep polling. |
| optimizing | Media is being processed by the optimization pipeline. Keep polling. |
| transcribing | Audio is being transcribed. Keep polling. |
| analyzing | Optimized content is being analyzed by VLM agents. Keep polling. |
| completed | Analysis complete. Response includes meta, data, pipeline_info, and pipeline_assets. |
| failed | Job failed. Response includes error. |
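One way to act on these statuses is a polling helper that stops on completed or failed and gives up after a deadline. This is a sketch rather than an official client: wait_for_job and its parameters are hypothetical, and fetch_status stands in for whatever function performs GET /api/v1/jobs/{job_id} and returns the parsed JSON.

```python
import time

# Only the two terminal statuses from the table above. If the jobs endpoint
# also reports "cancelled", it would need to be added here (assumption).
TERMINAL_STATUSES = {"completed", "failed"}


def wait_for_job(fetch_status, interval=2.0, timeout=600.0,
                 sleep=time.sleep, clock=time.monotonic):
    """Call fetch_status() until the job is terminal or the timeout expires."""
    deadline = clock() + timeout
    while True:
        job = fetch_status()
        if job["status"] in TERMINAL_STATUSES:
            return job
        if clock() >= deadline:
            raise TimeoutError(f"job still {job['status']!r} after {timeout} seconds")
        sleep(interval)


# Stand-in for the HTTP call: replays a fixed sequence of responses.
responses = iter([
    {"status": "queued"},
    {"status": "optimizing"},
    {"status": "analyzing"},
    {"status": "completed", "data": {}},
])
final = wait_for_job(lambda: next(responses), sleep=lambda _seconds: None)
print(final["status"])  # completed
```

Passing sleep and clock as parameters keeps the loop testable without real delays.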
Completed Response
{
  "job_id": "a1b2c3d4-...",
  "status": "completed",
  "meta": {
    "request_id": "a1b2c3d4-...",
    "model_name": "general-v2",
    "execution_time": 8.45,
    "successful_categories": 2,
    "failed_categories": 0,
    "total_tokens_in": 1500,
    "total_tokens_out": 300,
    "estimated_cost": 0.0045
  },
  "data": {
    "title": { "result": "Sunset over a mountain range" },
    "custom_category": { "result": "The mood is serene." },
    "transcript": {
      "segments": [
        { "start": 0.0, "end": 2.5, "text": "Hello world", "speaker": "SPEAKER_01" }
      ],
      "detected_language": "en",
      "language_probability": 0.997
    }
  },
  "pipeline_info": {
    "metadata": {
      "streams": [
        { "codec_type": "video", "codec_name": "h264", "width": 1280, "height": 720 },
        { "codec_type": "audio", "codec_name": "aac", "sample_rate": "44100", "channels": 2 }
      ],
      "format": { "format_name": "mov,mp4", "duration": "219.286", "size": "82989735" }
    },
    "faces": {
      "faces": [
        { "id": 1, "file": "face_001.jpg", "appearances": 3, "first_seen_time": "00:00:04" }
      ]
    },
    "download_info": {
      "title": "Video Title",
      "description": "Video description...",
      "thumbnail": "https://example.com/thumb.jpg",
      "duration": 219.0,
      "webpage_url": "https://example.com/video"
    }
  },
  "pipeline_assets": {
    "thumbnail.jpg": "https://s3.../thumbnail.jpg",
    "waveform.png": "https://s3.../waveform.png",
    "faces/face_001.jpg": "https://s3.../faces/face_001.jpg",
    "thumbnails/0001.jpg": "https://s3.../thumbnails/0001.jpg"
  }
}
The transcript key only appears in data when transcribe: true was set. The pipeline_info object contains media metadata extracted by the optimization pipeline (codec info, face detection, source metadata). The pipeline_assets object contains URLs to generated assets like thumbnails, waveform, and face crops.
Account Balance
Retrieve your usage statistics grouped by time period.
GET /api/v1/account/balance
Optional Query Parameters
| Parameter | Type | Description |
|---|---|---|
| from | string | Start date for custom range (YYYY-MM-DD) |
| to | string | End date for custom range (YYYY-MM-DD) |
Response
{
  "api_key_prefix": "e59e5c0d-...",
  "description": "Production App",
  "overall": {
    "total_requests": 1250,
    "total_success": 1200,
    "total_failed": 50,
    "total_billed": 12.50
  },
  "today": { ... },
  "week": { ... },
  "month": { ... },
  "breakdown_by_type": [
    { "media_type": "image", "request_count": 800, "success_count": 790, "failed_count": 10, "total_billed": 4.00 },
    { "media_type": "video", "request_count": 450, "success_count": 410, "failed_count": 40, "total_billed": 8.50 }
  ],
  "custom_range": null
}
When from and to are provided, the custom_range field contains the aggregated stats for that date range.
Pricing
Retrieve your configured per-category prices and available quality profiles.
GET /api/v1/account/prices
Response
{
  "price_per_image": 0.005,
  "price_per_video": 0.01,
  "price_per_audio": 0.008,
  "quality_profiles": [
    { "id": "low", "name": "Low", "billing_multiplier": 1 },
    { "id": "balanced", "name": "Balanced", "billing_multiplier": 1 },
    { "id": "high", "name": "High", "billing_multiplier": 2 },
    { "id": "ultra", "name": "Ultra", "billing_multiplier": 4 }
  ],
  "billing_note": "Each category is billed at price_per_type × quality_multiplier. Transcription adds a flat 1× price_per_type regardless of quality level."
}
Request History
List your past analysis requests with pagination and filters. You can also fetch full results or cancel queued jobs.
List Requests
GET /api/v1/account/requests
Query Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| page | integer | 1 | Page number |
| limit | integer | 50 | Results per page (max 100) |
| status | string | all | Filter: success, partial, failed |
| media_type | string | all | Filter: image, video, audio |
| from | string | - | Start date (YYYY-MM-DD) |
| to | string | - | End date (YYYY-MM-DD) |
Response
{
  "requests": [
    {
      "request_id": "a1b2c3d4-...",
      "status": "success",
      "media_type": "video",
      "categories": "[\"title\",\"description\"]",
      "quality_profile": "balanced",
      "billed_amount": 0.02,
      "execution_time": 8.4,
      "successful_categories": 2,
      "failed_categories": 0,
      "tokens_in": 1500,
      "tokens_out": 300,
      "agent_name": "general-v3",
      "created_at": "2024-03-20T10:00:00.000Z"
    }
  ],
  "total": 156,
  "page": 1,
  "limit": 50
}
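The total, page, and limit fields make it possible to walk the full history page by page. The sketch below assumes a caller-supplied fetch_page(page, limit) function that performs the GET request and returns the parsed body; iter_requests and fake_fetch are hypothetical names. It also decodes categories, which appears in the sample response as a JSON-encoded string, not an array.

```python
import json


def iter_requests(fetch_page, limit=50):
    """Yield every request record, fetching pages until 'total' is reached."""
    page, seen = 1, 0
    while True:
        body = fetch_page(page, limit)
        for record in body["requests"]:
            record = dict(record)  # copy so the caller's data is not mutated
            # In the sample response "categories" is a JSON string; decode it.
            if isinstance(record.get("categories"), str):
                record["categories"] = json.loads(record["categories"])
            yield record
        seen += len(body["requests"])
        if not body["requests"] or seen >= body["total"]:
            return
        page += 1


def fake_fetch(page, limit):
    """Stand-in for GET /api/v1/account/requests with three stored rows."""
    rows = [{"request_id": f"req-{i}", "categories": "[\"title\"]"} for i in range(3)]
    start = (page - 1) * limit
    return {"requests": rows[start:start + limit], "total": len(rows),
            "page": page, "limit": limit}


records = list(iter_requests(fake_fetch, limit=2))
print([r["request_id"] for r in records])
```

The empty-page check guards against looping forever if total changes while pages are being fetched.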
Get Request Detail
GET /api/v1/account/requests/{request_id}
Returns the full analysis result including all category data, pipeline info, and assets. Checks active jobs first (KV), then falls back to the log database (D1) for completed historical requests.
Response (Completed)
{
  "request_id": "a1b2c3d4-...",
  "status": "completed",
  "media_type": "video",
  "agent_name": "general-v3",
  "categories": ["title", "description"],
  "quality_profile": "balanced",
  "created_at": "...",
  "meta": {
    "execution_time": 8.45,
    "tokens_in": 1500,
    "tokens_out": 300,
    "billed_amount": 0.02,
    "successful_categories": 2,
    "failed_categories": 0
  },
  "data": {
    "title": { "result": "..." },
    "description": { "result": "..." }
  },
  "pipeline_info": { ... },
  "pipeline_assets": {
    "thumbnail.jpg": "https://s3.../thumbnail.jpg",
    "waveform.png": "https://s3.../waveform.png"
  }
}
Cancel a Queued Job
POST /api/v1/account/requests/{request_id}/cancel
Cancel a job that is still in queued status. Jobs that have already started processing cannot be cancelled.
Response
{
  "success": true,
  "job_id": "a1b2c3d4-...",
  "status": "cancelled"
}
Authenticated Sources
When your media URL requires authentication (private YouTube content, paywalled videos, CDN-protected assets, S3 links that require custom headers, etc.), you can attach credentials via the optional auth object. These credentials are forwarded to the optimization pipeline, which performs the actual download.
The auth field is only valid when used together with media (URL). It is rejected if media_base64 is used. It also requires the optimization pipeline to be enabled on the server. Credentials are stored encrypted in memory with a short TTL and are never written to request logs or admin-visible records.
Structure
{
  "media": "https://protected.example.com/video/12345",
  "media_type": "video",
  "auth": {
    "cookies_txt": "# Netscape HTTP Cookie File\n.example.com\tTRUE\t/\tTRUE\t0\tsession\tabc123\n...",
    "cookie_string": "session=abc123; token=xyz789",
    "headers": {
      "X-API-Key": "my-secret",
      "Accept-Language": "en-US",
      "Referer": "https://protected.example.com/"
    }
  }
}
Fields
| Field | Type | Max Size | Description |
|---|---|---|---|
| auth.cookies_txt | string | 64 KB | Full contents of a Netscape-format cookies.txt file. Highest priority when multiple auth fields are set. |
| auth.cookie_string | string | 8 KB | Raw Cookie header value (e.g. session=abc; token=xyz). Used only if cookies_txt is not provided. |
| auth.headers | object | 20 entries, value ≤ 4 KB | Map of extra HTTP headers sent with the download request. Can override User-Agent and Referer. Header names must be valid RFC 7230 tokens; values must not contain CR, LF, or NUL characters. |
Validation errors
- Auth fields over the size limit are rejected with invalid_request.
- Header names that do not match RFC 7230 token syntax are rejected.
- Header or cookie values containing control characters (CR, LF, NUL) are rejected to prevent header injection.
- Providing auth with media_base64 returns an error, because base64 uploads do not require a fetch step.
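These rules can be mirrored on the client to catch mistakes before submitting. The following is a sketch under stated assumptions: validate_auth is a hypothetical helper, sizes are measured in UTF-8 bytes with 1 KB taken as 1024 bytes, and cookies_txt is not scanned for line breaks because a cookies.txt file is multi-line by nature. The server remains the authority on what is accepted.

```python
import re

# RFC 7230 "token": one or more tchar characters.
TOKEN_RE = re.compile(r"[!#$%&'*+\-.^_`|~0-9A-Za-z]+")
FORBIDDEN = ("\r", "\n", "\x00")
KB = 1024  # assumption: documented limits are in units of 1024 bytes


def validate_auth(auth: dict) -> list:
    """Return a list of problems found in an auth object (empty means OK)."""
    problems = []
    cookies_txt = auth.get("cookies_txt")
    if cookies_txt is not None and len(cookies_txt.encode("utf-8")) > 64 * KB:
        problems.append("cookies_txt exceeds 64 KB")

    cookie_string = auth.get("cookie_string")
    if cookie_string is not None:
        if len(cookie_string.encode("utf-8")) > 8 * KB:
            problems.append("cookie_string exceeds 8 KB")
        if any(ch in cookie_string for ch in FORBIDDEN):
            problems.append("cookie_string contains CR, LF, or NUL")

    headers = auth.get("headers") or {}
    if len(headers) > 20:
        problems.append("headers has more than 20 entries")
    for name, value in headers.items():
        if not TOKEN_RE.fullmatch(name):
            problems.append(f"header name {name!r} is not a valid token")
        if len(value.encode("utf-8")) > 4 * KB:
            problems.append(f"header {name!r} value exceeds 4 KB")
        if any(ch in value for ch in FORBIDDEN):
            problems.append(f"header {name!r} value contains CR, LF, or NUL")
    return problems


print(validate_auth({"headers": {"X-API-Key": "my-secret"}}))
print(validate_auth({"headers": {"Bad Name": "a\r\nInjected: 1"}}))
```

The first call prints an empty list; the second reports both the invalid header name and the control characters in its value.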
Transcription
MetaVision includes an integrated speech transcription pipeline for audio and video content. When enabled, the transcript is automatically provided to VLM agents for richer context, and included in the response.
Features
- Automatic language detection
- Word-level timestamp alignment (optional)
- Speaker diarization with speaker identification (optional)
- Configurable voice activity detection
How it works
Set transcribe: true on your request. The pipeline runs after media optimization and before VLM analysis. The transcript text is included in the VLM prompt so the model has both the media and the spoken content.
Billing
Transcription is billed as a flat 1× price_per_type per request, regardless of the quality profile multiplier. For example, if you request 3 categories with a 2× quality profile plus transcription, billing = (3 × price × 2) + (1 × price).
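The formula above can be worked through in a few lines. This sketch uses Decimal to avoid floating-point rounding; estimate_cost is a hypothetical helper, and the inputs are the sample price_per_video (0.01) and the "high" multiplier (2) from the Pricing response.

```python
from decimal import Decimal


def estimate_cost(price_per_type, categories, billing_multiplier, transcribe=False):
    """Apply (categories x price x multiplier) + optional flat 1x transcription."""
    price = Decimal(str(price_per_type))
    total = price * categories * billing_multiplier
    if transcribe:
        total += price  # flat 1x, not scaled by the quality multiplier
    return total


# 3 categories at a 2x profile plus transcription: (3 * 0.01 * 2) + 0.01
print(estimate_cost(0.01, categories=3, billing_multiplier=2, transcribe=True))  # 0.07
```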
Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| transcribe | boolean | false | Enable transcription. Only for audio and video. |
| transcribe_diarize | boolean | false | Assign speaker IDs to segments. |
| transcribe_align | boolean | false | Enable word-level timestamp alignment. |
Transcript Response Format
{
  "transcript": {
    "segments": [
      {
        "start": 0.0,
        "end": 2.5,
        "text": "Hello, how are you?",
        "speaker": "SPEAKER_01",
        "words": [
          {"word": "Hello,", "start": 0.1, "end": 0.5, "speaker": "SPEAKER_01"},
          {"word": "how", "start": 0.6, "end": 0.9, "speaker": "SPEAKER_01"}
        ]
      }
    ],
    "detected_language": "en",
    "language_probability": 0.997,
    "speakers": {
      "SPEAKER_01": {"name": "SPEAKER_01", "time": 2.5}
    }
  }
}
The words array is only present when transcribe_align: true. The speaker field and speakers object are only present when transcribe_diarize: true.
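Because speaker and words are conditional, code that reads a transcript should treat both as optional. The sketch below renders segments as text lines and counts aligned words; format_transcript and count_aligned_words are hypothetical helpers, not part of the API.

```python
def format_transcript(transcript: dict) -> list:
    """Render each segment as '[start-end] SPEAKER: text', speaker optional."""
    lines = []
    for seg in transcript.get("segments", []):
        speaker = seg.get("speaker")  # only present with transcribe_diarize
        prefix = f"{speaker}: " if speaker else ""
        lines.append(f"[{seg['start']:.1f}-{seg['end']:.1f}] {prefix}{seg['text']}")
    return lines


def count_aligned_words(transcript: dict) -> int:
    """Count word entries; the result is 0 when transcribe_align was not set."""
    return sum(len(seg.get("words", [])) for seg in transcript.get("segments", []))


diarized = {"segments": [{"start": 0.0, "end": 2.5, "text": "Hello, how are you?",
                          "speaker": "SPEAKER_01"}]}
plain = {"segments": [{"start": 0.0, "end": 2.5, "text": "Hello world"}]}

print(format_transcript(diarized))  # ['[0.0-2.5] SPEAKER_01: Hello, how are you?']
print(format_transcript(plain))     # ['[0.0-2.5] Hello world']
print(count_aligned_words(plain))   # 0
```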
Pipeline Info & Assets
When media is processed through the optimization pipeline, the completed response includes two additional fields:
pipeline_info
A JSON object containing rich metadata extracted during optimization. The structure varies by media type and source, but commonly includes:
- metadata.streams[]: Video and audio stream details (codec, resolution, framerate, sample rate, channels, bitrate)
- metadata.format: Container format info (format name, duration, file size, overall bitrate, tags/title/description)
- faces: Face detection results including unique face IDs, appearance counts, and timestamps of first/last appearance
- download_info: Source URL metadata (title, description, thumbnail, webpage URL, duration) when media was downloaded from a web URL
Not all fields are present for every request. Image analysis won't have video streams or faces. Direct URL uploads may not have download_info. Always check for the existence of fields before accessing them.
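A defensive reader for pipeline_info might look like the sketch below, which returns None or 0 for anything that is missing. summarize_pipeline_info is a hypothetical helper; the field names are taken from the sample completed response, where format.duration is a string and is therefore converted with float().

```python
def summarize_pipeline_info(result: dict) -> dict:
    """Extract a few common fields, tolerating any missing level."""
    info = result.get("pipeline_info") or {}
    metadata = info.get("metadata") or {}
    streams = metadata.get("streams") or []
    video = next((s for s in streams if s.get("codec_type") == "video"), None)
    duration = (metadata.get("format") or {}).get("duration")
    has_size = video is not None and "width" in video and "height" in video
    return {
        "resolution": (video["width"], video["height"]) if has_size else None,
        "duration_seconds": float(duration) if duration is not None else None,
        "face_count": len((info.get("faces") or {}).get("faces") or []),
        "source_title": (info.get("download_info") or {}).get("title"),
    }


sample = {
    "pipeline_info": {
        "metadata": {
            "streams": [{"codec_type": "video", "width": 1280, "height": 720}],
            "format": {"duration": "219.286"},
        },
        "faces": {"faces": [{"id": 1}]},
    }
}
print(summarize_pipeline_info(sample))
print(summarize_pipeline_info({}))  # every field falls back to None or 0
```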
pipeline_assets
A map of filename → URL for generated assets. Common assets include:
| Key Pattern | Description |
|---|---|
| thumbnail.jpg | Main thumbnail image |
| thumbnails/0001.jpg … | Scene thumbnails extracted from video |
| waveform.png | Audio waveform visualization |
| faces/face_001.jpg … | Cropped face images from face detection |
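Since the keys follow folder-like patterns, assets can be grouped by prefix. The sketch below is one possible approach; group_assets is a hypothetical helper and the URLs are placeholders, not real asset locations.

```python
def group_assets(pipeline_assets: dict) -> dict:
    """Split the filename -> URL map into faces, thumbnails, and everything else."""
    groups = {"faces": {}, "thumbnails": {}, "other": {}}
    for key, url in (pipeline_assets or {}).items():
        folder, sep, _rest = key.partition("/")
        # Keys without a known folder prefix (thumbnail.jpg, waveform.png) go to "other".
        bucket = folder if sep and folder in groups else "other"
        groups[bucket][key] = url
    return groups


assets = {
    "thumbnail.jpg": "https://assets.example.com/thumbnail.jpg",
    "waveform.png": "https://assets.example.com/waveform.png",
    "faces/face_001.jpg": "https://assets.example.com/faces/face_001.jpg",
    "thumbnails/0001.jpg": "https://assets.example.com/thumbnails/0001.jpg",
}
grouped = group_assets(assets)
print(sorted(grouped["faces"]))       # ['faces/face_001.jpg']
print(sorted(grouped["thumbnails"]))  # ['thumbnails/0001.jpg']
print(sorted(grouped["other"]))       # ['thumbnail.jpg', 'waveform.png']
```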
Quality Profiles
The detail parameter selects a quality profile that controls how media is optimized and how billing is calculated.
Each category is billed at billing_multiplier × the per-category price for the media type (price_per_image, price_per_video, or price_per_audio). Transcription is always billed at a flat 1× price regardless of the quality profile.
Analysis Categories
Available Models
Error Handling
| Status | Code | Description |
|---|---|---|
| 400 | invalid_request | Malformed JSON, missing media, invalid URL or base64, invalid model UUID, invalid quality profile, invalid auth fields, or transcription requested for image media. |
| 400 | invalid_categories | Requested categories do not exist or are disabled. |
| 401 | invalid_api_key | Missing or incorrect Authorization header. |
| 403 | account_disabled | API key disabled. |
| 404 | not_found | Job not found or expired. |
| 413 | invalid_request | Request body exceeds maximum size (5 MB). |
| 429 | rate_limited | Rate limit exceeded. Check Retry-After header. |
| 500 | internal_error | Server, pipeline, or transcription configuration error. |
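A client can use the status code to decide whether a retry makes sense. The sketch below honors Retry-After on 429 and otherwise applies exponential backoff. retry_delay is a hypothetical helper, and treating 500 as retryable is an assumption, since the table notes that a 500 may also indicate a configuration error that retrying will not fix.

```python
def retry_delay(status_code, headers, attempt, base=1.0, cap=60.0):
    """Return seconds to wait before retrying, or None if retrying is pointless."""
    if status_code == 429:
        # Only the delta-seconds form of Retry-After is handled here; an
        # HTTP-date value falls through to exponential backoff.
        retry_after = headers.get("Retry-After")
        if retry_after is not None and retry_after.isdigit():
            return float(retry_after)
    if status_code in (429, 500):  # assumption: 500 may be transient
        return min(cap, base * 2 ** attempt)
    return None  # 400, 401, 403, 404, 413: fix the request instead of retrying


print(retry_delay(429, {"Retry-After": "7"}, attempt=0))  # 7.0
print(retry_delay(429, {}, attempt=3))                    # 8.0
print(retry_delay(401, {}, attempt=0))                    # None
```

A plain dict is case-sensitive, so the header name must match exactly; the headers object returned by the requests library is case-insensitive and can be passed in directly.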
Code Examples
Python
import time

import requests

API_KEY = "your_api_key"
BASE = "https://metavision.vip.api.efficientstack.com/api/v1"
HEADERS = {"Authorization": f"Bearer {API_KEY}", "Content-Type": "application/json"}

# URL-based submission with transcription
payload = {
    "media": "https://example.com/video.mp4",
    "media_type": "video",
    "categories": ["title", "description"],
    "detail": "balanced",
    "transcribe": True,
    "transcribe_diarize": True,
    "transcribe_align": False
}
resp = requests.post(f"{BASE}/analyze", json=payload, headers=HEADERS)
resp.raise_for_status()  # stop on 4xx/5xx instead of failing later on a missing job_id
job = resp.json()
print(f"Job: {job['job_id']}")

# Poll every 2 seconds until the job completes or fails
while True:
    result = requests.get(f"{BASE}/jobs/{job['job_id']}", headers=HEADERS).json()
    print(f"Status: {result['status']}")
    if result["status"] == "completed":
        print("Results:", result["data"])
        # transcript is only present when transcribe was enabled
        if "transcript" in result["data"]:
            print("Transcript:", result["data"]["transcript"])
        if result.get("pipeline_info"):
            print("Pipeline Info:", result["pipeline_info"])
        if result.get("pipeline_assets"):
            print("Assets:", result["pipeline_assets"])
        break
    elif result["status"] == "failed":
        print("Error:", result.get("error"))
        break
    time.sleep(2)

# Protected source with cookies and custom headers
protected_payload = {
    "media": "https://protected.example.com/video/12345",
    "media_type": "video",
    "categories": ["description"],
    "auth": {
        "cookie_string": "session=abc123; auth_token=xyz",
        "headers": {
            "X-API-Key": "my-api-key",
            "Referer": "https://protected.example.com/"
        }
    }
}
requests.post(f"{BASE}/analyze", json=protected_payload, headers=HEADERS)

# Protected source with a cookies.txt file (Netscape format)
with open("cookies.txt") as f:
    cookies_content = f.read()
requests.post(f"{BASE}/analyze", json={
    "media": "https://youtube.com/watch?v=PRIVATE",
    "media_type": "video",
    "auth": {"cookies_txt": cookies_content}
}, headers=HEADERS)

# Check account balance
balance = requests.get(f"{BASE}/account/balance", headers=HEADERS).json()
print(f"Total billed: ${balance['overall']['total_billed']}")

# Check prices
prices = requests.get(f"{BASE}/account/prices", headers=HEADERS).json()
print(f"Image price: ${prices['price_per_image']}/category")

# Browse request history
history = requests.get(f"{BASE}/account/requests?limit=10", headers=HEADERS).json()
for req in history["requests"]:
    print(f"{req['request_id'][:12]}... {req['status']} {req['media_type']}")

# Get full result for a past request (includes pipeline_info and pipeline_assets)
detail = requests.get(f"{BASE}/account/requests/{job['job_id']}", headers=HEADERS).json()
print("Full result:", detail)

# Cancel a queued job (only succeeds while the job is still in queued status)
cancel = requests.post(f"{BASE}/account/requests/{job['job_id']}/cancel", headers=HEADERS).json()
print("Cancel:", cancel)
JavaScript
const API_KEY = 'your_api_key';
const BASE = 'https://metavision.vip.api.efficientstack.com/api/v1';
const HEADERS = {'Authorization': `Bearer ${API_KEY}`, 'Content-Type': 'application/json'};

async function analyzeMedia() {
  const resp = await fetch(`${BASE}/analyze`, {
    method: 'POST', headers: HEADERS,
    body: JSON.stringify({
      media: 'https://example.com/video.mp4',
      media_type: 'video',
      categories: ['title', 'description'],
      detail: 'balanced',
      transcribe: true,
      transcribe_diarize: true
    })
  });
  if (!resp.ok) throw new Error(`Submit failed with HTTP ${resp.status}`);
  const job = await resp.json();

  // Poll every 2 seconds until the job completes or fails
  while (true) {
    const result = await (await fetch(`${BASE}/jobs/${job.job_id}`, {headers: HEADERS})).json();
    if (result.status === 'completed') {
      console.log('Data:', result.data);
      console.log('Pipeline Info:', result.pipeline_info);
      console.log('Assets:', result.pipeline_assets);
      return result;
    }
    // Serialize the error so an object payload is not printed as [object Object]
    if (result.status === 'failed') throw new Error(JSON.stringify(result.error));
    await new Promise(r => setTimeout(r, 2000));
  }
}

// Protected source with auth
async function analyzeProtected() {
  const resp = await fetch(`${BASE}/analyze`, {
    method: 'POST', headers: HEADERS,
    body: JSON.stringify({
      media: 'https://protected.example.com/video/12345',
      media_type: 'video',
      auth: {
        cookie_string: 'session=abc; token=xyz',
        headers: {
          'X-API-Key': 'my-secret',
          'Referer': 'https://protected.example.com/'
        }
      }
    })
  });
  return resp.json();
}

// Account balance
async function getBalance() {
  return (await fetch(`${BASE}/account/balance`, {headers: HEADERS})).json();
}

// Prices
async function getPrices() {
  return (await fetch(`${BASE}/account/prices`, {headers: HEADERS})).json();
}

// Request history
async function getHistory(page = 1) {
  return (await fetch(`${BASE}/account/requests?page=${page}`, {headers: HEADERS})).json();
}

// Request detail (includes pipeline_info and pipeline_assets)
async function getRequestDetail(id) {
  return (await fetch(`${BASE}/account/requests/${id}`, {headers: HEADERS})).json();
}

// Cancel a queued job
async function cancelJob(id) {
  return (await fetch(`${BASE}/account/requests/${id}/cancel`, {method: 'POST', headers: HEADERS})).json();
}

analyzeMedia().catch(console.error);
cURL
# Submit analysis
JOB=$(curl -s -X POST "https://metavision.vip.api.efficientstack.com/api/v1/analyze" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"media":"https://example.com/video.mp4","media_type":"video","categories":["title"],"detail":"balanced","transcribe":true,"transcribe_diarize":true}')
JOB_ID=$(echo "$JOB" | jq -r '.job_id')

# Poll every 2 seconds until the job completes or fails
while true; do
  RESULT=$(curl -s "https://metavision.vip.api.efficientstack.com/api/v1/jobs/$JOB_ID" -H "Authorization: Bearer YOUR_API_KEY")
  STATUS=$(echo "$RESULT" | jq -r '.status')
  echo "Status: $STATUS"
  if [ "$STATUS" = "completed" ] || [ "$STATUS" = "failed" ]; then echo "$RESULT" | jq .; break; fi
  sleep 2
done

# Submit a protected URL with cookies and custom headers
curl -X POST "https://metavision.vip.api.efficientstack.com/api/v1/analyze" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "media": "https://protected.example.com/video/12345",
    "media_type": "video",
    "categories": ["description"],
    "auth": {
      "cookie_string": "session=abc; token=xyz",
      "headers": {
        "X-API-Key": "my-secret",
        "Referer": "https://protected.example.com/"
      }
    }
  }'

# View pipeline info from completed result
echo "$RESULT" | jq '.pipeline_info'
echo "$RESULT" | jq '.pipeline_assets'

# Account balance
curl -s "https://metavision.vip.api.efficientstack.com/api/v1/account/balance" -H "Authorization: Bearer YOUR_API_KEY" | jq .

# Account balance with date range
curl -s "https://metavision.vip.api.efficientstack.com/api/v1/account/balance?from=2024-01-01&to=2024-03-31" -H "Authorization: Bearer YOUR_API_KEY" | jq .

# Prices
curl -s "https://metavision.vip.api.efficientstack.com/api/v1/account/prices" -H "Authorization: Bearer YOUR_API_KEY" | jq .

# Request history
curl -s "https://metavision.vip.api.efficientstack.com/api/v1/account/requests?page=1&limit=10" -H "Authorization: Bearer YOUR_API_KEY" | jq .

# Request detail (includes pipeline_info and pipeline_assets)
curl -s "https://metavision.vip.api.efficientstack.com/api/v1/account/requests/$JOB_ID" -H "Authorization: Bearer YOUR_API_KEY" | jq .

# Cancel a queued job
curl -s -X POST "https://metavision.vip.api.efficientstack.com/api/v1/account/requests/$JOB_ID/cancel" -H "Authorization: Bearer YOUR_API_KEY" | jq .