MetaVision API Documentation

Introduction

MetaVision is a multimodal AI analysis engine with a built-in content optimization pipeline and optional audio transcription. All media passes through the pipeline before it is analyzed by vision-language model (VLM) agents, which improves input quality and reduces token usage.

MetaVision uses an asynchronous queue-based architecture. Submit, then poll for results.

Submit: POST your media (URL or base64) and categories to /api/v1/analyze. You receive a 202 Accepted with a job_id.

Optimize: Your media is automatically processed through the optimization pipeline (status: optimizing). Metadata, including codec info, face detection results, and thumbnails, is extracted.

Transcribe: If transcribe is set to true, the audio is transcribed, optionally with speaker diarization (status: transcribing).

Analyze: Optimized content (and transcript) is analyzed by VLM agents (status: analyzing).

Receive: When the job status is completed, the response includes the full analysis results, transcript, pipeline metadata, and asset URLs.

Authentication

All API requests require a Bearer token:

Authorization: Bearer YOUR_API_KEY

Submit Analysis

POST /api/v1/analyze

Media Source (one required)

Parameter	Type	Description
`media`	string	An HTTP(S) URL pointing to the media. Must be publicly accessible unless `auth` credentials are supplied.
`media_base64`	string	Raw base64-encoded media data. Max 2MB. Requires `media_type`.
`media_type`	string	Required. The type of media: `image`, `video`, or `audio`.
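
As a rough sketch of the base64 route, the helper below encodes raw bytes and builds the request body. The names `build_base64_payload` and `MAX_BASE64_BYTES` are hypothetical, and the code assumes the 2 MB limit is measured on the encoded string; the documentation does not say whether the limit applies before or after encoding.

```python
import base64

# Assumption: the documented 2 MB cap is measured on the encoded string.
MAX_BASE64_BYTES = 2 * 1024 * 1024
ALLOWED_MEDIA_TYPES = {"image", "video", "audio"}


def build_base64_payload(raw: bytes, media_type: str) -> dict:
    """Return a request body that uses media_base64, or raise ValueError."""
    if media_type not in ALLOWED_MEDIA_TYPES:
        raise ValueError(f"media_type must be one of {sorted(ALLOWED_MEDIA_TYPES)}")
    encoded = base64.b64encode(raw).decode("ascii")
    if len(encoded) > MAX_BASE64_BYTES:
        raise ValueError("encoded media exceeds 2 MB; submit a URL via media instead")
    return {"media_base64": encoded, "media_type": media_type}
```

Other fields such as `categories` or `detail` can be added to the returned dictionary before it is posted to /api/v1/analyze.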

Optional Parameters

Parameter	Type	Default	Description
`categories`	array[string]	(All)	List of Category IDs to execute.
`custom_category`	string	null	Custom analysis instruction (max 5000 chars).
`system_prompt`	string	null	Override system prompt. Use `{{CATEGORY_PROMPT}}` placeholder.
`model`	string	null	Force a specific Agent UUID.
`detail`	string	"balanced"	Quality profile ID controlling optimization and billing.
`transcribe`	boolean	false	Enable audio transcription (audio/video only). Billed as a flat additional 1× price_per_type.
`transcribe_diarize`	boolean	false	Enable speaker diarization in transcription.
`transcribe_align`	boolean	false	Enable word-level timestamp alignment in transcription.
`auth`	object	null	Credentials for fetching protected media URLs. Only valid with `media` (URL) and when the optimization pipeline is enabled. See Authenticated Sources.
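
Several of the constraints in these tables can be checked before a request is sent. The function below is a minimal client-side sketch; `validate_request` is a hypothetical helper and not part of the API, and the server remains the authority on what is accepted.

```python
def validate_request(payload: dict) -> list:
    """Collect problems that the documented rules would reject."""
    problems = []
    has_url = "media" in payload
    has_b64 = "media_base64" in payload
    if not (has_url or has_b64):
        problems.append("provide media (URL) or media_base64")
    # custom_category is documented with a 5000-character maximum.
    if len(payload.get("custom_category") or "") > 5000:
        problems.append("custom_category exceeds 5000 characters")
    # Transcription is documented as audio/video only.
    if payload.get("transcribe") and payload.get("media_type") == "image":
        problems.append("transcribe is only valid for audio or video")
    # auth is only valid together with a media URL.
    if payload.get("auth") is not None and (has_b64 or not has_url):
        problems.append("auth is only valid together with media (URL)")
    return problems
```

An empty list means none of these particular rules were violated; it does not guarantee that the server will accept the request.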

Submit Response (202 Accepted)

{
"job_id": "a1b2c3d4-...",
"status": "queued",
"created_at": "2024-03-20T10:00:00.000Z",
"poll_url": "/api/v1/jobs/a1b2c3d4-..."
}

Poll for Results

GET /api/v1/jobs/{job_id}

Response Statuses

Status	Description
`queued`	Job is waiting in the queue. Keep polling.
`optimizing`	Media is being processed by the optimization pipeline. Keep polling.
`transcribing`	Audio is being transcribed. Keep polling.
`analyzing`	Optimized content is being analyzed by VLM agents. Keep polling.
`completed`	Analysis complete. Response includes `meta`, `data`, `pipeline_info`, and `pipeline_assets`.
`failed`	Job failed. Response includes `error`.
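
The status table suggests a polling loop that stops on a terminal status. The sketch below adds a timeout so a stuck job does not poll forever. `wait_for_job` and its parameters are hypothetical names, and treating `cancelled` as terminal is an assumption based on the cancel endpoint's response rather than on this table.

```python
import time

# "cancelled" appears in the cancel response; assuming polling may return it too.
TERMINAL_STATUSES = {"completed", "failed", "cancelled"}
IN_PROGRESS_STATUSES = {"queued", "optimizing", "transcribing", "analyzing"}


def wait_for_job(fetch_status, timeout_s=300.0, interval_s=2.0, sleep=time.sleep):
    """Poll fetch_status() until a terminal status is seen or the timeout passes.

    fetch_status is any zero-argument callable returning the parsed job JSON.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        job = fetch_status()
        status = job.get("status")
        if status in TERMINAL_STATUSES:
            return job
        if status not in IN_PROGRESS_STATUSES:
            raise RuntimeError(f"unexpected job status: {status!r}")
        if time.monotonic() >= deadline:
            raise TimeoutError("job did not finish before the timeout")
        sleep(interval_s)
```

Passing the fetch step in as a callable keeps the loop independent of any particular HTTP client, for example `wait_for_job(lambda: requests.get(poll_url, headers=headers).json())`.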

Completed Response

{
"job_id": "a1b2c3d4-...",
"status": "completed",
"meta": {
  "request_id": "a1b2c3d4-...",
  "model_name": "general-v2",
  "execution_time": 8.45,
  "successful_categories": 2,
  "failed_categories": 0,
  "total_tokens_in": 1500,
  "total_tokens_out": 300,
  "estimated_cost": 0.0045
},
"data": {
  "title": { "result": "Sunset over a mountain range" },
  "custom_category": { "result": "The mood is serene." },
  "transcript": {
    "segments": [
      { "start": 0.0, "end": 2.5, "text": "Hello world", "speaker": "SPEAKER_01" }
    ],
    "detected_language": "en",
    "language_probability": 0.997
  }
},
"pipeline_info": {
  "metadata": {
    "streams": [
      { "codec_type": "video", "codec_name": "h264", "width": 1280, "height": 720 },
      { "codec_type": "audio", "codec_name": "aac", "sample_rate": "44100", "channels": 2 }
    ],
    "format": { "format_name": "mov,mp4", "duration": "219.286", "size": "82989735" }
  },
  "faces": {
    "faces": [
      { "id": 1, "file": "face_001.jpg", "appearances": 3, "first_seen_time": "00:00:04" }
    ]
  },
  "download_info": {
    "title": "Video Title",
    "description": "Video description...",
    "thumbnail": "https://example.com/thumb.jpg",
    "duration": 219.0,
    "webpage_url": "https://example.com/video"
  }
},
"pipeline_assets": {
  "thumbnail.jpg": "https://s3.../thumbnail.jpg",
  "waveform.png": "https://s3.../waveform.png",
  "faces/face_001.jpg": "https://s3.../faces/face_001.jpg",
  "thumbnails/0001.jpg": "https://s3.../thumbnails/0001.jpg"
}
}

Note

The transcript key only appears in data when transcribe: true was set. The pipeline_info object contains media metadata extracted by the optimization pipeline (codec info, face detection, source metadata). The pipeline_assets object contains URLs to generated assets like thumbnails, waveform, and face crops.

Account Balance

Retrieve your usage statistics grouped by time period.

GET /api/v1/account/balance

Optional Query Parameters

Parameter	Type	Description
`from`	string	Start date for custom range (YYYY-MM-DD)
`to`	string	End date for custom range (YYYY-MM-DD)

Response

{
"api_key_prefix": "e59e5c0d-...",
"description": "Production App",
"overall": {
  "total_requests": 1250,
  "total_success": 1200,
  "total_failed": 50,
  "total_billed": 12.50
},
"today": { ... },
"week": { ... },
"month": { ... },
"breakdown_by_type": [
  { "media_type": "image", "request_count": 800, "success_count": 790, "failed_count": 10, "total_billed": 4.00 },
  { "media_type": "video", "request_count": 450, "success_count": 410, "failed_count": 40, "total_billed": 8.50 }
],
"custom_range": null
}

When from and to are provided, the custom_range field contains the aggregated stats for that date range.
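
A small sketch of building the balance request path with a custom range. `balance_path` is a hypothetical helper. It assumes that from and to must be sent together, which the text implies but does not state outright.

```python
from datetime import date
from urllib.parse import urlencode


def balance_path(start=None, end=None):
    """Build the balance path, adding from/to only when both dates are given."""
    path = "/api/v1/account/balance"
    if start is None and end is None:
        return path
    if start is None or end is None:
        # Assumption: a half-open range is not supported.
        raise ValueError("from and to must be provided together")
    if start > end:
        raise ValueError("from must not be later than to")
    # date.isoformat() yields the documented YYYY-MM-DD format.
    query = urlencode({"from": start.isoformat(), "to": end.isoformat()})
    return f"{path}?{query}"
```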

Pricing

Retrieve your configured per-category prices and available quality profiles.

GET /api/v1/account/prices

Response

{
"price_per_image": 0.005,
"price_per_video": 0.01,
"price_per_audio": 0.008,
"quality_profiles": [
  { "id": "low", "name": "Low", "billing_multiplier": 1 },
  { "id": "balanced", "name": "Balanced", "billing_multiplier": 1 },
  { "id": "high", "name": "High", "billing_multiplier": 2 },
  { "id": "ultra", "name": "Ultra", "billing_multiplier": 4 }
],
"billing_note": "Each category is billed at price_per_type × quality_multiplier. Transcription adds a flat 1× price_per_type regardless of quality level."
}

Request History

List your past analysis requests with pagination and filters. You can also fetch full results or cancel queued jobs.

List Requests

GET /api/v1/account/requests

Query Parameters

Parameter	Type	Default	Description
`page`	integer	1	Page number
`limit`	integer	50	Results per page (max 100)
`status`	string	all	Filter: `success`, `partial`, `failed`
`media_type`	string	all	Filter: `image`, `video`, `audio`
`from`	string	-	Start date (YYYY-MM-DD)
`to`	string	-	End date (YYYY-MM-DD)

Response

{
"requests": [
  {
    "request_id": "a1b2c3d4-...",
    "status": "success",
    "media_type": "video",
    "categories": "[\"title\",\"description\"]",
    "quality_profile": "balanced",
    "billed_amount": 0.02,
    "execution_time": 8.4,
    "successful_categories": 2,
    "failed_categories": 0,
    "tokens_in": 1500,
    "tokens_out": 300,
    "agent_name": "general-v3",
    "created_at": "2024-03-20T10:00:00.000Z"
  }
],
"total": 156,
"page": 1,
"limit": 50
}
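
In the sample above, `categories` is a JSON-encoded string, while the request detail endpoint shows it as an array. The sketch below handles both shapes and derives the number of pages from `total` and `limit`. Both helper names are hypothetical, and the difference between the two shapes is inferred only from the sample responses.

```python
import json
import math


def parse_categories(value):
    """Return categories as a list, given either a list or a JSON-encoded string."""
    if isinstance(value, str):
        return json.loads(value)
    return list(value)


def page_count(total, limit):
    """Number of pages needed to list `total` requests at `limit` per page."""
    return math.ceil(total / limit) if total > 0 else 0
```

With the sample values (`total` 156, `limit` 50), the history spans four pages.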

Get Request Detail

GET /api/v1/account/requests/{request_id}

Returns the full analysis result, including all category data, pipeline info, and assets. The endpoint checks active jobs first (KV), then falls back to the log database (D1) for completed historical requests.

Response (Completed)

{
"request_id": "a1b2c3d4-...",
"status": "completed",
"media_type": "video",
"agent_name": "general-v3",
"categories": ["title", "description"],
"quality_profile": "balanced",
"created_at": "...",
"meta": {
  "execution_time": 8.45,
  "tokens_in": 1500,
  "tokens_out": 300,
  "billed_amount": 0.02,
  "successful_categories": 2,
  "failed_categories": 0
},
"data": {
  "title": { "result": "..." },
  "description": { "result": "..." }
},
"pipeline_info": { ... },
"pipeline_assets": {
  "thumbnail.jpg": "https://s3.../thumbnail.jpg",
  "waveform.png": "https://s3.../waveform.png"
}
}

Cancel a Queued Job

POST /api/v1/account/requests/{request_id}/cancel

Cancel a job that is still in queued status. Jobs that have already started processing cannot be cancelled.

Response

{
"success": true,
"job_id": "a1b2c3d4-...",
"status": "cancelled"
}

Authenticated Sources

When your media URL requires authentication (private YouTube content, paywalled videos, CDN-protected assets, S3 links that require custom headers, etc.), you can attach credentials via the optional auth object. These are forwarded to the optimization pipeline, which performs the actual download.

Important

The auth field is only valid when used together with media (URL). It is rejected if media_base64 is used. It also requires the optimization pipeline to be enabled on the server. Credentials are stored encrypted in memory with a short TTL and are never written to request logs or admin-visible records.

Structure

{
"media": "https://protected.example.com/video/12345",
"media_type": "video",
"auth": {
  "cookies_txt": "# Netscape HTTP Cookie File\n.example.com\tTRUE\t/\tTRUE\t0\tsession\tabc123\n...",
  "cookie_string": "session=abc123; token=xyz789",
  "headers": {
    "X-API-Key": "my-secret",
    "Accept-Language": "en-US",
    "Referer": "https://protected.example.com/"
  }
}
}

Fields

Field	Type	Max Size	Description
`auth.cookies_txt`	string	64 KB	Full contents of a Netscape-format `cookies.txt` file. Highest priority when multiple auth fields are set.
`auth.cookie_string`	string	8 KB	Raw `Cookie` header value (e.g. `session=abc; token=xyz`). Used only if `cookies_txt` is not provided.
`auth.headers`	object	20 entries, value ≤ 4 KB	Map of extra HTTP headers sent with the download request. Can override `User-Agent` and `Referer`. Header names must be valid RFC 7230 tokens; values must not contain CR, LF, or NUL characters.

Validation errors

Auth fields over the size limit are rejected with invalid_request.
Header names that do not match RFC 7230 token syntax are rejected.
Header or cookie values containing control characters (CR, LF, NUL) are rejected to prevent header injection.
Providing auth with media_base64 returns an error — base64 uploads do not require a fetch step.
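
The rules above can be approximated on the client before submitting. This is a sketch only: `validate_auth` is a hypothetical helper, sizes are assumed to be measured in UTF-8 bytes, and the control-character check is applied to `cookie_string` and header values but not to `cookies_txt`, since a Netscape cookie file necessarily contains line breaks.

```python
import re

# RFC 7230 token: one or more "tchar" characters.
TOKEN_RE = re.compile(r"[!#$%&'*+\-.^_`|~0-9A-Za-z]+")
FORBIDDEN = ("\r", "\n", "\x00")
SIZE_LIMITS = {"cookies_txt": 64 * 1024, "cookie_string": 8 * 1024}
MAX_HEADERS = 20
MAX_HEADER_VALUE = 4 * 1024


def validate_auth(auth: dict) -> list:
    """Collect violations of the documented auth limits (assumed UTF-8 byte sizes)."""
    problems = []
    for field, limit in SIZE_LIMITS.items():
        value = auth.get(field)
        if value is not None and len(value.encode("utf-8")) > limit:
            problems.append(f"{field} exceeds {limit} bytes")
    cookie_string = auth.get("cookie_string") or ""
    if any(ch in cookie_string for ch in FORBIDDEN):
        problems.append("cookie_string contains CR, LF, or NUL")
    headers = auth.get("headers") or {}
    if len(headers) > MAX_HEADERS:
        problems.append(f"headers has more than {MAX_HEADERS} entries")
    for name, value in headers.items():
        if not TOKEN_RE.fullmatch(name):
            problems.append(f"header name {name!r} is not a valid token")
        if len(value.encode("utf-8")) > MAX_HEADER_VALUE:
            problems.append(f"header {name!r} value exceeds {MAX_HEADER_VALUE} bytes")
        if any(ch in value for ch in FORBIDDEN):
            problems.append(f"header {name!r} value contains CR, LF, or NUL")
    return problems
```

`fullmatch` is used instead of a `$` anchor because `$` would also accept a header name that ends in a newline.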

Transcription

MetaVision includes an integrated speech transcription pipeline for audio and video content. When enabled, the transcript is automatically provided to VLM agents for richer context and is included in the response.

Features

Automatic language detection
Word-level timestamp alignment (optional)
Speaker diarization with speaker identification (optional)
Configurable voice activity detection

How it works

Set transcribe: true on your request. The pipeline runs after media optimization and before VLM analysis. The transcript text is included in the VLM prompt so the model has both the media and the spoken content.

Billing

Transcription is billed as a flat 1× price_per_type per request, regardless of the quality profile multiplier. For example, if you request 3 categories with a 2× quality profile plus transcription, billing = (3 × price × 2) + (1 × price).
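
The formula can be written as a short function. `estimate_cost` is a hypothetical helper, not an API call; it uses `Decimal` only to avoid floating-point rounding in the arithmetic.

```python
from decimal import Decimal


def estimate_cost(price_per_type, categories, multiplier, transcribe=False):
    """Apply the documented formula: categories x price x multiplier, plus 1 x price if transcribing."""
    price = Decimal(str(price_per_type))
    total = price * categories * multiplier
    if transcribe:
        # Transcription is flat: it is not scaled by the quality multiplier.
        total += price
    return total


# 3 categories, 2x quality profile, transcription on, at a price of 0.01:
# (3 x 0.01 x 2) + (1 x 0.01) = 0.07
print(estimate_cost(0.01, 3, 2, transcribe=True))
```

The actual price and multiplier for an account come from the /api/v1/account/prices response.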

Parameters

Parameter	Type	Default	Description
`transcribe`	boolean	false	Enable transcription. Only for audio and video.
`transcribe_diarize`	boolean	false	Assign speaker IDs to segments.
`transcribe_align`	boolean	false	Enable word-level timestamp alignment.

Transcript Response Format

{
"transcript": {
  "segments": [
    {
      "start": 0.0,
      "end": 2.5,
      "text": "Hello, how are you?",
      "speaker": "SPEAKER_01",
      "words": [
        {"word": "Hello,", "start": 0.1, "end": 0.5, "speaker": "SPEAKER_01"},
        {"word": "how", "start": 0.6, "end": 0.9, "speaker": "SPEAKER_01"}
      ]
    }
  ],
  "detected_language": "en",
  "language_probability": 0.997,
  "speakers": {
    "SPEAKER_01": {"name": "SPEAKER_01", "time": 2.5}
  }
}
}

Note

The words array is only present when transcribe_align: true. The speaker field and speakers object are only present when transcribe_diarize: true.
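
Because `speaker` and `words` are optional, code that reads a transcript should not assume they exist. The sketch below renders segments as plain text lines and copes with a missing speaker; `format_transcript` is a hypothetical helper and the line format is an arbitrary choice.

```python
def format_transcript(transcript: dict) -> list:
    """Render each segment as '[start-end] SPEAKER: text', omitting absent speakers."""
    lines = []
    for seg in transcript.get("segments", []):
        stamp = f"[{seg['start']:.1f}-{seg['end']:.1f}]"
        speaker = seg.get("speaker")  # only present with transcribe_diarize
        prefix = f"{stamp} {speaker}:" if speaker else stamp
        lines.append(f"{prefix} {seg['text']}")
    return lines
```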

Pipeline Info & Assets

When media is processed through the optimization pipeline, the completed response includes two additional fields:

`pipeline_info`

A JSON object containing rich metadata extracted during optimization. The structure varies by media type and source, but commonly includes:

metadata.streams[] — Video and audio stream details (codec, resolution, framerate, sample rate, channels, bitrate)
metadata.format — Container format info (format name, duration, file size, overall bitrate, tags/title/description)
faces — Face detection results including unique face IDs, appearance counts, and timestamps of first/last appearance
download_info — Source URL metadata (title, description, thumbnail, webpage URL, duration) when media was downloaded from a web URL

Note

Not all fields are present for every request. Image analysis won't have video streams or faces. Direct URL uploads may not have download_info. Always check for the existence of fields before accessing them.
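
One way to follow this advice is to read every level with a default, so that missing objects fall through to empty values. `summarize_pipeline_info` is a hypothetical helper, and the choice of which fields to summarize is illustrative.

```python
def summarize_pipeline_info(result: dict) -> dict:
    """Pull a few fields out of pipeline_info without assuming any of them exist."""
    info = result.get("pipeline_info") or {}
    streams = (info.get("metadata") or {}).get("streams") or []
    video = next((s for s in streams if s.get("codec_type") == "video"), None)
    faces = (info.get("faces") or {}).get("faces") or []
    download = info.get("download_info") or {}

    resolution = None
    if video and "width" in video and "height" in video:
        resolution = (video["width"], video["height"])

    return {
        "resolution": resolution,
        "face_count": len(faces),
        "source_title": download.get("title"),
    }
```

For an image job, or a response with no `pipeline_info` at all, this returns `None` for the resolution and title and `0` for the face count instead of raising `KeyError`.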

`pipeline_assets`

A map of filename → URL for generated assets. Common assets include:

Key Pattern	Description
`thumbnail.jpg`	Main thumbnail image
`thumbnails/0001.jpg` …	Scene thumbnails extracted from video
`waveform.png`	Audio waveform visualization
`faces/face_001.jpg` …	Cropped face images from face detection
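
Since asset keys follow these patterns, the map can be grouped by its path prefix. The sketch below assumes that only the `faces/` and `thumbnails/` prefixes listed above occur as directories; `group_assets` is a hypothetical helper.

```python
def group_assets(assets: dict) -> dict:
    """Group asset URLs into faces, thumbnails, and everything else, sorted by key."""
    groups = {"faces": [], "thumbnails": [], "other": []}
    for key in sorted(assets):
        prefix = key.split("/", 1)[0] if "/" in key else ""
        bucket = prefix if prefix in ("faces", "thumbnails") else "other"
        groups[bucket].append(assets[key])
    return groups
```

Top-level files such as `thumbnail.jpg` and `waveform.png` end up in the `other` group.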

Quality Profiles

The detail parameter selects a quality profile that controls how media is optimized and how billing is calculated.

Billing

Each category is billed at billing_multiplier × price_per_type, where price_per_type is the price for the media type (image, video, or audio). Transcription is always billed at a flat 1× price_per_type regardless of the quality profile.

Analysis Categories

Available Models

Error Handling

Status	Code	Description
400	`invalid_request`	Malformed JSON, missing media, invalid URL or base64, invalid model UUID, invalid quality profile, invalid auth fields, or transcription requested for image media.
400	`invalid_categories`	Requested categories do not exist or are disabled.
401	`invalid_api_key`	Missing or incorrect Authorization header.
403	`account_disabled`	API key disabled.
404	`not_found`	Job not found or expired.
413	`invalid_request`	Request body exceeds maximum size (5 MB).
429	`rate_limited`	Rate limit exceeded. Check `Retry-After` header.
500	`internal_error`	Server, pipeline, or transcription configuration error.
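
For 429 responses, a client can derive its wait time from the `Retry-After` header. The sketch below assumes the header carries a number of seconds and falls back to a default otherwise. `retry_delay`, the default, and the cap are hypothetical choices; the documentation does not state the header's format.

```python
def retry_delay(status_code, headers, default_s=2.0, max_s=60.0):
    """Seconds to wait before retrying, or None if the response is not a 429."""
    if status_code != 429:
        return None
    raw = headers.get("Retry-After")
    try:
        delay = float(raw)
    except (TypeError, ValueError):
        # Missing header or a non-numeric value (e.g. an HTTP-date): use the default.
        delay = default_s
    # Clamp to a sane range so a bad header cannot stall the client.
    return min(max(delay, 0.0), max_s)
```

A plain dictionary lookup is case-sensitive; the `headers` object on a `requests` response is case-insensitive and can be passed in directly.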

Code Examples

Python

import time

import requests

API_KEY = "your_api_key"
BASE = "https://metavision.vip.api.efficientstack.com/api/v1"
HEADERS = {"Authorization": f"Bearer {API_KEY}", "Content-Type": "application/json"}

# URL-based submission with transcription
payload = {
  "media": "https://example.com/video.mp4",
  "media_type": "video",
  "categories": ["title", "description"],
  "detail": "balanced",
  "transcribe": True,
  "transcribe_diarize": True,
  "transcribe_align": False
}

resp = requests.post(f"{BASE}/analyze", json=payload, headers=HEADERS)
resp.raise_for_status()  # surface 4xx/5xx errors before reading job_id
job = resp.json()
print(f"Job: {job['job_id']}")

# Poll every 2 seconds until the job completes or fails
while True:
    result = requests.get(f"{BASE}/jobs/{job['job_id']}", headers=HEADERS).json()
    print(f"Status: {result['status']}")
    if result["status"] == "completed":
        print("Results:", result["data"])
        if "transcript" in result["data"]:
            print("Transcript:", result["data"]["transcript"])
        if result.get("pipeline_info"):
            print("Pipeline Info:", result["pipeline_info"])
        if result.get("pipeline_assets"):
            print("Assets:", result["pipeline_assets"])
        break
    elif result["status"] == "failed":
        print("Error:", result.get("error"))
        break
    time.sleep(2)

# Protected source with cookies and custom headers
protected_payload = {
  "media": "https://protected.example.com/video/12345",
  "media_type": "video",
  "categories": ["description"],
  "auth": {
      "cookie_string": "session=abc123; auth_token=xyz",
      "headers": {
          "X-API-Key": "my-api-key",
          "Referer": "https://protected.example.com/"
      }
  }
}
requests.post(f"{BASE}/analyze", json=protected_payload, headers=HEADERS)

# Protected source with a cookies.txt file (Netscape format)
with open("cookies.txt") as f:
  cookies_content = f.read()
requests.post(f"{BASE}/analyze", json={
  "media": "https://youtube.com/watch?v=PRIVATE",
  "media_type": "video",
  "auth": {"cookies_txt": cookies_content}
}, headers=HEADERS)

# Check account balance
balance = requests.get(f"{BASE}/account/balance", headers=HEADERS).json()
print(f"Total billed: ${balance['overall']['total_billed']}")

# Check prices
prices = requests.get(f"{BASE}/account/prices", headers=HEADERS).json()
print(f"Image price: ${prices['price_per_image']}/category")

# Browse request history
history = requests.get(f"{BASE}/account/requests?limit=10", headers=HEADERS).json()
for req in history["requests"]:
  print(f"{req['request_id'][:12]}... {req['status']} {req['media_type']}")

# Get full result for a past request (includes pipeline_info and pipeline_assets)
detail = requests.get(f"{BASE}/account/requests/{job['job_id']}", headers=HEADERS).json()
print("Full result:", detail)

# Cancel a queued job (only succeeds while the job is still in queued status)
cancel = requests.post(f"{BASE}/account/requests/{job['job_id']}/cancel", headers=HEADERS).json()
print("Cancel:", cancel)

JavaScript

const API_KEY = 'your_api_key';
const BASE = 'https://metavision.vip.api.efficientstack.com/api/v1';
const HEADERS = {'Authorization': `Bearer ${API_KEY}`, 'Content-Type': 'application/json'};

async function analyzeMedia() {
const resp = await fetch(`${BASE}/analyze`, {
  method: 'POST', headers: HEADERS,
  body: JSON.stringify({
    media: 'https://example.com/video.mp4',
    media_type: 'video',
    categories: ['title', 'description'],
    detail: 'balanced',
    transcribe: true,
    transcribe_diarize: true
  })
});
const job = await resp.json();

while (true) {
  const result = await (await fetch(`${BASE}/jobs/${job.job_id}`, {headers: HEADERS})).json();
  if (result.status === 'completed') {
    console.log('Data:', result.data);
    console.log('Pipeline Info:', result.pipeline_info);
    console.log('Assets:', result.pipeline_assets);
    return result;
  }
  if (result.status === 'failed') throw new Error(result.error);
  await new Promise(r => setTimeout(r, 2000));
}
}

// Protected source with auth
async function analyzeProtected() {
const resp = await fetch(`${BASE}/analyze`, {
  method: 'POST', headers: HEADERS,
  body: JSON.stringify({
    media: 'https://protected.example.com/video/12345',
    media_type: 'video',
    auth: {
      cookie_string: 'session=abc; token=xyz',
      headers: {
        'X-API-Key': 'my-secret',
        'Referer': 'https://protected.example.com/'
      }
    }
  })
});
return resp.json();
}

// Account balance
async function getBalance() {
return (await fetch(`${BASE}/account/balance`, {headers: HEADERS})).json();
}

// Prices
async function getPrices() {
return (await fetch(`${BASE}/account/prices`, {headers: HEADERS})).json();
}

// Request history
async function getHistory(page = 1) {
return (await fetch(`${BASE}/account/requests?page=${page}`, {headers: HEADERS})).json();
}

// Request detail (includes pipeline_info and pipeline_assets)
async function getRequestDetail(id) {
return (await fetch(`${BASE}/account/requests/${id}`, {headers: HEADERS})).json();
}

// Cancel a queued job
async function cancelJob(id) {
return (await fetch(`${BASE}/account/requests/${id}/cancel`, {method:'POST', headers: HEADERS})).json();
}

analyzeMedia();

cURL

# Submit analysis
JOB=$(curl -s -X POST "https://metavision.vip.api.efficientstack.com/api/v1/analyze" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{"media":"https://example.com/video.mp4","media_type":"video","categories":["title"],"detail":"balanced","transcribe":true,"transcribe_diarize":true}')

# Quote variables so the JSON is passed to jq unmodified
JOB_ID=$(echo "$JOB" | jq -r '.job_id')
while true; do
  RESULT=$(curl -s "https://metavision.vip.api.efficientstack.com/api/v1/jobs/$JOB_ID" -H "Authorization: Bearer YOUR_API_KEY")
  STATUS=$(echo "$RESULT" | jq -r '.status')
  echo "Status: $STATUS"
  if [ "$STATUS" = "completed" ] || [ "$STATUS" = "failed" ]; then echo "$RESULT" | jq .; break; fi
  sleep 2
done

# Submit a protected URL with cookies and custom headers
curl -X POST "https://metavision.vip.api.efficientstack.com/api/v1/analyze" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
  "media": "https://protected.example.com/video/12345",
  "media_type": "video",
  "categories": ["description"],
  "auth": {
    "cookie_string": "session=abc; token=xyz",
    "headers": {
      "X-API-Key": "my-secret",
      "Referer": "https://protected.example.com/"
    }
  }
}'

# View pipeline info from completed result
echo "$RESULT" | jq '.pipeline_info'
echo "$RESULT" | jq '.pipeline_assets'

# Account balance
curl -s "https://metavision.vip.api.efficientstack.com/api/v1/account/balance" -H "Authorization: Bearer YOUR_API_KEY" | jq .

# Account balance with date range
curl -s "https://metavision.vip.api.efficientstack.com/api/v1/account/balance?from=2024-01-01&to=2024-03-31" -H "Authorization: Bearer YOUR_API_KEY" | jq .

# Prices
curl -s "https://metavision.vip.api.efficientstack.com/api/v1/account/prices" -H "Authorization: Bearer YOUR_API_KEY" | jq .

# Request history
curl -s "https://metavision.vip.api.efficientstack.com/api/v1/account/requests?page=1&limit=10" -H "Authorization: Bearer YOUR_API_KEY" | jq .

# Request detail (includes pipeline_info and pipeline_assets)
curl -s "https://metavision.vip.api.efficientstack.com/api/v1/account/requests/$JOB_ID" -H "Authorization: Bearer YOUR_API_KEY" | jq .

# Cancel a queued job (only succeeds while the job is still in queued status)
curl -s -X POST "https://metavision.vip.api.efficientstack.com/api/v1/account/requests/$JOB_ID/cancel" -H "Authorization: Bearer YOUR_API_KEY" | jq .