MetaVision API Documentation

Introduction

MetaVision is a multimodal AI analysis engine with a built-in content optimization pipeline and optional audio transcription. All media is automatically processed through the pipeline before vision-language model (VLM) analysis to improve quality and token efficiency.

MetaVision uses an asynchronous queue-based architecture. Submit, then poll for results.

1. Submit: POST your media (URL or base64) and categories to /api/v1/analyze. You receive a 202 Accepted with a job_id.
2. Optimize: Your media is automatically processed through the optimization pipeline (status: optimizing). Metadata such as codec info, detected faces, and thumbnails is extracted.
3. Transcribe: If transcribe: true, audio is transcribed with optional diarization (status: transcribing).
4. Analyze: Optimized content (and the transcript, if any) is analyzed by VLM agents (status: analyzing).
5. Receive: When the job status is completed, the response includes the full analysis results, transcript, pipeline metadata, and asset URLs.

Authentication

All API requests require a Bearer token:

Authorization: Bearer YOUR_API_KEY
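
A minimal sketch of building this header once and reusing it for every request. The helper name build_headers is hypothetical; the Content-Type value follows the code examples later in this document.

```python
def build_headers(api_key: str) -> dict:
    """Return request headers for the API (helper name is illustrative)."""
    # Reject empty keys and keys with stray whitespace, a common copy-paste mistake.
    if not api_key or api_key != api_key.strip():
        raise ValueError("API key must be non-empty with no surrounding whitespace")
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
```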

Submit Analysis

POST /api/v1/analyze

Media Source (one required)

Parameter | Type | Description
media | string | A publicly accessible HTTP URL pointing to media.
media_base64 | string | Raw base64-encoded media data. Max 2MB. Requires media_type.
media_type | string | Required. The type of media: image, video, or audio.
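
As a rough sketch, a base64 submission body could be prepared as follows. The function name is hypothetical, and the documentation does not say whether the 2MB limit applies to the raw or the encoded data; this sketch assumes the encoded string and 2 MB = 2 × 1024 × 1024 bytes.

```python
import base64

# Assumption: the "Max 2MB" limit is measured on the encoded string.
MAX_BASE64_BYTES = 2 * 1024 * 1024
MEDIA_TYPES = {"image", "video", "audio"}


def build_base64_payload(raw: bytes, media_type: str) -> dict:
    """Encode raw media bytes and pair them with the required media_type."""
    if media_type not in MEDIA_TYPES:
        raise ValueError(f"media_type must be one of {sorted(MEDIA_TYPES)}")
    encoded = base64.b64encode(raw).decode("ascii")
    if len(encoded) > MAX_BASE64_BYTES:
        raise ValueError("encoded media exceeds 2 MB; submit a URL via media instead")
    return {"media_base64": encoded, "media_type": media_type}
```

Larger files would need to be hosted and submitted through the media URL parameter instead.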

Optional Parameters

Parameter | Type | Default | Description
categories | array[string] | (All) | List of Category IDs to execute.
custom_category | string | null | Custom analysis instruction (max 5000 chars).
system_prompt | string | null | Override system prompt. Use {{CATEGORY_PROMPT}} placeholder.
model | string | null | Force a specific Agent UUID.
detail | string | "balanced" | Quality profile ID controlling optimization and billing.
transcribe | boolean | false | Enable audio transcription (audio/video only). Billed as flat +1× price.
transcribe_diarize | boolean | false | Enable speaker diarization in transcription.
transcribe_align | boolean | false | Enable word-level timestamp alignment in transcription.
auth | object | null | Credentials for fetching protected media URLs. Only valid with media (URL) and when the optimization pipeline is enabled. See Authenticated Sources.
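
Some of these constraints can be checked on the client before submitting. The following is a sketch only: the function name is hypothetical, and it covers just the rules stated in this document (the server remains the authority on validation).

```python
def check_optional_params(payload: dict) -> list:
    """Return a list of human-readable problems found in a request payload."""
    problems = []
    custom = payload.get("custom_category")
    if custom is not None and len(custom) > 5000:
        problems.append("custom_category exceeds 5000 characters")
    # Transcription is documented as audio/video only.
    if payload.get("transcribe") and payload.get("media_type") == "image":
        problems.append("transcribe is only available for audio and video")
    # auth is documented as valid only together with a media URL.
    if payload.get("auth") is not None and "media" not in payload:
        problems.append("auth is only valid together with media (URL)")
    return problems
```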

Submit Response (202 Accepted)

{
"job_id": "a1b2c3d4-...",
"status": "queued",
"created_at": "2024-03-20T10:00:00.000Z",
"poll_url": "/api/v1/jobs/a1b2c3d4-..."
}
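
The poll_url in this response is shown as a root-relative path. A small sketch of turning it into an absolute URL, assuming the host used in the code examples later in this document (the helper name is hypothetical):

```python
from urllib.parse import urljoin

# Host taken from the code examples in this document.
API_ROOT = "https://metavision.vip.api.efficientstack.com"


def resolve_poll_url(submit_response: dict, api_root: str = API_ROOT) -> str:
    """Join the root-relative poll_url onto the API host."""
    return urljoin(api_root, submit_response["poll_url"])
```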

Poll for Results

GET /api/v1/jobs/{job_id}

Response Statuses

Status | Description
queued | Job is waiting in the queue. Keep polling.
optimizing | Media is being processed by the optimization pipeline. Keep polling.
transcribing | Audio is being transcribed. Keep polling.
analyzing | Optimized content is being analyzed by VLM agents. Keep polling.
completed | Analysis complete. Response includes meta, data, pipeline_info, and pipeline_assets.
failed | Job failed. Response includes error.
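
The status table can be turned into a small decision helper for a polling loop. This is a sketch: the function names are hypothetical, and the capped exponential delay is a choice made here, not something the API prescribes (the code examples in this document simply sleep for 2 seconds).

```python
IN_PROGRESS = {"queued", "optimizing", "transcribing", "analyzing"}


def next_action(job: dict) -> str:
    """Map a job response to 'poll', 'done', 'error', or 'unknown'."""
    status = job.get("status")
    if status == "completed":
        return "done"
    if status == "failed":
        return "error"
    if status in IN_PROGRESS:
        return "poll"
    # Any status not listed in the table, for example "cancelled".
    return "unknown"


def poll_delay(attempt: int, base: float = 2.0, cap: float = 30.0) -> float:
    """Seconds to wait before poll number `attempt` (0-based), doubling up to `cap`."""
    return min(base * (2 ** attempt), cap)
```

Treating unlisted statuses as "unknown" rather than polling forever keeps a client from looping on a job that was cancelled elsewhere.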

Completed Response

{
"job_id": "a1b2c3d4-...",
"status": "completed",
"meta": {
  "request_id": "a1b2c3d4-...",
  "model_name": "general-v2",
  "execution_time": 8.45,
  "successful_categories": 2,
  "failed_categories": 0,
  "total_tokens_in": 1500,
  "total_tokens_out": 300,
  "estimated_cost": 0.0045
},
"data": {
  "title": { "result": "Sunset over a mountain range" },
  "custom_category": { "result": "The mood is serene." },
  "transcript": {
    "segments": [
      { "start": 0.0, "end": 2.5, "text": "Hello world", "speaker": "SPEAKER_01" }
    ],
    "detected_language": "en",
    "language_probability": 0.997
  }
},
"pipeline_info": {
  "metadata": {
    "streams": [
      { "codec_type": "video", "codec_name": "h264", "width": 1280, "height": 720 },
      { "codec_type": "audio", "codec_name": "aac", "sample_rate": "44100", "channels": 2 }
    ],
    "format": { "format_name": "mov,mp4", "duration": "219.286", "size": "82989735" }
  },
  "faces": {
    "faces": [
      { "id": 1, "file": "face_001.jpg", "appearances": 3, "first_seen_time": "00:00:04" }
    ]
  },
  "download_info": {
    "title": "Video Title",
    "description": "Video description...",
    "thumbnail": "https://example.com/thumb.jpg",
    "duration": 219.0,
    "webpage_url": "https://example.com/video"
  }
},
"pipeline_assets": {
  "thumbnail.jpg": "https://s3.../thumbnail.jpg",
  "waveform.png": "https://s3.../waveform.png",
  "faces/face_001.jpg": "https://s3.../faces/face_001.jpg",
  "thumbnails/0001.jpg": "https://s3.../thumbnails/0001.jpg"
}
}
Note

The transcript key only appears in data when transcribe: true was set. The pipeline_info object contains media metadata extracted by the optimization pipeline (codec info, face detection, source metadata). The pipeline_assets object contains URLs to generated assets like thumbnails, waveform, and face crops.

Account Balance

Retrieve your usage statistics grouped by time period.

GET /api/v1/account/balance

Optional Query Parameters

Parameter | Type | Description
from | string | Start date for custom range (YYYY-MM-DD)
to | string | End date for custom range (YYYY-MM-DD)

Response

{
"api_key_prefix": "e59e5c0d-...",
"description": "Production App",
"overall": {
  "total_requests": 1250,
  "total_success": 1200,
  "total_failed": 50,
  "total_billed": 12.50
},
"today": { ... },
"week": { ... },
"month": { ... },
"breakdown_by_type": [
  { "media_type": "image", "request_count": 800, "success_count": 790, "failed_count": 10, "total_billed": 4.00 },
  { "media_type": "video", "request_count": 450, "success_count": 410, "failed_count": 40, "total_billed": 8.50 }
],
"custom_range": null
}

When from and to are provided, the custom_range field contains the aggregated stats for that date range.
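
As an illustration of working with this response, the sketch below derives a success rate per media type from breakdown_by_type. The function name is hypothetical and the field names are taken from the example response above.

```python
def success_rates(balance: dict) -> dict:
    """Return {media_type: success_count / request_count} for each breakdown row."""
    rates = {}
    for row in balance.get("breakdown_by_type", []):
        total = row.get("request_count", 0)
        if total:  # skip rows with no requests to avoid dividing by zero
            rates[row["media_type"]] = row.get("success_count", 0) / total
    return rates
```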

Pricing

Retrieve your configured per-category prices and available quality profiles.

GET /api/v1/account/prices

Response

{
"price_per_image": 0.005,
"price_per_video": 0.01,
"price_per_audio": 0.008,
"quality_profiles": [
  { "id": "low", "name": "Low", "billing_multiplier": 1 },
  { "id": "balanced", "name": "Balanced", "billing_multiplier": 1 },
  { "id": "high", "name": "High", "billing_multiplier": 2 },
  { "id": "ultra", "name": "Ultra", "billing_multiplier": 4 }
],
"billing_note": "Each category is billed at price_per_type × quality_multiplier. Transcription adds a flat 1× price_per_type regardless of quality level."
}

Request History

List your past analysis requests with pagination and filters. You can also fetch full results or cancel queued jobs.

List Requests

GET /api/v1/account/requests

Query Parameters

Parameter | Type | Default | Description
page | integer | 1 | Page number
limit | integer | 50 | Results per page (max 100)
status | string | all | Filter: success, partial, failed
media_type | string | all | Filter: image, video, audio
from | string | - | Start date (YYYY-MM-DD)
to | string | - | End date (YYYY-MM-DD)

Response

{
"requests": [
  {
    "request_id": "a1b2c3d4-...",
    "status": "success",
    "media_type": "video",
    "categories": "[\"title\",\"description\"]",
    "quality_profile": "balanced",
    "billed_amount": 0.02,
    "execution_time": 8.4,
    "successful_categories": 2,
    "failed_categories": 0,
    "tokens_in": 1500,
    "tokens_out": 300,
    "agent_name": "general-v3",
    "created_at": "2024-03-20T10:00:00.000Z"
  }
],
"total": 156,
"page": 1,
"limit": 50
}
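
Note that categories in this example is a JSON-encoded string rather than an array. A sketch of two helpers for consuming the listing, with hypothetical names: one computes the number of pages from total and limit, the other decodes categories whether it arrives as a string or as a list.

```python
import json
import math


def total_pages(listing: dict) -> int:
    """Number of pages implied by the total and limit fields."""
    limit = listing.get("limit") or 0
    return math.ceil(listing.get("total", 0) / limit) if limit else 0


def parse_categories(entry: dict) -> list:
    """Decode categories, which the list endpoint shows as a JSON string."""
    raw = entry.get("categories")
    if isinstance(raw, str):
        return json.loads(raw)
    return list(raw or [])
```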

Get Request Detail

GET /api/v1/account/requests/{request_id}

Returns the full analysis result including all category data, pipeline info, and assets. Checks active jobs first (KV), then falls back to the log database (D1) for completed historical requests.

Response (Completed)

{
"request_id": "a1b2c3d4-...",
"status": "completed",
"media_type": "video",
"agent_name": "general-v3",
"categories": ["title", "description"],
"quality_profile": "balanced",
"created_at": "...",
"meta": {
  "execution_time": 8.45,
  "tokens_in": 1500,
  "tokens_out": 300,
  "billed_amount": 0.02,
  "successful_categories": 2,
  "failed_categories": 0
},
"data": {
  "title": { "result": "..." },
  "description": { "result": "..." }
},
"pipeline_info": { ... },
"pipeline_assets": {
  "thumbnail.jpg": "https://s3.../thumbnail.jpg",
  "waveform.png": "https://s3.../waveform.png"
}
}

Cancel a Queued Job

POST /api/v1/account/requests/{request_id}/cancel

Cancel a job that is still in queued status. Jobs that have already started processing cannot be cancelled.

Response

{
"success": true,
"job_id": "a1b2c3d4-...",
"status": "cancelled"
}

Authenticated Sources

When your media URL requires authentication (private YouTube content, paywalled videos, CDN-protected assets, S3 links that require custom headers, etc.) you can attach credentials via the optional auth object. These are forwarded to the optimization pipeline which performs the actual download.

Important

The auth field is only valid when used together with media (URL). It is rejected if media_base64 is used. It also requires the optimization pipeline to be enabled on the server. Credentials are stored encrypted in memory with a short TTL and are never written to request logs or admin-visible records.

Structure

{
"media": "https://protected.example.com/video/12345",
"media_type": "video",
"auth": {
  "cookies_txt": "# Netscape HTTP Cookie File\n.example.com\tTRUE\t/\tTRUE\t0\tsession\tabc123\n...",
  "cookie_string": "session=abc123; token=xyz789",
  "headers": {
    "X-API-Key": "my-secret",
    "Accept-Language": "en-US",
    "Referer": "https://protected.example.com/"
  }
}
}

Fields

Field | Type | Max Size | Description
auth.cookies_txt | string | 64 KB | Full contents of a Netscape-format cookies.txt file. Highest priority when multiple auth fields are set.
auth.cookie_string | string | 8 KB | Raw Cookie header value (e.g. session=abc; token=xyz). Used only if cookies_txt is not provided.
auth.headers | object | 20 entries, value ≤ 4 KB | Map of extra HTTP headers sent with the download request. Can override User-Agent and Referer. Header names must be valid RFC 7230 tokens; values must not contain CR, LF, or NUL characters.
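
These limits can be pre-checked on the client. The sketch below is not the server's validator: the function name is hypothetical, sizes are assumed to be measured in UTF-8 bytes with 1 KB = 1024 bytes, and header values are assumed to be strings. The token pattern follows the tchar set from RFC 7230.

```python
import re

# RFC 7230 token: one or more tchar characters.
_TOKEN = re.compile(r"[!#$%&'*+\-.^_`|~0-9A-Za-z]+")
_FORBIDDEN = ("\r", "\n", "\x00")


def _size(text: str) -> int:
    return len(text.encode("utf-8"))


def validate_auth(auth: dict) -> list:
    """Return a list of problems found in an auth object (empty if none)."""
    errors = []
    if auth.get("cookies_txt") is not None and _size(auth["cookies_txt"]) > 64 * 1024:
        errors.append("cookies_txt exceeds 64 KB")
    if auth.get("cookie_string") is not None and _size(auth["cookie_string"]) > 8 * 1024:
        errors.append("cookie_string exceeds 8 KB")
    headers = auth.get("headers") or {}
    if len(headers) > 20:
        errors.append("headers has more than 20 entries")
    for name, value in headers.items():
        if not _TOKEN.fullmatch(name):
            errors.append(f"invalid header name: {name!r}")
        if any(ch in value for ch in _FORBIDDEN):
            errors.append(f"header {name!r} contains CR, LF, or NUL")
        if _size(value) > 4 * 1024:
            errors.append(f"header {name!r} value exceeds 4 KB")
    return errors
```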

Validation errors

Requests with invalid auth fields are rejected with 400 invalid_request (see Error Handling).

Transcription

MetaVision includes an integrated speech transcription pipeline for audio and video content. When enabled, the transcript is automatically provided to VLM agents for richer context, and included in the response.

Features

How it works

Set transcribe: true on your request. The pipeline runs after media optimization and before VLM analysis. The transcript text is included in the VLM prompt so the model has both the media and the spoken content.

Billing

Transcription is billed as a flat 1× price_per_type per request, regardless of the quality profile multiplier. For example, if you request 3 categories with a 2× quality profile plus transcription, billing = (3 × price × 2) + (1 × price).
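
The billing rule above can be written as a short function. This is a sketch with a hypothetical name; it uses Decimal so that small prices add up exactly, and the price of 0.01 in the usage note is the price_per_video value from the Pricing example.

```python
from decimal import Decimal


def estimate_cost(price_per_type, n_categories: int,
                  billing_multiplier: int = 1, transcribe: bool = False) -> Decimal:
    """Estimate the billed amount for one request.

    cost = n_categories * price * multiplier, plus a flat 1 * price for transcription.
    """
    if n_categories < 0:
        raise ValueError("n_categories must not be negative")
    price = Decimal(str(price_per_type))
    cost = price * n_categories * billing_multiplier
    if transcribe:
        cost += price  # flat 1x, not scaled by the quality multiplier
    return cost
```

For the example in the text with a price of 0.01, estimate_cost(0.01, 3, 2, transcribe=True) gives (3 × 0.01 × 2) + 0.01 = 0.07.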

Parameters

Parameter | Type | Default | Description
transcribe | boolean | false | Enable transcription. Only for audio and video.
transcribe_diarize | boolean | false | Assign speaker IDs to segments.
transcribe_align | boolean | false | Enable word-level timestamp alignment.

Transcript Response Format

{
"transcript": {
  "segments": [
    {
      "start": 0.0,
      "end": 2.5,
      "text": "Hello, how are you?",
      "speaker": "SPEAKER_01",
      "words": [
        {"word": "Hello,", "start": 0.1, "end": 0.5, "speaker": "SPEAKER_01"},
        {"word": "how", "start": 0.6, "end": 0.9, "speaker": "SPEAKER_01"}
      ]
    }
  ],
  "detected_language": "en",
  "language_probability": 0.997,
  "speakers": {
    "SPEAKER_01": {"name": "SPEAKER_01", "time": 2.5}
  }
}
}
Note

The words array is only present when transcribe_align: true. The speaker field and speakers object are only present when transcribe_diarize: true.
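
Because speaker is optional, code that renders a transcript should not assume it is present. A minimal sketch with a hypothetical function name, using the field names from the format above:

```python
def format_transcript(transcript: dict) -> list:
    """Render each segment as '[start-end] SPEAKER: text' (speaker omitted if absent)."""
    lines = []
    for seg in transcript.get("segments", []):
        speaker = seg.get("speaker")  # only present with transcribe_diarize: true
        prefix = f"{speaker}: " if speaker else ""
        lines.append(f"[{seg['start']:.1f}-{seg['end']:.1f}] {prefix}{seg['text']}")
    return lines
```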

Pipeline Info & Assets

When media is processed through the optimization pipeline, the completed response includes two additional fields:

pipeline_info

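A JSON object containing rich metadata extracted during optimization. The structure varies by media type and source, but commonly includes metadata (stream and container format details), faces (face detection results), and download_info (source metadata such as title, description, and thumbnail).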

Note

Not all fields are present for every request. Image analysis won't have video streams or faces. Direct URL uploads may not have download_info. Always check for the existence of fields before accessing them.
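
One way to follow this advice is to read every level with a default, so that missing keys yield None or empty values instead of exceptions. The sketch below uses a hypothetical function name and the field names from the Completed Response example.

```python
def summarize_pipeline_info(info) -> dict:
    """Extract a few common fields, tolerating any missing part of pipeline_info."""
    info = info or {}
    streams = (info.get("metadata") or {}).get("streams") or []
    # First video stream, if the media has one (images and audio will not).
    video = next((s for s in streams if s.get("codec_type") == "video"), None)
    resolution = None
    if video and video.get("width") and video.get("height"):
        resolution = (video["width"], video["height"])
    faces = (info.get("faces") or {}).get("faces") or []
    download = info.get("download_info") or {}
    return {
        "resolution": resolution,
        "face_count": len(faces),
        "source_title": download.get("title"),
    }
```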

pipeline_assets

A map of filename → URL for generated assets. Common assets include:

Key Pattern | Description
thumbnail.jpg | Main thumbnail image
thumbnails/0001.jpg | Scene thumbnails extracted from video
waveform.png | Audio waveform visualization
faces/face_001.jpg | Cropped face images from face detection
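
Since the keys follow these patterns, assets can be grouped by their path prefix. A sketch with a hypothetical function name; keys without a recognized prefix (such as thumbnail.jpg and waveform.png) are collected under "other".

```python
def group_assets(assets: dict) -> dict:
    """Group asset URLs by key prefix: 'faces', 'thumbnails', or 'other'."""
    groups = {"faces": [], "thumbnails": [], "other": []}
    for key, url in sorted(assets.items()):
        prefix, sep, _ = key.partition("/")
        bucket = prefix if sep and prefix in groups else "other"
        groups[bucket].append(url)
    return groups
```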

Quality Profiles

The detail parameter selects a quality profile that controls how media is optimized and how billing is calculated.

Billing

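Each category is billed at billing_multiplier × price_per_type, where price_per_type is the per-category price for the media type (price_per_image, price_per_video, or price_per_audio). Transcription is always billed at a flat 1× price_per_type regardless of the quality profile.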

Analysis Categories


Available Models


Error Handling

Status | Code | Description
400 | invalid_request | Malformed JSON, missing media, invalid URL or base64, invalid model UUID, invalid quality profile, invalid auth fields, or transcription requested for image media.
400 | invalid_categories | Requested categories do not exist or are disabled.
401 | invalid_api_key | Missing or incorrect Authorization header.
403 | account_disabled | API key disabled.
404 | not_found | Job not found or expired.
413 | invalid_request | Request body exceeds maximum size (5 MB).
429 | rate_limited | Rate limit exceeded. Check Retry-After header.
500 | internal_error | Server, pipeline, or transcription configuration error.
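
A client can use this table to decide whether a failed request is worth retrying. The sketch below is one possible policy, not documented behavior: the function name is hypothetical, retrying on 5xx is an assumption made here, and Retry-After is read as a number of seconds with a fallback when it is missing or not numeric.

```python
def retry_delay(status_code: int, headers: dict, default: float = 5.0):
    """Return seconds to wait before retrying, or None if retrying will not help."""
    if status_code == 429:
        # Plain dicts are case-sensitive; requests' response.headers is not.
        value = headers.get("Retry-After", "")
        try:
            return max(0.0, float(value))
        except ValueError:
            return default  # header missing, or given as an HTTP date
    if status_code >= 500:
        return default  # assumption: server errors may be transient
    return None  # 4xx errors need a corrected request, not a retry
```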

Code Examples

Python

import time

import requests

API_KEY = "your_api_key"
BASE = "https://metavision.vip.api.efficientstack.com/api/v1"
HEADERS = {"Authorization": f"Bearer {API_KEY}", "Content-Type": "application/json"}

# URL-based submission with transcription
payload = {
  "media": "https://example.com/video.mp4",
  "media_type": "video",
  "categories": ["title", "description"],
  "detail": "balanced",
  "transcribe": True,
  "transcribe_diarize": True,
  "transcribe_align": False
}

resp = requests.post(f"{BASE}/analyze", json=payload, headers=HEADERS)
resp.raise_for_status()  # stop on 4xx/5xx instead of failing later on a missing job_id
job = resp.json()
print(f"Job: {job['job_id']}")

# Poll until the job reaches a terminal status
while True:
    result = requests.get(f"{BASE}/jobs/{job['job_id']}", headers=HEADERS).json()
    print(f"Status: {result['status']}")
    if result["status"] == "completed":
        print("Results:", result["data"])
        if "transcript" in result["data"]:
            print("Transcript:", result["data"]["transcript"])
        if result.get("pipeline_info"):
            print("Pipeline Info:", result["pipeline_info"])
        if result.get("pipeline_assets"):
            print("Assets:", result["pipeline_assets"])
        break
    elif result["status"] == "failed":
        print("Error:", result.get("error"))
        break
    time.sleep(2)

# Protected source with cookies and custom headers
protected_payload = {
  "media": "https://protected.example.com/video/12345",
  "media_type": "video",
  "categories": ["description"],
  "auth": {
      "cookie_string": "session=abc123; auth_token=xyz",
      "headers": {
          "X-API-Key": "my-api-key",
          "Referer": "https://protected.example.com/"
      }
  }
}
requests.post(f"{BASE}/analyze", json=protected_payload, headers=HEADERS)

# Protected source with a cookies.txt file (Netscape format)
with open("cookies.txt") as f:
  cookies_content = f.read()
requests.post(f"{BASE}/analyze", json={
  "media": "https://youtube.com/watch?v=PRIVATE",
  "media_type": "video",
  "auth": {"cookies_txt": cookies_content}
}, headers=HEADERS)

# Check account balance
balance = requests.get(f"{BASE}/account/balance", headers=HEADERS).json()
print(f"Total billed: ${balance['overall']['total_billed']}")

# Check prices
prices = requests.get(f"{BASE}/account/prices", headers=HEADERS).json()
print(f"Image price: ${prices['price_per_image']}/category")

# Browse request history
history = requests.get(f"{BASE}/account/requests?limit=10", headers=HEADERS).json()
for req in history["requests"]:
  print(f"{req['request_id'][:12]}... {req['status']} {req['media_type']}")

# Get full result for a past request (includes pipeline_info and pipeline_assets)
detail = requests.get(f"{BASE}/account/requests/{job['job_id']}", headers=HEADERS).json()
print("Full result:", detail)

# Cancel a queued job
cancel = requests.post(f"{BASE}/account/requests/{job['job_id']}/cancel", headers=HEADERS).json()
print("Cancel:", cancel)

JavaScript

const API_KEY = 'your_api_key';
const BASE = 'https://metavision.vip.api.efficientstack.com/api/v1';
const HEADERS = {'Authorization': `Bearer ${API_KEY}`, 'Content-Type': 'application/json'};

async function analyzeMedia() {
  const resp = await fetch(`${BASE}/analyze`, {
    method: 'POST', headers: HEADERS,
    body: JSON.stringify({
      media: 'https://example.com/video.mp4',
      media_type: 'video',
      categories: ['title', 'description'],
      detail: 'balanced',
      transcribe: true,
      transcribe_diarize: true
    })
  });
  const job = await resp.json();

  while (true) {
    const result = await (await fetch(`${BASE}/jobs/${job.job_id}`, {headers: HEADERS})).json();
    if (result.status === 'completed') {
      console.log('Data:', result.data);
      console.log('Pipeline Info:', result.pipeline_info);
      console.log('Assets:', result.pipeline_assets);
      return result;
    }
    if (result.status === 'failed') throw new Error(result.error);
    await new Promise(r => setTimeout(r, 2000));
  }
}

// Protected source with auth
async function analyzeProtected() {
  const resp = await fetch(`${BASE}/analyze`, {
    method: 'POST', headers: HEADERS,
    body: JSON.stringify({
      media: 'https://protected.example.com/video/12345',
      media_type: 'video',
      auth: {
        cookie_string: 'session=abc; token=xyz',
        headers: {
          'X-API-Key': 'my-secret',
          'Referer': 'https://protected.example.com/'
        }
      }
    })
  });
  return resp.json();
}

// Account balance
async function getBalance() {
  return (await fetch(`${BASE}/account/balance`, {headers: HEADERS})).json();
}

// Prices
async function getPrices() {
  return (await fetch(`${BASE}/account/prices`, {headers: HEADERS})).json();
}

// Request history
async function getHistory(page = 1) {
  return (await fetch(`${BASE}/account/requests?page=${page}`, {headers: HEADERS})).json();
}

// Request detail (includes pipeline_info and pipeline_assets)
async function getRequestDetail(id) {
  return (await fetch(`${BASE}/account/requests/${id}`, {headers: HEADERS})).json();
}

// Cancel a queued job
async function cancelJob(id) {
  return (await fetch(`${BASE}/account/requests/${id}/cancel`, {method: 'POST', headers: HEADERS})).json();
}

analyzeMedia();

cURL

# Submit analysis
JOB=$(curl -s -X POST "https://metavision.vip.api.efficientstack.com/api/v1/analyze" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{"media":"https://example.com/video.mp4","media_type":"video","categories":["title"],"detail":"balanced","transcribe":true,"transcribe_diarize":true}')

# Quote variables so the JSON is passed to jq unmodified
JOB_ID=$(echo "$JOB" | jq -r '.job_id')
while true; do
  RESULT=$(curl -s "https://metavision.vip.api.efficientstack.com/api/v1/jobs/$JOB_ID" -H "Authorization: Bearer YOUR_API_KEY")
  STATUS=$(echo "$RESULT" | jq -r '.status')
  echo "Status: $STATUS"
  if [ "$STATUS" = "completed" ] || [ "$STATUS" = "failed" ]; then echo "$RESULT" | jq .; break; fi
  sleep 2
done

# Submit a protected URL with cookies and custom headers
curl -X POST "https://metavision.vip.api.efficientstack.com/api/v1/analyze" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
  "media": "https://protected.example.com/video/12345",
  "media_type": "video",
  "categories": ["description"],
  "auth": {
    "cookie_string": "session=abc; token=xyz",
    "headers": {
      "X-API-Key": "my-secret",
      "Referer": "https://protected.example.com/"
    }
  }
}'

# View pipeline info from completed result
echo "$RESULT" | jq '.pipeline_info'
echo "$RESULT" | jq '.pipeline_assets'

# Account balance
curl -s "https://metavision.vip.api.efficientstack.com/api/v1/account/balance" -H "Authorization: Bearer YOUR_API_KEY" | jq .

# Account balance with date range
curl -s "https://metavision.vip.api.efficientstack.com/api/v1/account/balance?from=2024-01-01&to=2024-03-31" -H "Authorization: Bearer YOUR_API_KEY" | jq .

# Prices
curl -s "https://metavision.vip.api.efficientstack.com/api/v1/account/prices" -H "Authorization: Bearer YOUR_API_KEY" | jq .

# Request history
curl -s "https://metavision.vip.api.efficientstack.com/api/v1/account/requests?page=1&limit=10" -H "Authorization: Bearer YOUR_API_KEY" | jq .

# Request detail (includes pipeline_info and pipeline_assets)
curl -s "https://metavision.vip.api.efficientstack.com/api/v1/account/requests/$JOB_ID" -H "Authorization: Bearer YOUR_API_KEY" | jq .

# Cancel a queued job
curl -s -X POST "https://metavision.vip.api.efficientstack.com/api/v1/account/requests/$JOB_ID/cancel" -H "Authorization: Bearer YOUR_API_KEY" | jq .