MetaVision API Documentation

Introduction

MetaVision is an advanced Multi-Modal AI analysis engine with a built-in content optimization pipeline and optional audio transcription. All media is automatically processed through the pipeline before VLM analysis, ensuring optimal quality and token efficiency.

MetaVision uses an asynchronous queue-based architecture. Submit, then poll for results.

1. Submit: POST your media (URL or base64) and categories to /api/v1/analyze. You receive a 202 Accepted with a job_id.
2. Optimize: Your media is automatically processed through the optimization pipeline (status: optimizing). Metadata such as codec info and face detection results is extracted, and thumbnails are generated.
3. Transcribe: If transcribe: true, audio is transcribed with optional diarization (status: transcribing).
4. Analyze: Optimized content (and the transcript, if any) is analyzed by VLM agents (status: analyzing).
5. Receive: When the job status is completed, the response includes the full analysis results, transcript, pipeline metadata, and asset URLs.

Authentication

All API requests require a Bearer token:

Authorization: Bearer YOUR_API_KEY

Submit Analysis

POST /api/v1/analyze

Media Source (one required)

| Parameter | Type | Description |
| --- | --- | --- |
| media | string | A publicly accessible HTTP URL pointing to media. |
| media_base64 | string | Raw base64-encoded media data. Max 2MB. Requires media_type. |
| media_type | string | Required. The type of media: image, video, or audio. |
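
As a minimal sketch of the base64 route, the helper below encodes raw bytes and builds a request body. The name build_base64_payload is hypothetical. The sketch also assumes the 2MB limit is measured on the encoded string and means 2 × 1024 × 1024 bytes; the table does not say which, so treat the check as a conservative client-side guard rather than the server's actual rule.

```python
import base64

# Assumption: the 2MB limit applies to the encoded string, in binary megabytes.
MAX_BASE64_BYTES = 2 * 1024 * 1024
ALLOWED_MEDIA_TYPES = {"image", "video", "audio"}


def build_base64_payload(raw: bytes, media_type: str) -> dict:
    """Return a request body that carries the media inline as base64."""
    if media_type not in ALLOWED_MEDIA_TYPES:
        raise ValueError(f"media_type must be one of {sorted(ALLOWED_MEDIA_TYPES)}")
    encoded = base64.b64encode(raw).decode("ascii")
    if len(encoded) > MAX_BASE64_BYTES:
        raise ValueError("encoded media exceeds the 2MB limit; submit a URL instead")
    return {"media_base64": encoded, "media_type": media_type}


# Tiny stand-in for real image bytes.
payload = build_base64_payload(b"\x89PNG\r\n", "image")
print(payload["media_type"], len(payload["media_base64"]))
```

Base64 output is roughly a third larger than its input, so files close to 1.5MB are already near this assumed limit once encoded.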

Optional Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| categories | array[string] | (All) | List of Category IDs to execute. |
| custom_category | string | null | Custom analysis instruction (max 2000 chars). |
| system_prompt | string | null | Override system prompt. Use {{CATEGORY_PROMPT}} placeholder. |
| model | string | null | Force a specific Agent UUID. |
| detail | string | "balanced" | Quality profile ID controlling optimization and billing. |
| transcribe | boolean | false | Enable audio transcription (audio/video only). Billed as flat +1× price. |
| transcribe_diarize | boolean | false | Enable speaker diarization in transcription. |
| transcribe_align | boolean | false | Enable word-level timestamp alignment in transcription. |
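
Some of these constraints can be checked before a request is sent. The sketch below is one way to do that; check_request is a hypothetical helper, and it treats a missing {{CATEGORY_PROMPT}} placeholder as a problem, which is an assumption, since the table only says to use the placeholder.

```python
def check_request(payload: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []

    custom = payload.get("custom_category")
    if custom is not None and len(custom) > 2000:
        problems.append("custom_category exceeds 2000 characters")

    prompt = payload.get("system_prompt")
    # Assumption: an override without the placeholder is probably a mistake.
    if prompt is not None and "{{CATEGORY_PROMPT}}" not in prompt:
        problems.append("system_prompt has no {{CATEGORY_PROMPT}} placeholder")

    # Transcription is documented as audio/video only.
    if payload.get("transcribe") and payload.get("media_type") == "image":
        problems.append("transcribe is only supported for audio and video")

    return problems


print(check_request({"media_type": "image", "transcribe": True}))
```

Checks like these only save a round trip; the server remains the authority on what it accepts.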

Submit Response (202 Accepted)

{
  "job_id": "a1b2c3d4-...",
  "status": "queued",
  "created_at": "2024-03-20T10:00:00.000Z",
  "poll_url": "/api/v1/jobs/a1b2c3d4-..."
}

Poll for Results

GET /api/v1/jobs/{job_id}

Response Statuses

| Status | Description |
| --- | --- |
| queued | Job is waiting in the queue. Keep polling. |
| optimizing | Media is being processed by the optimization pipeline. Keep polling. |
| transcribing | Audio is being transcribed. Keep polling. |
| analyzing | Optimized content is being analyzed by VLM agents. Keep polling. |
| completed | Analysis complete. Response includes meta, data, pipeline_info, and pipeline_assets. |
| failed | Job failed. Response includes error. |
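
The statuses above split into "keep polling" and "stop" groups, which suggests a polling loop with a deadline. The following is a sketch under stated assumptions: wait_for_job, its fetch_job callback, and the default interval and timeout are all hypothetical, and the callback stands in for a GET to /api/v1/jobs/{job_id} so the loop can be exercised without network access.

```python
import time
from typing import Callable

IN_PROGRESS = {"queued", "optimizing", "transcribing", "analyzing"}
TERMINAL = {"completed", "failed"}


def wait_for_job(fetch_job: Callable[[], dict], interval: float = 2.0,
                 timeout: float = 600.0,
                 sleep: Callable[[float], None] = time.sleep) -> dict:
    """Call fetch_job until the job reaches a terminal status or the timeout passes."""
    deadline = time.monotonic() + timeout
    while True:
        job = fetch_job()
        status = job.get("status")
        if status in TERMINAL:
            return job
        if status not in IN_PROGRESS:
            # Anything outside the documented table is surfaced, not silently polled.
            raise RuntimeError(f"unexpected job status: {status!r}")
        if time.monotonic() >= deadline:
            raise TimeoutError("job did not finish before the timeout")
        sleep(interval)


# Simulated sequence of poll responses in place of real HTTP calls.
responses = iter([{"status": "queued"}, {"status": "analyzing"},
                  {"status": "completed", "data": {}}])
result = wait_for_job(lambda: next(responses), sleep=lambda _: None)
print(result["status"])
```

In real use, fetch_job would wrap the HTTP request and return the parsed JSON body.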

Completed Response

{
  "job_id": "a1b2c3d4-...",
  "status": "completed",
  "meta": {
    "request_id": "a1b2c3d4-...",
    "model_name": "general-v2",
    "execution_time": 8.45,
    "successful_categories": 2,
    "failed_categories": 0,
    "total_tokens_in": 1500,
    "total_tokens_out": 300,
    "estimated_cost": 0.0045
  },
  "data": {
    "title": { "result": "Sunset over a mountain range" },
    "custom_category": { "result": "The mood is serene." },
    "transcript": {
      "segments": [
        { "start": 0.0, "end": 2.5, "text": "Hello world", "speaker": "SPEAKER_01" }
      ],
      "detected_language": "en",
      "language_probability": 0.997
    }
  },
  "pipeline_info": {
    "metadata": {
      "streams": [
        { "codec_type": "video", "codec_name": "h264", "width": 1280, "height": 720 },
        { "codec_type": "audio", "codec_name": "aac", "sample_rate": "44100", "channels": 2 }
      ],
      "format": { "format_name": "mov,mp4", "duration": "219.286", "size": "82989735" }
    },
    "faces": {
      "faces": [
        { "id": 1, "file": "face_001.jpg", "appearances": 3, "first_seen_time": "00:00:04" }
      ]
    },
    "download_info": {
      "title": "Video Title",
      "description": "Video description...",
      "thumbnail": "https://example.com/thumb.jpg",
      "duration": 219.0,
      "webpage_url": "https://example.com/video"
    }
  },
  "pipeline_assets": {
    "thumbnail.jpg": "https://s3.../thumbnail.jpg",
    "waveform.png": "https://s3.../waveform.png",
    "faces/face_001.jpg": "https://s3.../faces/face_001.jpg",
    "thumbnails/0001.jpg": "https://s3.../thumbnails/0001.jpg"
  }
}
Note

The transcript key only appears in data when transcribe: true was set. The pipeline_info object contains media metadata extracted by the optimization pipeline (codec info, face detection, source metadata). The pipeline_assets object contains URLs to generated assets like thumbnails, waveform, and face crops.

Account Balance

Retrieve your usage statistics grouped by time period.

GET /api/v1/account/balance

Optional Query Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| from | string | Start date for custom range (YYYY-MM-DD) |
| to | string | End date for custom range (YYYY-MM-DD) |

Response

{
  "api_key_prefix": "e59e5c0d-...",
  "description": "Production App",
  "overall": {
    "total_requests": 1250,
    "total_success": 1200,
    "total_failed": 50,
    "total_billed": 12.50
  },
  "today": { ... },
  "week": { ... },
  "month": { ... },
  "breakdown_by_type": [
    { "media_type": "image", "request_count": 800, "success_count": 790, "failed_count": 10, "total_billed": 4.00 },
    { "media_type": "video", "request_count": 450, "success_count": 410, "failed_count": 40, "total_billed": 8.50 }
  ],
  "custom_range": null
}

When from and to are provided, the custom_range field contains the aggregated stats for that date range.

Pricing

Retrieve your configured per-category prices and available quality profiles.

GET /api/v1/account/prices

Response

{
  "price_per_image": 0.005,
  "price_per_video": 0.01,
  "price_per_audio": 0.008,
  "quality_profiles": [
    { "id": "low", "name": "Low", "billing_multiplier": 1 },
    { "id": "balanced", "name": "Balanced", "billing_multiplier": 1 },
    { "id": "high", "name": "High", "billing_multiplier": 2 },
    { "id": "ultra", "name": "Ultra", "billing_multiplier": 4 }
  ],
  "billing_note": "Each category is billed at price_per_type × quality_multiplier. Transcription adds a flat 1× price_per_type regardless of quality level."
}
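
To show how this response might be consumed, the sketch below looks up the per-category price for a media type and quality profile. The function name price_per_category is hypothetical, and the dictionary simply mirrors the example response above.

```python
prices = {
    "price_per_image": 0.005,
    "price_per_video": 0.01,
    "price_per_audio": 0.008,
    "quality_profiles": [
        {"id": "low", "name": "Low", "billing_multiplier": 1},
        {"id": "balanced", "name": "Balanced", "billing_multiplier": 1},
        {"id": "high", "name": "High", "billing_multiplier": 2},
        {"id": "ultra", "name": "Ultra", "billing_multiplier": 4},
    ],
}


def price_per_category(prices: dict, media_type: str, profile_id: str) -> float:
    """Base price for the media type times the profile's billing multiplier."""
    base = prices[f"price_per_{media_type}"]
    multipliers = {p["id"]: p["billing_multiplier"] for p in prices["quality_profiles"]}
    return base * multipliers[profile_id]


print(price_per_category(prices, "video", "high"))
```

An unknown media type or profile ID raises KeyError here, which is usually preferable to silently pricing at zero.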

Request History

List your past analysis requests with pagination and filters. You can also fetch full results or cancel queued jobs.

List Requests

GET /api/v1/account/requests

Query Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| page | integer | 1 | Page number |
| limit | integer | 50 | Results per page (max 100) |
| status | string | all | Filter: success, partial, failed |
| media_type | string | all | Filter: image, video, audio |
| from | string | - | Start date (YYYY-MM-DD) |
| to | string | - | End date (YYYY-MM-DD) |

Response

{
  "requests": [
    {
      "request_id": "a1b2c3d4-...",
      "status": "success",
      "media_type": "video",
      "categories": "[\"title\",\"description\"]",
      "quality_profile": "balanced",
      "billed_amount": 0.02,
      "execution_time": 8.4,
      "successful_categories": 2,
      "failed_categories": 0,
      "tokens_in": 1500,
      "tokens_out": 300,
      "agent_name": "general-v3",
      "created_at": "2024-03-20T10:00:00.000Z"
    }
  ],
  "total": 156,
  "page": 1,
  "limit": 50
}
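
Two details of this response are easy to miss: total counts all matching requests, not just those on the current page, and in the example categories arrives as a JSON-encoded string rather than an array. The sketch below handles both; page_count and decode_categories are hypothetical helper names, and the fallback for an already-decoded list is a defensive assumption.

```python
import json
import math


def page_count(total: int, limit: int) -> int:
    """Number of pages needed to list `total` requests at `limit` per page."""
    return math.ceil(total / limit) if total else 0


def decode_categories(entry: dict) -> list:
    """Return categories as a list whether it arrives as a JSON string or a list."""
    raw = entry.get("categories")
    if isinstance(raw, str):
        return json.loads(raw)
    return list(raw or [])


# Trimmed copy of the example response above.
listing = {
    "requests": [{"request_id": "a1b2c3d4-...", "categories": "[\"title\",\"description\"]"}],
    "total": 156,
    "page": 1,
    "limit": 50,
}

print(page_count(listing["total"], listing["limit"]))
print(decode_categories(listing["requests"][0]))
```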

Get Request Detail

GET /api/v1/account/requests/{request_id}

Returns the full analysis result, including all category data, pipeline info, and assets. The endpoint checks active jobs first (KV), then falls back to the log database (D1) for completed historical requests.

Response (Completed)

{
  "request_id": "a1b2c3d4-...",
  "status": "completed",
  "media_type": "video",
  "agent_name": "general-v3",
  "categories": ["title", "description"],
  "quality_profile": "balanced",
  "created_at": "...",
  "meta": {
    "execution_time": 8.45,
    "tokens_in": 1500,
    "tokens_out": 300,
    "billed_amount": 0.02,
    "successful_categories": 2,
    "failed_categories": 0
  },
  "data": {
    "title": { "result": "..." },
    "description": { "result": "..." }
  },
  "pipeline_info": { ... },
  "pipeline_assets": {
    "thumbnail.jpg": "https://s3.../thumbnail.jpg",
    "waveform.png": "https://s3.../waveform.png"
  }
}

Cancel a Queued Job

POST /api/v1/account/requests/{request_id}/cancel

Cancel a job that is still in queued status. Jobs that have already started processing cannot be cancelled.

Response

{
  "success": true,
  "job_id": "a1b2c3d4-...",
  "status": "cancelled"
}

Transcription

MetaVision includes an integrated speech transcription pipeline for audio and video content. When enabled, the transcript is automatically provided to VLM agents for richer context, and included in the response.

Features

How it works

Set transcribe: true on your request. The pipeline runs after media optimization and before VLM analysis. The transcript text is included in the VLM prompt so the model has both the media and the spoken content.

Billing

Transcription is billed as a flat 1× price_per_type per request, regardless of the quality profile multiplier. For example, if you request 3 categories with a 2× quality profile plus transcription, billing = (3 × price × 2) + (1 × price).
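
The formula can be worked through in a few lines. This is only a sketch: estimate_bill is a hypothetical helper, and the price of 0.01 is borrowed from the price_per_video value in the pricing example. Decimal is used so the result does not pick up binary floating-point noise.

```python
from decimal import Decimal


def estimate_bill(price_per_type, categories: int, multiplier: int,
                  transcribe: bool = False) -> Decimal:
    """(categories × price × multiplier), plus one flat price if transcribing."""
    price = Decimal(str(price_per_type))
    total = price * categories * multiplier
    if transcribe:
        total += price  # flat 1×, not scaled by the quality multiplier
    return total


# 3 categories, 2× quality profile, transcription on: (3 × 0.01 × 2) + 0.01
print(estimate_bill(0.01, 3, 2, transcribe=True))  # prints 0.07
```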

Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| transcribe | boolean | false | Enable transcription. Only for audio and video. |
| transcribe_diarize | boolean | false | Assign speaker IDs to segments. |
| transcribe_align | boolean | false | Enable word-level timestamp alignment. |

Transcript Response Format

{
  "transcript": {
    "segments": [
      {
        "start": 0.0,
        "end": 2.5,
        "text": "Hello, how are you?",
        "speaker": "SPEAKER_01",
        "words": [
          {"word": "Hello,", "start": 0.1, "end": 0.5, "speaker": "SPEAKER_01"},
          {"word": "how", "start": 0.6, "end": 0.9, "speaker": "SPEAKER_01"}
        ]
      }
    ],
    "detected_language": "en",
    "language_probability": 0.997,
    "speakers": {
      "SPEAKER_01": {"name": "SPEAKER_01", "time": 2.5}
    }
  }
}
Note

The words array is only present when transcribe_align: true. The speaker field and speakers object are only present when transcribe_diarize: true.
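
Because speaker and words are optional, code that reads a transcript should not assume they exist. The sketch below renders segments as text lines and adds the speaker label only when diarization supplied one; format_transcript is a hypothetical helper name.

```python
def format_transcript(transcript: dict) -> list:
    """Render each segment as '[start-end] SPEAKER: text', omitting absent speakers."""
    lines = []
    for seg in transcript.get("segments", []):
        speaker = seg.get("speaker")  # only present with transcribe_diarize: true
        prefix = f"{speaker}: " if speaker else ""
        lines.append(f"[{seg['start']:.1f}-{seg['end']:.1f}] {prefix}{seg['text']}")
    return lines


diarized = {"segments": [{"start": 0.0, "end": 2.5,
                          "text": "Hello, how are you?", "speaker": "SPEAKER_01"}]}
plain = {"segments": [{"start": 0.0, "end": 2.5, "text": "Hello world"}]}

print(format_transcript(diarized))
print(format_transcript(plain))
```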

Pipeline Info & Assets

When media is processed through the optimization pipeline, the completed response includes two additional fields:

pipeline_info

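A JSON object containing metadata extracted during optimization. The structure varies by media type and source. In the completed-response example it contains metadata (stream and container details such as codec, resolution, duration, and size), faces (detected faces with crop filename, appearance count, and first-seen time), and download_info (source details such as title, description, thumbnail, and page URL).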

Note

Not all fields are present for every request. Image analysis won't have video streams or faces. Direct URL uploads may not have download_info. Always check for the existence of fields before accessing them.
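
One way to follow that advice is to chain lookups with defaults so that a missing key yields None or zero instead of raising. The helpers below are a sketch; video_resolution and face_count are hypothetical names, and the field layout is taken from the completed-response example.

```python
from typing import Optional, Tuple


def video_resolution(pipeline_info: Optional[dict]) -> Optional[Tuple[int, int]]:
    """Return (width, height) of the first video stream, or None if unavailable."""
    streams = ((pipeline_info or {}).get("metadata") or {}).get("streams") or []
    for stream in streams:
        if stream.get("codec_type") == "video":
            width, height = stream.get("width"), stream.get("height")
            if width is not None and height is not None:
                return (width, height)
    return None


def face_count(pipeline_info: Optional[dict]) -> int:
    """Return the number of detected faces, or 0 when face data is absent."""
    return len(((pipeline_info or {}).get("faces") or {}).get("faces") or [])


info = {
    "metadata": {"streams": [
        {"codec_type": "video", "codec_name": "h264", "width": 1280, "height": 720},
        {"codec_type": "audio", "codec_name": "aac"},
    ]},
    "faces": {"faces": [{"id": 1, "file": "face_001.jpg"}]},
}

print(video_resolution(info), face_count(info))
print(video_resolution({}), face_count(None))
```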

pipeline_assets

A map of filename → URL for generated assets. Common assets include:

| Key Pattern | Description |
| --- | --- |
| thumbnail.jpg | Main thumbnail image |
| thumbnails/0001.jpg | Scene thumbnails extracted from video |
| waveform.png | Audio waveform visualization |
| faces/face_001.jpg | Cropped face images from face detection |
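
Since the keys follow path-like patterns, assets can be grouped by prefix. The sketch below assumes that keys for one asset family share a directory-style prefix, as in the example; assets_with_prefix is a hypothetical helper name.

```python
def assets_with_prefix(assets: dict, prefix: str) -> dict:
    """Return only the assets whose key starts with the given prefix."""
    return {key: url for key, url in (assets or {}).items() if key.startswith(prefix)}


# Copied from the completed-response example, including its elided URLs.
assets = {
    "thumbnail.jpg": "https://s3.../thumbnail.jpg",
    "waveform.png": "https://s3.../waveform.png",
    "faces/face_001.jpg": "https://s3.../faces/face_001.jpg",
    "thumbnails/0001.jpg": "https://s3.../thumbnails/0001.jpg",
}

print(assets_with_prefix(assets, "faces/"))
print(assets_with_prefix(assets, "thumbnails/"))
```

The trailing slash matters: the prefix "thumbnails/" matches scene thumbnails but not the main thumbnail.jpg.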

Quality Profiles

The detail parameter selects a quality profile that controls how media is optimized and how billing is calculated.

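The available profiles and their billing multipliers are returned in the quality_profiles array of GET /api/v1/account/prices.
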
Billing

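Each category is billed at billing_multiplier × price_per_type, where price_per_type is the price_per_image, price_per_video, or price_per_audio value for the submitted media. Transcription is always billed at a flat 1× price_per_type regardless of the quality profile.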

Error Handling

| Status | Code | Description |
| --- | --- | --- |
| 400 | invalid_request | Malformed JSON, missing media, invalid URL or base64, invalid model UUID, invalid quality profile, or transcription requested for image media. |
| 400 | invalid_categories | Requested categories do not exist or are disabled. |
| 401 | invalid_api_key | Missing or incorrect Authorization header. |
| 403 | account_disabled | API key disabled. |
| 404 | not_found | Job not found or expired. |
| 500 | internal_error | Server, pipeline, or transcription configuration error. |
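
A client can use the HTTP status alone to decide whether a retry makes sense. The sketch below encodes the table as a lookup. The retry rule (only 5xx responses) is an assumption rather than documented behavior, and the helper names are hypothetical. The shape of the error response body is not specified here, so the sketch does not parse it.

```python
# Documented error codes and their HTTP statuses, copied from the table above.
DOCUMENTED_ERRORS = {
    "invalid_request": 400,
    "invalid_categories": 400,
    "invalid_api_key": 401,
    "account_disabled": 403,
    "not_found": 404,
    "internal_error": 500,
}


def is_retryable(http_status: int) -> bool:
    """Assumption: only server-side failures (5xx) are worth retrying unchanged."""
    return http_status >= 500


def describe(http_status: int) -> str:
    """List the documented error codes that map to an HTTP status."""
    codes = sorted(c for c, s in DOCUMENTED_ERRORS.items() if s == http_status)
    return ", ".join(codes) or "undocumented"


for status in (400, 404, 500):
    print(status, describe(status), "retry" if is_retryable(status) else "fix request")
```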

Code Examples

Python

import time

import requests

API_KEY = "your_api_key"
BASE = "https://metavision.vip.api.efficientstack.com/api/v1"
HEADERS = {"Authorization": f"Bearer {API_KEY}", "Content-Type": "application/json"}

# URL-based submission with transcription
payload = {
    "media": "https://example.com/video.mp4",
    "media_type": "video",
    "categories": ["title", "description"],
    "detail": "balanced",
    "transcribe": True,
    "transcribe_diarize": True,
    "transcribe_align": False
}

resp = requests.post(f"{BASE}/analyze", json=payload, headers=HEADERS)
resp.raise_for_status()  # stop on 4xx/5xx instead of failing later on a missing job_id
job = resp.json()
print(f"Job: {job['job_id']}")

while True:
    result = requests.get(f"{BASE}/jobs/{job['job_id']}", headers=HEADERS).json()
    print(f"Status: {result['status']}")
    if result["status"] == "completed":
        print("Results:", result["data"])
        if "transcript" in result["data"]:
            print("Transcript:", result["data"]["transcript"])
        if result.get("pipeline_info"):
            print("Pipeline Info:", result["pipeline_info"])
        if result.get("pipeline_assets"):
            print("Assets:", result["pipeline_assets"])
        break
    elif result["status"] == "failed":
        print("Error:", result.get("error"))
        break
    time.sleep(2)

# Check account balance
balance = requests.get(f"{BASE}/account/balance", headers=HEADERS).json()
print(f"Total billed: ${balance['overall']['total_billed']}")

# Check prices
prices = requests.get(f"{BASE}/account/prices", headers=HEADERS).json()
print(f"Image price: ${prices['price_per_image']}/category")

# Browse request history
history = requests.get(f"{BASE}/account/requests?limit=10", headers=HEADERS).json()
for req in history["requests"]:
    print(f"{req['request_id'][:12]}... {req['status']} {req['media_type']}")

# Get full result for a past request (includes pipeline_info and pipeline_assets)
detail = requests.get(f"{BASE}/account/requests/{job['job_id']}", headers=HEADERS).json()
print("Full result:", detail)

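# Cancel a queued job. Only jobs still in queued status can be cancelled; the job
# above has already finished, so use the job_id of a freshly submitted job here.
cancel = requests.post(f"{BASE}/account/requests/{job['job_id']}/cancel", headers=HEADERS).json()
print("Cancel:", cancel)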

JavaScript

const API_KEY = 'your_api_key';
const BASE = 'https://metavision.vip.api.efficientstack.com/api/v1';
const HEADERS = {'Authorization': `Bearer ${API_KEY}`, 'Content-Type': 'application/json'};

async function analyzeMedia() {
  const resp = await fetch(`${BASE}/analyze`, {
    method: 'POST', headers: HEADERS,
    body: JSON.stringify({
      media: 'https://example.com/video.mp4',
      media_type: 'video',
      categories: ['title', 'description'],
      detail: 'balanced',
      transcribe: true,
      transcribe_diarize: true
    })
  });
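  // Stop on 4xx/5xx instead of polling with an undefined job_id
  if (!resp.ok) throw new Error(`Submit failed: HTTP ${resp.status}`);
  const job = await resp.json();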

  while (true) {
    const result = await (await fetch(`${BASE}/jobs/${job.job_id}`, {headers: HEADERS})).json();
    if (result.status === 'completed') {
      console.log('Data:', result.data);
      console.log('Pipeline Info:', result.pipeline_info);
      console.log('Assets:', result.pipeline_assets);
      return result;
    }
    if (result.status === 'failed') throw new Error(result.error);
    await new Promise(r => setTimeout(r, 2000));
  }
}

// Account balance
async function getBalance() {
  return (await fetch(`${BASE}/account/balance`, {headers: HEADERS})).json();
}

// Prices
async function getPrices() {
  return (await fetch(`${BASE}/account/prices`, {headers: HEADERS})).json();
}

// Request history
async function getHistory(page = 1) {
  return (await fetch(`${BASE}/account/requests?page=${page}`, {headers: HEADERS})).json();
}

// Request detail (includes pipeline_info and pipeline_assets)
async function getRequestDetail(id) {
  return (await fetch(`${BASE}/account/requests/${id}`, {headers: HEADERS})).json();
}

// Cancel a queued job
async function cancelJob(id) {
  return (await fetch(`${BASE}/account/requests/${id}/cancel`, {method:'POST', headers: HEADERS})).json();
}

analyzeMedia().catch(console.error);

cURL

# Submit analysis
JOB=$(curl -s -X POST "https://metavision.vip.api.efficientstack.com/api/v1/analyze" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"media":"https://example.com/video.mp4","media_type":"video","categories":["title"],"detail":"balanced","transcribe":true,"transcribe_diarize":true}')

# Quote variables so the JSON is passed to jq unchanged (no word splitting or globbing)
JOB_ID=$(echo "$JOB" | jq -r '.job_id')
while true; do
  RESULT=$(curl -s "https://metavision.vip.api.efficientstack.com/api/v1/jobs/$JOB_ID" -H "Authorization: Bearer YOUR_API_KEY")
  STATUS=$(echo "$RESULT" | jq -r '.status')
  echo "Status: $STATUS"
  if [ "$STATUS" = "completed" ] || [ "$STATUS" = "failed" ]; then echo "$RESULT" | jq .; break; fi
  sleep 2
done

# View pipeline info from completed result
echo "$RESULT" | jq '.pipeline_info'
echo "$RESULT" | jq '.pipeline_assets'

# Account balance
curl -s "https://metavision.vip.api.efficientstack.com/api/v1/account/balance" -H "Authorization: Bearer YOUR_API_KEY" | jq .

# Account balance with date range
curl -s "https://metavision.vip.api.efficientstack.com/api/v1/account/balance?from=2024-01-01&to=2024-03-31" -H "Authorization: Bearer YOUR_API_KEY" | jq .

# Prices
curl -s "https://metavision.vip.api.efficientstack.com/api/v1/account/prices" -H "Authorization: Bearer YOUR_API_KEY" | jq .

# Request history
curl -s "https://metavision.vip.api.efficientstack.com/api/v1/account/requests?page=1&limit=10" -H "Authorization: Bearer YOUR_API_KEY" | jq .

# Request detail (includes pipeline_info and pipeline_assets)
curl -s "https://metavision.vip.api.efficientstack.com/api/v1/account/requests/$JOB_ID" -H "Authorization: Bearer YOUR_API_KEY" | jq .

# Cancel a queued job
curl -s -X POST "https://metavision.vip.api.efficientstack.com/api/v1/account/requests/$JOB_ID/cancel" -H "Authorization: Bearer YOUR_API_KEY" | jq .