# MetaVision API — Reference Documentation

> Base URL: `https://metavision.vip.api.efficientstack.com`

## Overview

MetaVision is an asynchronous multi-modal AI media analysis API. Media (images, video, audio) is submitted for analysis, processed through an optimization pipeline, optionally transcribed, and analyzed by Vision Language Models (VLMs).

### Architecture Flow

1. **Submit** `POST /api/v1/analyze` → receive `job_id` (HTTP 202)
2. **Poll** `GET /api/v1/jobs/{job_id}` → status updates
3. **Status progression:** `queued` → `optimizing` → `transcribing` (if enabled) → `analyzing` → `completed` / `failed`
4. **Completed** response includes analysis results, optional transcript, pipeline metadata, and asset URLs

## Authentication

All authenticated endpoints require a Bearer token:

```
Authorization: Bearer YOUR_API_KEY
```
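As a minimal sketch, the header can be assembled once and reused for every authenticated call. The `auth_headers` helper and the `METAVISION_API_KEY` environment variable are illustrative assumptions, not names defined by the API.

```python
import os


def auth_headers(api_key=None):
    """Return request headers carrying the Bearer token."""
    # METAVISION_API_KEY is a hypothetical variable name chosen for this sketch.
    key = api_key or os.environ.get("METAVISION_API_KEY", "")
    if not key:
        raise ValueError("no API key provided")
    return {"Authorization": f"Bearer {key}", "Content-Type": "application/json"}
```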

---

## Public Endpoints (No Auth Required)

### GET /api/v1/public/categories

Returns available analysis categories.

**Response:** `200 OK`
```json
[
  { "id": "title", "name": "Generate Title", "output_key": "title" },
  { "id": "description", "name": "Long Description", "output_key": "description" }
]
```

### GET /api/v1/public/models

Returns available VLM agents.

**Response:** `200 OK`
```json
[
  { "id": "uuid-string", "name": "general-v3", "capabilities": ["image", "video", "audio"] }
]
```

### GET /api/v1/public/quality-profiles

Returns available quality profiles with billing multipliers.

**Response:** `200 OK`
```json
[
  { "id": "balanced", "name": "Balanced", "description": "Standard analysis.", "billing_multiplier": 1 },
  { "id": "high", "name": "High", "description": "Detailed analysis.", "billing_multiplier": 2 }
]
```
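The three public listings above need no token, so a plain GET is enough. The sketch below assumes the response shapes shown in the samples; `fetch_public` and `multipliers` are hypothetical helper names, not part of the API.

```python
import json
import urllib.request

BASE = "https://metavision.vip.api.efficientstack.com/api/v1"


def fetch_public(resource):
    """GET a public listing: 'categories', 'models', or 'quality-profiles'."""
    with urllib.request.urlopen(f"{BASE}/public/{resource}", timeout=10) as resp:
        return json.load(resp)


def multipliers(profiles):
    """Map each quality profile ID to its billing multiplier."""
    return {p["id"]: p["billing_multiplier"] for p in profiles}
```

For example, `multipliers(fetch_public("quality-profiles"))` would give a lookup table keyed by the profile IDs accepted in the `detail` field.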

---

## Analysis Endpoints

### POST /api/v1/analyze

Submit media for analysis.

**Request Body:**

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `media` | string | One of media/media_base64 | — | Public HTTP URL to media file |
| `media_base64` | string | One of media/media_base64 | — | Raw base64-encoded media data (max 2MB) |
| `media_type` | string | **Yes** | — | `"image"`, `"video"`, or `"audio"` |
| `categories` | string[] | No | All enabled | Array of category IDs to execute |
| `custom_category` | string | No | null | Custom analysis prompt (max 2000 chars) |
| `system_prompt` | string | No | null | Override system prompt. Use `{{CATEGORY_PROMPT}}` placeholder |
| `model` | string | No | null | Force a specific agent by UUID (the `id` returned by `GET /api/v1/public/models`) |
| `detail` | string | No | `"balanced"` | Quality profile ID |
| `transcribe` | boolean | No | false | Enable audio transcription (audio/video only) |
| `transcribe_diarize` | boolean | No | false | Enable speaker diarization |
| `transcribe_align` | boolean | No | false | Enable word-level timestamp alignment |

**Response:** `202 Accepted`
```json
{
  "job_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
  "status": "queued",
  "created_at": "2024-03-20T10:00:00.000Z",
  "poll_url": "/api/v1/jobs/a1b2c3d4-e5f6-7890-abcd-ef1234567890"
}
```
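The request-body rules in the table above can be checked client-side before submitting. This is a sketch under stated assumptions: it reads "One of media/media_base64" as exactly one, and it rejects `transcribe` for images locally. The server may behave differently, and `build_analyze_payload` is a hypothetical helper.

```python
VALID_MEDIA_TYPES = {"image", "video", "audio"}


def build_analyze_payload(media_type, media=None, media_base64=None,
                          categories=None, custom_category=None,
                          detail="balanced", transcribe=False):
    """Assemble a body for POST /api/v1/analyze, raising ValueError on bad input."""
    if media_type not in VALID_MEDIA_TYPES:
        raise ValueError(f"media_type must be one of {sorted(VALID_MEDIA_TYPES)}")
    # Assumption: exactly one of media / media_base64 should be sent.
    if (media is None) == (media_base64 is None):
        raise ValueError("provide exactly one of media or media_base64")
    if custom_category is not None and len(custom_category) > 2000:
        raise ValueError("custom_category is limited to 2000 characters")
    # Assumption: a local guard, since transcription is documented for audio/video only.
    if transcribe and media_type == "image":
        raise ValueError("transcribe applies to audio/video only")

    payload = {"media_type": media_type, "detail": detail, "transcribe": transcribe}
    if media is not None:
        payload["media"] = media
    else:
        payload["media_base64"] = media_base64
    if categories is not None:
        payload["categories"] = list(categories)
    if custom_category is not None:
        payload["custom_category"] = custom_category
    return payload
```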

### GET /api/v1/jobs/{job_id}

Poll for job status and results. Recommended poll interval: 2 seconds.

**Response Statuses:**

| Status | Description |
|--------|-------------|
| `queued` | Waiting in queue. Keep polling. |
| `optimizing` | Media processing through optimization pipeline. Keep polling. |
| `transcribing` | Audio being transcribed. Keep polling. |
| `analyzing` | VLM analysis in progress. Keep polling. |
| `completed` | Done. Full results included. |
| `failed` | Failed. Error message included. |

**Completed Response:** `200 OK`
```json
{
  "job_id": "a1b2c3d4-...",
  "status": "completed",
  "created_at": "2024-03-20T10:00:00.000Z",
  "completed_at": "2024-03-20T10:00:12.000Z",
  "meta": {
    "request_id": "a1b2c3d4-...",
    "model_name": "general-v3",
    "execution_time": 8.45,
    "successful_categories": 2,
    "failed_categories": 0,
    "total_tokens_in": 1500,
    "total_tokens_out": 300,
    "estimated_cost": 0.02
  },
  "data": {
    "title": { "result": "Sunset over a mountain range" },
    "description": { "result": "A detailed description of the content..." },
    "transcript": {
      "segments": [
        { "start": 0.0, "end": 2.5, "text": "Hello world", "speaker": "SPEAKER_01" }
      ],
      "detected_language": "en",
      "language_probability": 0.997
    }
  },
  "pipeline_info": {
    "metadata": {
      "streams": [
        { "codec_type": "video", "codec_name": "h264", "width": 1280, "height": 720, "duration": "219.28", "r_frame_rate": "30000/1001" },
        { "codec_type": "audio", "codec_name": "aac", "sample_rate": "44100", "channels": 2 }
      ],
      "format": {
        "format_name": "mov,mp4,m4a,3gp,3g2,mj2",
        "duration": "219.286000",
        "size": "82989735",
        "bit_rate": "3027634"
      }
    },
    "faces": {
      "faces": [
        { "id": 1, "file": "face_001.jpg", "appearances": 3, "first_seen_time": "00:00:04", "last_seen_time": "00:01:30" }
      ]
    },
    "download_info": {
      "title": "Video Title",
      "description": "Video description...",
      "thumbnail": "https://example.com/thumb.jpg",
      "duration": 219.0,
      "webpage_url": "https://example.com/video"
    }
  },
  "pipeline_assets": {
    "thumbnail.jpg": "https://s3.../thumbnail.jpg",
    "waveform.png": "https://s3.../waveform.png",
    "faces/face_001.jpg": "https://s3.../faces/face_001.jpg",
    "thumbnails/0001.jpg": "https://s3.../thumbnails/0001.jpg",
    "thumbnails/0002.jpg": "https://s3.../thumbnails/0002.jpg"
  }
}
```

**Notes:**
- `data.transcript` is present only when `transcribe: true` was set
- `pipeline_info` contains media metadata from the optimization pipeline (may be empty for some inputs)
- `pipeline_assets` contains URLs to generated assets (thumbnails, waveform, face crops)
- `meta.estimated_cost` is the amount billed to your account

**Failed Response:**
```json
{
  "job_id": "...",
  "status": "failed",
  "error": "Optimization failed: Pipeline processing timed out"
}
```
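The completed and failed shapes above can be handled by one small function. The sketch assumes every non-transcript entry under `data` is an object with a `result` key, as in the sample; `summarize_job` is a hypothetical helper name.

```python
def summarize_job(job):
    """Return (results, transcript_text, error) for a terminal job response."""
    if job.get("status") == "failed":
        return {}, None, job.get("error")
    data = job.get("data", {})
    # Category outputs sit next to the optional 'transcript' entry.
    results = {key: value.get("result") for key, value in data.items() if key != "transcript"}
    transcript = None
    if "transcript" in data:
        segments = data["transcript"].get("segments", [])
        transcript = " ".join(seg["text"] for seg in segments)
    return results, transcript, None
```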

**In-Progress Response:**
```json
{
  "job_id": "...",
  "status": "optimizing",
  "created_at": "...",
  "started_at": "...",
  "poll_url": "/api/v1/jobs/..."
}
```
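Polling against the statuses above is safer with an upper bound on waiting time. The following sketch takes the HTTP call as a caller-supplied `fetch_status` function, which keeps it independent of any HTTP library. The 300-second timeout is an arbitrary assumption, not a documented limit.

```python
import time

TERMINAL_STATUSES = {"completed", "failed"}


def wait_for_job(fetch_status, job_id, interval=2.0, timeout=300.0,
                 sleep=time.sleep, clock=time.monotonic):
    """Poll until the job reaches a terminal status or the timeout elapses."""
    deadline = clock() + timeout
    while True:
        job = fetch_status(job_id)  # expected to return the parsed JSON body
        if job.get("status") in TERMINAL_STATUSES:
            return job
        if clock() >= deadline:
            raise TimeoutError(f"job {job_id} still {job.get('status')!r} after {timeout}s")
        sleep(interval)
```

Here `fetch_status` could wrap a `GET /api/v1/jobs/{job_id}` call made with whichever client is in use.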

---

## Account Endpoints

### GET /api/v1/account/balance

Get usage statistics for your API key.

**Query Parameters:**

| Parameter | Type | Description |
|-----------|------|-------------|
| `from` | string | Start date for custom range (YYYY-MM-DD) |
| `to` | string | End date for custom range (YYYY-MM-DD) |

**Response:**
```json
{
  "api_key_prefix": "e59e5c0d-...",
  "description": "Production App",
  "overall": { "total_requests": 1250, "total_success": 1200, "total_failed": 50, "total_billed": 12.50 },
  "today": { "total_requests": 45, "total_success": 44, "total_failed": 1, "total_billed": 0.45 },
  "week": { ... },
  "month": { ... },
  "breakdown_by_type": [
    { "media_type": "image", "request_count": 800, "success_count": 790, "failed_count": 10, "total_billed": 4.00 },
    { "media_type": "video", "request_count": 450, "success_count": 410, "failed_count": 40, "total_billed": 8.50 }
  ],
  "custom_range": null
}
```

### GET /api/v1/account/prices

Get the per-category price configured for each media type, along with the available quality profiles.

**Response:**
```json
{
  "price_per_image": 0.005,
  "price_per_video": 0.01,
  "price_per_audio": 0.008,
  "quality_profiles": [
    { "id": "balanced", "name": "Balanced", "description": "...", "billing_multiplier": 1 },
    { "id": "high", "name": "High", "description": "...", "billing_multiplier": 2 }
  ],
  "billing_note": "Each category is billed at price_per_type × quality_multiplier. Transcription adds a flat 1× price_per_type regardless of quality level."
}
```

### GET /api/v1/account/requests

List your past analysis requests with pagination and filters.

**Query Parameters:**

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `page` | integer | 1 | Page number |
| `limit` | integer | 50 | Results per page (max 100) |
| `status` | string | all | Filter: `success`, `partial`, `failed` |
| `media_type` | string | all | Filter: `image`, `video`, `audio` |
| `from` | string | — | Start date (YYYY-MM-DD) |
| `to` | string | — | End date (YYYY-MM-DD) |

**Response:**
```json
{
  "requests": [
    {
      "request_id": "a1b2c3d4-...",
      "status": "success",
      "media_type": "video",
      "categories": "[\"title\",\"description\"]",
      "quality_profile": "balanced",
      "billed_amount": 0.02,
      "execution_time": 8.4,
      "successful_categories": 2,
      "failed_categories": 0,
      "tokens_in": 1500,
      "tokens_out": 300,
      "agent_name": "general-v3",
      "created_at": "2024-03-20T10:00:00.000Z"
    }
  ],
  "total": 156,
  "page": 1,
  "limit": 50
}
```
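Paging through the listing can be sketched with the `total`, `page`, and `limit` fields shown above. In the sample, `categories` arrives as a JSON-encoded string, so the sketch decodes it when that is the case. `fetch_page` is a caller-supplied function (an assumption of this sketch) that performs the GET and returns the parsed body.

```python
import json


def iter_requests(fetch_page, limit=50):
    """Yield every request record, walking pages until `total` is reached."""
    page, seen = 1, 0
    while True:
        body = fetch_page(page=page, limit=limit)
        records = body.get("requests", [])
        for record in records:
            record = dict(record)
            # Decode the JSON-encoded category list when it is a string.
            if isinstance(record.get("categories"), str):
                record["categories"] = json.loads(record["categories"])
            yield record
        seen += len(records)
        if not records or seen >= body.get("total", 0):
            break
        page += 1
```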

### GET /api/v1/account/requests/{request_id}

Get full request detail including all analysis results. Checks active jobs first (KV), then falls back to the log database (D1) for historical requests.

**Response (completed):**
```json
{
  "request_id": "a1b2c3d4-...",
  "status": "completed",
  "media_type": "video",
  "agent_name": "general-v3",
  "categories": ["title", "description"],
  "quality_profile": "balanced",
  "created_at": "...",
  "meta": {
    "execution_time": 8.45,
    "tokens_in": 1500,
    "tokens_out": 300,
    "billed_amount": 0.02,
    "successful_categories": 2,
    "failed_categories": 0
  },
  "data": {
    "title": { "result": "..." },
    "description": { "result": "..." }
  },
  "pipeline_info": { ... },
  "pipeline_assets": { ... }
}
```

### POST /api/v1/account/requests/{request_id}/cancel

Cancel a job still in `queued` status. Jobs that have started processing cannot be cancelled.

**Response:**
```json
{ "success": true, "job_id": "a1b2c3d4-...", "status": "cancelled" }
```

---

## Billing

- **Per-category cost:** `price_per_{media_type} × billing_multiplier`
- **Transcription:** Adds a flat `1 × price_per_{media_type}` regardless of quality profile
- **Example:** 3 categories at $0.01/cat with 2× quality + transcription = `(3 × $0.01 × 2) + (1 × $0.01) = $0.07`
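
The billing rule above can be written as a small estimator. This is a sketch of the documented formula only; `estimate_cost` is a hypothetical helper, and `Decimal` is used so that sums such as 0.06 + 0.01 stay exact.

```python
from decimal import Decimal


def estimate_cost(price_per_type, n_categories, billing_multiplier=1, transcribe=False):
    """Estimate a job's cost: categories scale with the multiplier, transcription does not."""
    price = Decimal(str(price_per_type))
    cost = price * n_categories * Decimal(str(billing_multiplier))
    if transcribe:
        cost += price  # flat 1x price_per_type, regardless of quality profile
    return cost


# The worked example above: 3 categories at $0.01, 2x quality, plus transcription.
print(estimate_cost(0.01, 3, billing_multiplier=2, transcribe=True))  # 0.07
```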

## Quality Profiles

| Profile | Multiplier | Description |
|---------|-----------|-------------|
| `low` | 1× | Fast, economical analysis |
| `balanced` | 1× | Standard quality (default) |
| `high` | 2× | Detailed analysis |
| `ultra` | 4× | Maximum quality with thorough scene analysis |

## Transcription

Set `transcribe: true` for audio/video media. Optional enhancements:
- `transcribe_diarize: true` — Speaker identification (adds `speaker` field to segments)
- `transcribe_align: true` — Word-level timestamps (adds `words` array to segments)

The transcript appears in `data.transcript` of the completed response and is automatically provided to VLM agents for richer context.
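
As an illustration, transcript segments can be rendered as readable lines. The sketch relies only on the `start`, `end`, `text`, and optional `speaker` fields shown in the completed-response sample; `format_transcript` is a hypothetical helper.

```python
def format_transcript(transcript):
    """Render segments as '[start-end] SPEAKER: text' lines."""
    lines = []
    for seg in transcript.get("segments", []):
        # `speaker` is only present when diarization was requested.
        speaker = f"{seg['speaker']}: " if "speaker" in seg else ""
        lines.append(f"[{seg['start']:.1f}-{seg['end']:.1f}] {speaker}{seg['text']}")
    return "\n".join(lines)
```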

## Pipeline Info

When media is processed through the optimization pipeline, the response includes:
- **`pipeline_info`**: Media metadata (codec, resolution, duration, bitrate), face detection results, and source/download info. Structure varies by media type and source.
- **`pipeline_assets`**: URLs to generated assets including thumbnails, waveform visualization, and face crops.

Common `pipeline_info` fields:
- `metadata.streams[]` — Video/audio stream details (codec, resolution, framerate, etc.)
- `metadata.format` — Container format, duration, file size, bitrate
- `faces.faces[]` — Detected faces with timestamps and appearance counts
- `download_info` — Source URL metadata (title, description, thumbnail) when downloaded from a web URL
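
Several of these fields arrive as strings, for example `r_frame_rate` as a rational such as `"30000/1001"`. The sketch below pulls a few values from the first video stream, assuming the structure shown in the completed-response sample. `video_summary` is a hypothetical helper, and since the structure varies by media type, missing data yields `None`.

```python
from fractions import Fraction


def video_summary(pipeline_info):
    """Return (width, height, fps, duration_seconds) for the first video stream, or None."""
    metadata = pipeline_info.get("metadata", {})
    streams = metadata.get("streams", [])
    video = next((s for s in streams if s.get("codec_type") == "video"), None)
    if video is None:
        return None
    # r_frame_rate is a rational string; Fraction parses "numerator/denominator".
    fps = round(float(Fraction(video["r_frame_rate"])), 2)
    # Prefer the container duration, falling back to the stream's own value.
    duration = float(metadata.get("format", {}).get("duration", video.get("duration", 0)))
    return video["width"], video["height"], fps, duration
```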

## Error Codes

| HTTP Status | Code | Description |
|-------------|------|-------------|
| 400 | `invalid_request` | Malformed JSON, missing/invalid media, invalid model UUID, invalid quality profile |
| 400 | `invalid_categories` | Requested categories do not exist or are disabled |
| 401 | `invalid_api_key` | Missing or incorrect Authorization header |
| 403 | `account_disabled` | API key is disabled |
| 404 | `not_found` | Job or resource not found or expired |
| 500 | `internal_error` | Server, pipeline, or transcription configuration error |

**Error Response Format:**
```json
{
  "error": true,
  "message": "Human-readable error description",
  "code": "error_code"
}
```
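One way to surface this format in client code is a dedicated exception. The sketch below is an assumption about client design, not part of the API. It checks for `"error": true` specifically, because a failed job response carries `error` as a message string instead.

```python
class MetaVisionError(Exception):
    """Raised when a response body follows the error format above."""

    def __init__(self, status, code, message):
        super().__init__(f"{status} {code}: {message}")
        self.status = status
        self.code = code
        self.message = message


def raise_for_error(status, body):
    """Return `body` unchanged unless it is an API error object."""
    # Failed jobs use a string in `error`, so only a literal True counts here.
    if isinstance(body, dict) and body.get("error") is True:
        raise MetaVisionError(status, body.get("code", "unknown"), body.get("message", ""))
    return body
```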

---

## Code Examples

### Python — Full Workflow
```python
import time

import requests

API_KEY = "your_api_key"
BASE = "https://metavision.vip.api.efficientstack.com/api/v1"
H = {"Authorization": f"Bearer {API_KEY}", "Content-Type": "application/json"}

# Submit
resp = requests.post(f"{BASE}/analyze", json={
    "media": "https://example.com/video.mp4",
    "media_type": "video",
    "categories": ["title", "description"],
    "detail": "balanced",
    "transcribe": True,
    "transcribe_diarize": True
}, headers=H)
resp.raise_for_status()  # stop on 4xx/5xx instead of polling without a job_id
job = resp.json()

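# Poll until the job reaches a terminal status
while True:
    result = requests.get(f"{BASE}/jobs/{job['job_id']}", headers=H).json()
    if result["status"] in ("completed", "failed"):
        break
    time.sleep(2)

# Result: failed jobs carry an "error" message and no "data"
if result["status"] == "failed":
    raise SystemExit(f"Job failed: {result.get('error')}")
print(result["data"])
if "transcript" in result["data"]:
    print(result["data"]["transcript"])
if result.get("pipeline_info"):
    print(result["pipeline_info"])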
```

### JavaScript — Full Workflow
```javascript
const API_KEY = 'your_api_key';
const BASE = 'https://metavision.vip.api.efficientstack.com/api/v1';
const H = {'Authorization': `Bearer ${API_KEY}`, 'Content-Type': 'application/json'};

const resp = await fetch(`${BASE}/analyze`, {
  method: 'POST', headers: H,
  body: JSON.stringify({
    media: 'https://example.com/video.mp4',
    media_type: 'video',
    categories: ['title', 'description'],
    detail: 'balanced',
    transcribe: true
  })
});
const job = await resp.json();

let result;
while (true) {
  result = await (await fetch(`${BASE}/jobs/${job.job_id}`, {headers: H})).json();
  if (result.status === 'completed' || result.status === 'failed') break;
  await new Promise(r => setTimeout(r, 2000));
}

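// Failed jobs carry an error message and no data
if (result.status === 'failed') throw new Error('Job failed: ' + result.error);
console.log(result.data);
console.log(result.pipeline_info);
console.log(result.pipeline_assets);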
```
