
Contextual Ad Engine

Upload broadcast footage, let TwelveLabs’ multimodal models identify ideal ad breaks, rank ads by viewer affinity and scene fit, and preview ad-injected video — all from a single dashboard.

Full application walkthrough

Architecture diagram

The Contextual Ad Gap

Streaming platforms often insert ads based on demographic targeting alone: a beer ad plays during a somber funeral scene, or a luxury car ad interrupts a comedic moment. The mismatch erodes viewer trust and brand equity.

TwelveLabs’ multimodal models unlock a new primitive: scene-level understanding. By knowing exactly what is happening in every second of video — the emotion, the environment, the objects, the dialogue — we can match ads to moments that amplify rather than interrupt the viewer experience.

  • 40+ ads in inventory
  • 3 user profiles
  • 512 vector embedding dimensions
  • 3 safety modes

Core Features

Everything needed to go from raw footage to precision-targeted ad placement.

Real-time profile ad switching

Switch between the three custom demo personas in the player. Ranked ad recommendations and the injected preview update immediately — no reload — so you can compare how scene fit and user affinity reshape the same break in real time.

Databricks contextual lift

Export Marengo-backed ad metadata and impressions-ready rows to Delta Lake via the Databricks SQL driver. Join TwelveLabs scene IDs with warehouse analytics to measure contextual lift, benchmark campaigns, and operationalize the pipeline beside your existing BI stack.

Semantic Search

Search video inventory by meaning, not keywords. Describe a scene, emotion, or moment — Marengo embeddings surface the exact timestamp across all indexed content.

Multiplicative Ad Scoring

Two-stage formula: scene fit (Marengo vector similarity + environment/tone) multiplied by user affinity. Scene context gates viewer preference — not the reverse.

Cross-Break Diversity

Ensures no single ad or category dominates a broadcast. Selected ads are hard-excluded from future breaks; category caps prevent over-representation.

Ad-Injected Preview

Generate a full broadcast preview with ads injected at computed break points. Ad skip logic, playback controls, and a downloadable JSON plan are included.

How It Works

Four core steps from raw footage to ranked, diverse ad placement.

1. Pegasus 1.5 Scene Classification and Content Taxonomy 3.1

Semantic IAB labeling of the ad inventory uses TwelveLabs Pegasus 1.5 in time-based mode: a `scene_classification` segment definition returns a short `scene_description` for each time range. Each description is embedded with OpenAI `text-embedding-3-small` and compared by cosine similarity against every node in the taxonomy embedding database. This k-nearest-neighbor style pass keeps the top candidates per scene and maps the best match to the exact Content Taxonomy 3.1 node id (`iab_id` in the JSON, surfaced as `taxonomyNodeId` in the API). Results are cached in Vercel Blob.

api/adInventoryIabSemantic/route.js (JavaScript)
const created = await tlClient.analyzeAsync.tasks.create({
  modelName: "pegasus1.5",
  video: resolvedVideo.video,
  analysisMode: "time_based_metadata",
  responseFormat: {
    type: "segment_definitions",
    segmentDefinitions: [{
      id: "scene_classification",
      fields: [{ name: "scene_description", type: "string", /* ... */ }],
    }],
  },
  minSegmentDuration: 5,
  maxSegmentDuration: 30,
});
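// Per scene: OpenAI text-embedding-3-small → cosine vs taxonomy_embeds.json
// Best node.iab_id = exact CT 3.1 id; topK=5 for taxonomyTopCandidates
// taxonomyNodes is an assumed name for the entries loaded from taxonomy_embeds.json
for (const node of taxonomyNodes) {
  const { iab_id, breadcrumb } = node;
  const score = cosineSimilarity(sceneEmbedding, node.embedding);
  pushTopCosine(taxonomyTopCandidates, { iab_id, breadcrumb, cosine_similarity: score }, 5);
}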
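
The excerpt above calls `pushTopCosine` without showing its body. One plausible way to keep the five best candidates per scene is sketched here. The function body, the `TaxonomyCandidate` type, and the sample ids and scores are assumptions for illustration, not the app's actual helper or real Content Taxonomy 3.1 ids.

```typescript
// Hypothetical shape of one taxonomy candidate; field names mirror the excerpt.
interface TaxonomyCandidate {
  iab_id: string;
  breadcrumb: string;
  cosine_similarity: number;
}

// Insert a candidate so the list stays sorted by descending similarity,
// then trim the list to at most `k` entries. Assumed implementation.
function pushTopCosine(
  top: TaxonomyCandidate[],
  candidate: TaxonomyCandidate,
  k: number,
): void {
  // Find the first entry the new candidate outranks; append if there is none.
  const index = top.findIndex(
    (c) => c.cosine_similarity < candidate.cosine_similarity,
  );
  if (index === -1) top.push(candidate);
  else top.splice(index, 0, candidate);
  if (top.length > k) top.length = k; // drop the weakest entries
}

// Usage with made-up node ids and similarity scores.
const taxonomyTopCandidates: TaxonomyCandidate[] = [];
const samples: Array<[string, number]> = [
  ["node-a", 0.41], ["node-b", 0.77], ["node-c", 0.12],
  ["node-d", 0.63], ["node-e", 0.58], ["node-f", 0.69],
];
for (const [id, score] of samples) {
  pushTopCosine(
    taxonomyTopCandidates,
    { iab_id: id, breadcrumb: `Example > ${id}`, cosine_similarity: score },
    5,
  );
}
const bestMatch = taxonomyTopCandidates[0]; // highest-similarity node
```

Because the list is kept sorted on every insert, the best match is always the first element and the remaining entries can be returned directly as the top candidates for that scene.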

2. Marengo Vector Embeddings

TwelveLabs' Marengo model generates 512-dimensional embeddings for both video segments and ad creatives. Cosine similarity between these vectors provides scene-to-ad semantic matching that keyword matching cannot achieve.

api/embeddings/route.js, adPlacementEngine.ts (TypeScript)
// Cosine similarity between ad creative and scene segment vectors
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, nA = 0, nB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i]; nA += a[i] ** 2; nB += b[i] ** 2;
  }
  const denom = Math.sqrt(nA) * Math.sqrt(nB);
  return denom === 0 ? 0 : dot / denom; // all-zero vector: return 0 instead of NaN
}

// Stretch tight cosine range [0.35–0.75] → full [0–1] scale
const rawSim = cosineSimilarity(ad.vector, segment.vector);
const SIM_MIN = 0.35, SIM_MAX = 0.75;
let contextMatch = (rawSim - SIM_MIN) / (SIM_MAX - SIM_MIN);
// Clamp to [0, 1], then raise to the power 1.5 so mid-range matches are damped
contextMatch = Math.pow(Math.max(0, Math.min(1, contextMatch)), 1.5);
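
To make the rescaling concrete, the same arithmetic can be wrapped in a small function and evaluated at a few points. This is a sketch: `stretchSimilarity` is a hypothetical name, while the band limits and exponent are the values from the excerpt above.

```typescript
// Rescale a raw cosine similarity from the band [simMin, simMax] to [0, 1],
// clamp it, then apply a power curve. `stretchSimilarity` is a hypothetical
// wrapper around the arithmetic shown in the excerpt.
function stretchSimilarity(
  rawSim: number,
  simMin = 0.35,
  simMax = 0.75,
  exponent = 1.5,
): number {
  const scaled = (rawSim - simMin) / (simMax - simMin);
  const clamped = Math.max(0, Math.min(1, scaled));
  return Math.pow(clamped, exponent);
}

const belowBand = stretchSimilarity(0.30); // below the band: clamps to 0
const midBand = stretchSimilarity(0.55);   // halfway through the band: about 0.354
const aboveBand = stretchSimilarity(0.80); // above the band: clamps to 1
```

An exponent above 1 pushes mid-band values down (0.5 becomes roughly 0.354), so only similarities near the top of the band contribute a high `contextMatch`.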

3. Two-Stage Multiplicative Scoring

Ads are ranked using a two-stage formula: scene fit multiplied by user affinity. Scene fit acts as a gate — even a viewer's favourite ad category scores near zero in the wrong scene.
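
The gating claim can be checked with a little arithmetic. The worked example here uses the weights from the engine excerpt in this step. The symbols s, e, t, and c (standing for `suitableMatch`, `environmentFit`, `toneCompat`, and `contextMatch`) and all input values are made up for illustration.

```latex
\begin{aligned}
\text{sceneFit} &= 0.15\,s + 0.15\,e + 0.10\,t + 0.60\,c \\
\text{totalScore} &= \text{adAffinity} \times \text{sceneFit} \\[4pt]
\text{Ad A (high affinity, wrong scene):}\quad
  \text{sceneFit} &= 0.15(0) + 0.15(0.2) + 0.10(0.2) + 0.60(0.05) = 0.08 \\
  \text{totalScore} &= 0.9 \times 0.08 = 0.072 \\[4pt]
\text{Ad B (moderate affinity, right scene):}\quad
  \text{sceneFit} &= 0.15(1) + 0.15(0.8) + 0.10(0.7) + 0.60(0.8) = 0.82 \\
  \text{totalScore} &= 0.5 \times 0.82 = 0.41
\end{aligned}
```

Ad A has the higher affinity but ends up far behind Ad B, because a low scene fit scales the whole product down.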

lib/adPlacementEngine.ts (TypeScript)
// Stage 1 — sceneFit: does this ad BELONG in this scene?
const sceneFit =
  suitableMatch  * 0.15 +  // Pegasus suitable_categories hit
  environmentFit * 0.15 +  // Environment × category affinity table
  toneCompat     * 0.10 +  // Emotional tone compatibility
  contextMatch   * 0.60;   // Marengo vector similarity (dominant)

// Stage 2 — adAffinity: pre-computed user → ad category score
const totalScore = adAffinity * sceneFit;

// Diversity: no ad wins twice; category cap = ceil(breaks / 2)
return selectAdsWithDiversity(ranked, plan, config);
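
The diversity rule in the last comment (no ad wins twice, category cap of ceil(breaks / 2)) could be implemented as a greedy pass over the breaks. The sketch below is an assumption about how such a pass might look. `selectWithDiversity`, the `RankedAd` type, and the sample inventory are hypothetical, and the real `selectAdsWithDiversity` takes different arguments.

```typescript
// Hypothetical type: one ranked candidate for a break.
interface RankedAd {
  adId: string;
  category: string;
  totalScore: number;
}

// Greedy selection. Each inner array is the candidate list for one break,
// sorted best first. For every break, take the best-ranked ad that
// (a) has not already won an earlier break and
// (b) belongs to a category still under the cap of ceil(breaks / 2).
// A break with no eligible candidate gets null.
function selectWithDiversity(
  rankedPerBreak: RankedAd[][],
): Array<RankedAd | null> {
  const categoryCap = Math.ceil(rankedPerBreak.length / 2);
  const usedAdIds = new Set<string>();
  const categoryCounts = new Map<string, number>();

  return rankedPerBreak.map((candidates) => {
    const winner = candidates.find(
      (ad) =>
        !usedAdIds.has(ad.adId) &&
        (categoryCounts.get(ad.category) ?? 0) < categoryCap,
    );
    if (!winner) return null;
    usedAdIds.add(winner.adId);
    categoryCounts.set(
      winner.category,
      (categoryCounts.get(winner.category) ?? 0) + 1,
    );
    return winner;
  });
}

// Made-up inventory: three breaks, so the category cap is ceil(3 / 2) = 2.
const spirits1: RankedAd = { adId: "spirits-1", category: "spirits", totalScore: 0.8 };
const spirits2: RankedAd = { adId: "spirits-2", category: "spirits", totalScore: 0.7 };
const spirits3: RankedAd = { adId: "spirits-3", category: "spirits", totalScore: 0.6 };
const auto1: RankedAd = { adId: "auto-1", category: "auto", totalScore: 0.5 };

const selection = selectWithDiversity([
  [spirits1, spirits2, auto1],
  [spirits1, spirits2, auto1],           // spirits-1 already used, so spirits-2 wins
  [spirits1, spirits2, spirits3, auto1], // spirits is at its cap, so auto-1 wins
]);
const selectedIds = selection.map((ad) => ad?.adId ?? null);
```

A greedy pass in break order is the simplest choice. It favors earlier breaks, so an engine that wants a globally best assignment would need a different strategy.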

4. Databricks Delta Lake Export

Ad metadata and clip-averaged Marengo vectors are exported to a Databricks Delta table via a single POST request. The marengo_embedding_json column holds a JSON float array ready for Mosaic AI Vector Search indexing.

api/databricks/export/route.js, lib/databricksExportSql.ts (SQL)
-- Generated by buildAdMetadataExportSql()
-- Target: main.default.ad_metadata_premium_spirits
-- marengo_embedding_json: JSON float[] — cast with
--   from_json(marengo_embedding_json, 'array<double>') for Vector Search
CREATE OR REPLACE TABLE `main`.`default`.`ad_metadata_premium_spirits` AS
SELECT * FROM VALUES
  ('grey_goose_30s.mp4', 'Grey Goose — The Art of Enjoyment', 30,
   '["Bar scene","Cocktail","Celebration"]',
   '["Adults","HHI $100K+"]', '["Underage"]',
   '{"highPriority":["Luxury","Premium Spirits"]}',
   '["Children programming","Violence"]', '[]',
   '[0.021,-0.034,0.119,...]', 512, 'twelvelabs_marengo',
   'embedded_marengo_clip_avg')
AS v(creative_id, campaign_name, duration_sec, extracted_visual_contexts,
     target_demographics, negative_demographics, target_audience_affinity,
     negative_campaign_contexts, brand_safety_garm, marengo_embedding_json,
     embedding_dim, embedding_model, vector_sync_status);
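
The `marengo_embedding_json` and `embedding_dim` values in the generated SQL could be produced along the following lines. This sketch assumes that "clip-averaged" means an element-wise mean over per-clip embeddings. `averageEmbeddings` and `toEmbeddingColumns` are hypothetical helper names, and the tiny three-dimensional vectors stand in for real 512-dimensional Marengo embeddings.

```typescript
// Element-wise mean of equal-length embedding vectors.
function averageEmbeddings(clips: number[][]): number[] {
  if (clips.length === 0) throw new Error("need at least one clip embedding");
  const dim = clips[0].length;
  const sum = new Array<number>(dim).fill(0);
  for (const clip of clips) {
    if (clip.length !== dim) throw new Error("embedding dimensions differ");
    for (let i = 0; i < dim; i++) sum[i] += clip[i];
  }
  return sum.map((value) => value / clips.length);
}

// Build the two embedding-related column values for one ad creative.
function toEmbeddingColumns(
  clips: number[][],
): { marengo_embedding_json: string; embedding_dim: number } {
  const averaged = averageEmbeddings(clips);
  return {
    marengo_embedding_json: JSON.stringify(averaged), // JSON float array as text
    embedding_dim: averaged.length,
  };
}

// Two made-up clip embeddings for a single creative.
const columns = toEmbeddingColumns([
  [0, 2, 4],
  [2, 4, 8],
]);
```

Storing the dimension next to the JSON string lets a downstream job validate each row before casting it with `from_json`.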

Why TwelveLabs?

TwelveLabs provides the foundational models that power every inference in this engine.

Marengo 3.0 — Multimodal Embeddings

State-of-the-art video representation model that encodes visual, audio, and textual content into a unified 512-dimensional vector space. Powers semantic scene matching in this engine.

Pegasus 1.5 — Generative Video Understanding

Drives time-based scene classification and metadata (including semantic Content Taxonomy 3.1 alignment via OpenAI embeddings) — a core input to ad break and IAB labeling in this app.

Enterprise-Grade Infrastructure

SOC 2 compliant, built for scale. Process thousands of hours of video via a simple REST API with consistent, predictable pricing and 99.9% uptime.

Research-Backed Innovation

TwelveLabs' research team continuously works on model accuracy across visual, audio, and textual modalities, so ad relevance can improve with each model release.

Business Impact

Contextual placement drives measurable outcomes for publishers, brands, and viewers.

For Publishers
  • Higher CPMs: Context-matched inventory commands premium rates over run-of-network.
  • Brand Safety at Scale: Automated GARM classification prevents costly misplacements.
  • Viewer Retention: Relevant ads reduce skip rates and improve completion metrics.
For Advertisers
  • Scene-Fit Verification: Know exactly which scenes your creative appears in before buying.
  • Cross-Break Diversity: Frequency controls prevent over-exposure and ad fatigue.
  • Affinity-Matched Audiences: Viewer cohort scoring ensures the right person sees the right ad.

Technology Stack

| Layer | Technology | Purpose |
|---|---|---|
| Frontend | Next.js 15 + React 19 | App Router, SSR, and real-time client computation |
| Video AI | TwelveLabs API | Marengo embeddings + Pegasus scene analysis |
| Ad Engine | TypeScript Pure Functions | Deterministic scoring with useMemo for instant re-ranking |
| Styling | Tailwind CSS v4 + Strand DS | TwelveLabs brand design system |
| Storage | Vercel Blob | Video analysis cache + embedding cache |
| Streaming | HLS.js + CloudFront CDN | Adaptive bitrate playback for content + ad videos |
| Data Platform | Databricks Delta Lake | Ad metadata + Marengo embedding export for Mosaic AI Vector Search |

Ready to see contextual ad placement in action?

Upload a video, explore the ad inventory, and watch the engine rank ads in real time as you switch viewer profiles.