Contextual intelligence engine powering Mundial Media
Tri-modal contextual intelligence for multicultural ad targeting. Built by Ramón Cendejas.
Classifies publisher web pages in real time to deliver contextual ads without cookies. Uses only the content the reader is currently viewing, not user identity or browsing history.
Uses domain-native training data curated by multicultural engineers instead of keyword blocklists or standard IAB categories. Understands Spanglish, code-switching, and bilingual patterns natively.
General-market tools over-block culturally relevant content: immigration reporting flagged as "controversial," Día de los Muertos imagery as "morbid," "slays" in Black cultural context as "violence." Cadmus knows the difference.
Contextual segment IDs injected into ad request before reaching Google Ad Manager (GAM). Pre-delivery brand safety, not post-bid verification where waste has already occurred.
Crawl to ad delivery in five stages
Dual-Path:
Offline (Batch): Continuous crawlers parse publisher pages, extracting URLs, text, and images. The extracted data is used to train and update the models.
Real-Time (Active): When a user visits a page, a Dynamic Ad Tag fires. The Active Parser extracts the HTML and splits it into text and images.
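The text/image split performed by the Active Parser can be pictured with a minimal sketch using Python's standard-library HTML parser. The class and function names (`PageSplitter`, `split_page`) are hypothetical, and the real parser may use a different library and different rules for what counts as visible text.

```python
from html.parser import HTMLParser


class PageSplitter(HTMLParser):
    """Collects visible text and image URLs from an HTML document."""

    # Content inside these tags is never shown to the reader.
    SKIP_TAGS = {"script", "style", "noscript"}

    def __init__(self):
        super().__init__()
        self.text_parts = []
        self.image_urls = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP_TAGS:
            self._skip_depth += 1
        elif tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.image_urls.append(src)

    def handle_endtag(self, tag):
        if tag in self.SKIP_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside skipped tags.
        if self._skip_depth == 0 and data.strip():
            self.text_parts.append(data.strip())


def split_page(html):
    """Return (text, image_urls) for one page (hypothetical helper)."""
    parser = PageSplitter()
    parser.feed(html)
    parser.close()
    return " ".join(parser.text_parts), parser.image_urls
```

The text would then go to the NLP and sentiment analyzers and the image URLs to the image analyzer.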
Three analyzers run in parallel:
NLP / Text: Semantic meaning, topics, cultural context, bilingual patterns
Image / Deep Learning: Objects, scenes, visual context, explicit content detection
Sentiment: Emotional tone, crime reporting vs. glorification
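As a rough illustration of running the three analyzers in parallel, the sketch below fans one page out to three placeholder functions with a thread pool. The analyzer bodies are stand-ins invented for this example, not the real models, and the source does not say whether Cadmus uses threads, processes, or separate services.

```python
from concurrent.futures import ThreadPoolExecutor


def analyze_text(page):
    # Placeholder for the NLP / text analyzer.
    return {"word_count": len(page["text"].split())}


def analyze_images(page):
    # Placeholder for the image / deep learning analyzer.
    return {"image_count": len(page["images"])}


def analyze_sentiment(page):
    # Placeholder for the sentiment analyzer.
    return {"tone": "neutral"}


# Hypothetical registry of the three modalities.
ANALYZERS = {
    "text": analyze_text,
    "image": analyze_images,
    "sentiment": analyze_sentiment,
}


def run_analyzers(page):
    """Submit all analyzers at once, then collect each result by name."""
    with ThreadPoolExecutor(max_workers=len(ANALYZERS)) as pool:
        futures = {name: pool.submit(fn, page) for name, fn in ANALYZERS.items()}
        return {name: future.result() for name, future in futures.items()}
```

With this shape, total latency is bounded by the slowest analyzer rather than the sum of all three.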
Weighted formula using proprietary omega weights tuned on multicultural data. Same content on different publisher types triggers different calculations.
Fast-Fail: Explicit content or blacklisted segments = immediate rejection.
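The actual omega weights and formula are proprietary, so the following is only a sketch of the general idea: a fast-fail check followed by a weighted sum whose weights depend on publisher type. Every number, name, and the threshold here (`OMEGA`, `BLOCKED_SEGMENTS`, `RISK_THRESHOLD`) is an illustrative assumption.

```python
# Hypothetical per-publisher-type weights; the real values are not public.
OMEGA = {
    "news": {"text": 0.5, "image": 0.2, "sentiment": 0.3},
    "entertainment": {"text": 0.4, "image": 0.4, "sentiment": 0.2},
}
BLOCKED_SEGMENTS = {"explicit"}  # assumed blocklist
RISK_THRESHOLD = 0.5             # assumed cutoff: at or above means unsafe


def score_page(risk, segments, publisher_type):
    """Combine per-modality risk scores (0..1) into one pass/fail decision."""
    # Fast-fail: reject immediately, without computing the weighted score.
    if BLOCKED_SEGMENTS & set(segments):
        return {"passed": False, "score": None, "reason": "fast_fail"}

    weights = OMEGA[publisher_type]
    score = sum(weights[modality] * risk[modality] for modality in weights)
    return {"passed": score < RISK_THRESHOLD, "score": score, "reason": "threshold"}


risk = {"text": 0.6, "image": 0.1, "sentiment": 0.8}
print(score_page(risk, ["sports"], "news"))           # weighted score near 0.56
print(score_page(risk, ["sports"], "entertainment"))  # weighted score near 0.44
```

With these assumed weights, identical analyzer outputs fail on a news publisher and pass on an entertainment publisher, which is the behavior described above: the same content on different publisher types triggers different calculations.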
If the page passes its thresholds, Cadmus assigns Contextual Segment IDs, which are injected into the ad request before it reaches GAM. GAM matches the IDs against campaign targeting. An ad serves only if the segments match.
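The injection and matching steps can be sketched with plain dictionaries. This does not use the Google Ad Manager API; the targeting key `ctx_seg`, the request shape, and the campaign structure are all assumptions made for illustration, and the matching function only imitates what GAM does on its side.

```python
def inject_segments(ad_request, segment_ids):
    """Return a copy of the ad request with contextual segment IDs attached."""
    enriched = dict(ad_request)
    targeting = dict(enriched.get("targeting", {}))
    targeting["ctx_seg"] = sorted(segment_ids)  # hypothetical key name
    enriched["targeting"] = targeting
    return enriched


def eligible_campaigns(ad_request, campaigns):
    """Names of campaigns whose targeting shares a segment with the page."""
    page_segments = set(ad_request.get("targeting", {}).get("ctx_seg", []))
    return [c["name"] for c in campaigns if page_segments & set(c["segments"])]


request = inject_segments({"ad_unit": "/home/top"}, {"soccer", "latin_music"})
campaigns = [
    {"name": "A", "segments": ["soccer"]},
    {"name": "B", "segments": ["automotive"]},
]
print(eligible_campaigns(request, campaigns))  # ['A']
```

Because the segments travel with the request, a campaign with no matching segment is never eligible in the first place, which is what makes this pre-delivery rather than post-bid.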
In-Memory Cache: Fast lookups on repeat ad requests for the same page. Articles are analyzed once, results cached.
Database: Persistent storage for classification history and model training feedback loops.
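A minimal sketch of the analyze-once, cache-the-result behavior, keyed by page URL. The class name and the `classify` callable are hypothetical, and the sketch omits eviction, expiry, and the write to persistent storage, none of which the source describes.

```python
class ClassificationCache:
    """Runs the expensive classifier once per URL and reuses the result."""

    def __init__(self, classify):
        self._classify = classify  # callable: url -> list of segment IDs
        self._results = {}
        self.misses = 0

    def get(self, url):
        if url not in self._results:
            # First request for this page: run the full analysis.
            self.misses += 1
            self._results[url] = self._classify(url)
        # Repeat requests are served from memory.
        return self._results[url]
```

A production cache would also need a policy for pages whose content changes after the first analysis.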
Four top-level unsafe content clusters:
Dense clustering, clear boundaries. High confidence in detection and separation from safe content.
Dense clustering, clear boundaries. Distinguishes reporting on conflict from glorifying it.
Moderately clustered. Requires contextual judgment to separate news coverage from exploitative content.
Broadly distributed. Requires nuanced weight tuning. This is where cultural calibration matters most.
Word-level clustering to tri-modal contextual engine
Custom word embeddings from a domain-specific corpus (not a general LLM). Unsupervised clustering to find natural groupings (athletes, musicians, etc.) without forced categories. Human labeling step to name clusters. Basic image analysis via AWS Rekognition. Already generating revenue at this stage.
Moved from word-level to sentence-level analysis. Added sentiment analysis (reporting on crime vs. glorifying it). More advanced image analysis. System now understands meaning in context, not just word presence.
Brand safety overhaul: word variants (k1ll, ki11, s3x) mapped back to canonical forms so they cluster correctly instead of being smeared across all clusters. Dramatically improved safety scoring accuracy. Basic video support via metadata extraction (not frame-by-frame). Tri-modal parallel processing (text + image + sentiment).
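One way to map variants such as k1ll, ki11, and s3x back to canonical forms is to try each plausible character substitution and keep the first candidate found in a known vocabulary. This is a sketch of that idea only; the substitution table and vocabulary below are assumptions, and the source does not describe how Cadmus actually performs the mapping.

```python
from itertools import product

# Assumed look-alike substitutions. "1" is ambiguous: it can stand for i or l.
SUBSTITUTIONS = {
    "0": ("o",),
    "1": ("i", "l"),
    "3": ("e",),
    "4": ("a",),
    "5": ("s",),
    "@": ("a",),
    "$": ("s",),
}

# Assumed canonical vocabulary; in practice this would come from the corpus.
CANONICAL_VOCAB = {"kill", "sex"}


def canonicalize(token):
    """Return the canonical word for an obfuscated token, if one is known."""
    lowered = token.lower()
    options = [SUBSTITUTIONS.get(ch, (ch,)) for ch in lowered]
    for candidate in product(*options):
        word = "".join(candidate)
        if word in CANONICAL_VOCAB:
            return word
    # No known canonical form: leave the token as it was (lowercased).
    return lowered


print([canonicalize(t) for t in ["k1ll", "ki11", "s3x", "2024"]])
# ['kill', 'kill', 'sex', '2024']
```

Checking candidates against a vocabulary keeps ordinary numbers such as 2024 from being rewritten. The number of candidates grows with each ambiguous character, so a real implementation would likely cap token length.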
Ramón should define the next major version with input from those who talk to brands and publishers. Current wishlist items (see Roadmap) are incremental. The challenge is identifying the next "big jump."
Shared AWS, separate schemas per platform
Mundial Media Platform Architecture
Cadmus AI: Contextual engine. Backend only. No front end.
Aries: Internal platform. Backend + Frontend + DB.
Publisher Platform: Publisher-facing. Frontend + DB.
Consumer Insights: Newest. Prod only. No dev environment yet.
Same database, different schemas per platform
Segment IDs injected before ad serving
| Platform | DB / Schemas | Dev Environment | Prod Environment | Notes |
|---|---|---|---|---|
| Aries (Internal) | ✓ | ✓ | ✓ | Manages data for what the Publisher Platform shows |
| Cadmus AI | ✓ | ✓ | ✓ | Backend-only engine. No front end. |
| Publisher Platform | ✓ | ✓ | ✓ | Frontend-only. Data fed from Aries backend. |
| Consumer Insights | ✓ | ? | ✓ | Javier spinning up dev env. Currently prod-only. Reads from same prod DB. |
Prioritization TBD with leadership
Build classifiers on the fly from past content for ad hoc insights. Sales team's top ask: gives them a reason to contact clients proactively. Achievable relatively quickly per Ramón.
Weight content above the fold higher than below, especially on mobile. Untested but expected to improve accuracy.
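Since this idea is untested, the following is only a sketch of how fold-based weighting might work: each content block contributes its topic scores multiplied by a weight that depends on whether it starts above the viewport height. The block structure, the pixel offsets, and the 2:1 weighting are all assumptions.

```python
def fold_weighted_scores(blocks, viewport_height, above_weight=2.0, below_weight=1.0):
    """Aggregate per-block topic scores, favoring blocks above the fold."""
    totals = {}
    for block in blocks:
        # A block counts as above the fold if it starts within the viewport.
        weight = above_weight if block["top_px"] < viewport_height else below_weight
        for topic, score in block["topic_scores"].items():
            totals[topic] = totals.get(topic, 0.0) + weight * score

    # Normalize so the topic scores sum to 1.
    norm = sum(totals.values())
    return {topic: value / norm for topic, value in totals.items()} if norm else totals
```

On mobile the viewport is shorter, so fewer blocks would receive the higher weight, which matches the expectation that the effect is strongest there.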
Google's taxonomy has more topics than IAB. Confirmed after reviewing competitor (Grana) approach. Expands segment granularity.
Migrate extraction pipeline from Python to Go for speed. Helps with dynamic classifiers and faster document iteration. Rust considered but Go has better library support.
Use LLMs to generate richer video descriptions from titles and metadata. Previously blocked by Google API costs, now feasible with Claude.
Frame-by-frame video analysis (transcribe audio, capture one frame/sec, run image analysis). What Grana/GumGum does. "Completely different animal" per Ramón. Would not build on top of Cadmus.
Integrate first-party demographic data with Cadmus contextual data. Currently using third-party sources (SimilarWeb, Semrush). Would resolve the tension between "we don't need cookies" and needing demographic info.
Tell publishers what topics will have upcoming ad demand so they can create matching content. E.g., "Automotive campaigns increasing 15% next quarter." Ramón's idea. No one currently offers this.
Demo material showing how Cadmus processes a page step by step. Sales enablement, not a product feature.
Availability and downtime tracking.
Page processing latency. Critical for real-time ad serving.
Classification accuracy. A/B testing across versions. Competitive benchmarking.
Cadmus and broader tech ecosystem
Organizational and technical
Ramón is the only person who fully understands Cadmus. No documentation beyond Javier's Technical Reference Card.
Ramón gets pulled into non-Cadmus tasks constantly. SOW mandates 70%+ allocation to Cadmus but this requires active protection.
Alex, Jeff, and Adrian are building separate tools using different databases, GitHub repos, and APIs with zero coordination, governance, or security standards.
Ramón has never talked directly to a brand or client. No follow-up after campaign wrap-ups. No direct market signal on what clients want from Cadmus.
Sales positions video capabilities to clients, but Cadmus only does metadata-based video classification (not frame-by-frame). Descriptions left intentionally vague.
No way to measure whether Cadmus is improving or degrading. No A/B testing between versions. No competitive benchmarking.
Focus: Product governance, roadmap discipline, engineering allocation, tool acceleration.
Mandate: Prioritize Cadmus. Move stalled tools to production. Eliminate manual workflows.
Focus: AI transformation architecture. Governance and security for vibe coding. Tool standardization.
Three Phases: (1) Automation (ad ops, RFPs, publisher onboarding), (2) Team adoption, (3) Revenue-generating AI (bid optimization, real-time campaign optimization).