← Back to Home

IAB Categorization Service 🎥

Project Details

The IAB Categorization Service is a contextual targeting tool designed to enhance content prediction by leveraging the V3 IAB taxonomies. This service predicts and categorizes topics within specific content types, such as video transcripts, enabling businesses to refine their content targeting strategies.

IAB Categorization Workflow

The service is a key component of a broader workflow that ingests video content into AWS S3, transcribes it using the open-source Whisper model from Hugging Face, applies IAB categorization based on the transcripts, and finally exports the results to an external system. This automated pipeline streamlines content classification and enhances contextual understanding for digital advertising.

Key Features

My Contribution

I led the end-to-end design, development, and deployment of the IAB Categorization service and topic detection workflow. I designed the system architecture and implemented the workflow, trained a BERT-based model with 110M parameters on AWS SageMaker using 2 GPUs, and conducted experiments with different categorization approaches, including LLM-based few-shot prompting with ChatGPT Turbo 3.5. Additionally, I evaluated the performance of external categorization providers and led the decision-making process that selected GumGum Verity as the best-performing solution. I presented comparative results to management and customers, influencing the final product direction.

Challenges and Solutions

Outcome

Technologies

Python Machine Learning NLP Classification FastAPI IAB Taxonomy LLMs LangChain Whisper GumGum Verity ChatGPT BERT AWS Step Functions AWS SageMaker AWS ECS AWS ECR AWS RDS AWS SQS