TikTok Virality Predictor

Key Learnings
- Complex deep learning model architecture selection
- Video data processing and importance of feature engineering
- The importance of a quality data collection process and the power of good data
- How difficult it is to predict virality
Project Overview
I built a machine learning model to predict the viral potential of TikTok videos using deep learning and computer vision techniques. This one was pretty tough since I had never worked with video data in deep learning before, but it was a great learning experience. The project scope included:
- Web scraping TikTok videos and associated metrics
- Engineering a custom virality score as the response variable, combining metrics such as likes, comments, shares, and views
- Implementing ResNet(2+1)D architecture for video analysis
- Comparing two formulations: regression on the virality score versus a simplified binary classification problem
- Assessing results: model performance was subpar, and future work could focus on adjusting the architecture to include audio embeddings and on incorporating more data, such as current trends
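The exact formula behind the virality score is not given here, so the following is only a minimal sketch of how such a response variable could be built. The weights, the log compression, and the median split for the binary variant are all assumptions chosen for illustration, and the function names are hypothetical.

```python
import math
from statistics import median

# Hypothetical weights (not from the project): shares and comments are
# assumed to signal stronger engagement than likes or raw views.
WEIGHTS = {"likes": 1.0, "comments": 2.0, "shares": 3.0, "views": 0.1}


def virality_score(likes: int, comments: int, shares: int, views: int) -> float:
    """Combine engagement counts into one score.

    log1p compresses the heavy right tail that is typical of social media counts.
    """
    weighted = (
        WEIGHTS["likes"] * likes
        + WEIGHTS["comments"] * comments
        + WEIGHTS["shares"] * shares
        + WEIGHTS["views"] * views
    )
    return math.log1p(weighted)


def to_binary_labels(scores: list[float]) -> list[int]:
    """Simplify regression targets to classes: 1 if above the median score, else 0."""
    cutoff = median(scores)
    return [1 if s > cutoff else 0 for s in scores]


# Made-up example rows: (likes, comments, shares, views)
videos = [(120, 4, 1, 3_000), (45_000, 900, 2_500, 800_000), (900, 30, 12, 20_000)]
scores = [virality_score(*v) for v in videos]
labels = to_binary_labels(scores)
```

The continuous `scores` would serve as the regression target, while `labels` corresponds to the simplified binary classification setup.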
Technical Details
- Developed a data collection pipeline using Python and Pandas
- Scraped TikTok video data using the open-source tool yt-dlp
- Fine-tuned a ResNet(2+1)D model using PyTorch and replaced its output layer with a single-node layer
- Diagnosed model performance by exploring the top- and bottom-performing videos and comparing predictions to actual virality to find opportunities for improvement
Technologies Used
Python
PyTorch
ResNet
Pandas
Web Scraping
Deep Learning
Computer Vision