This project proposes a probabilistic model for learning from crowdsourced data by accounting for variability in annotator expertise. It aims to improve predictive accuracy in noisy labeling environments by modeling annotator performance based on input data and true labels.
The project fits the model with Expectation-Maximization (EM) algorithms.
The dataset consists of radar data from Johns Hopkins University, labeled as good (g) or bad (b) by multiple annotators.
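To make the EM idea concrete, here is a minimal sketch in Python. It is not the project's implementation. It assumes a deliberately simplified model in which each annotator has one constant accuracy, whereas the project models annotator performance as depending on the input data as well. The function name `em_annotator_accuracy`, the synthetic votes, and the accuracy values are all hypothetical, and labels are encoded as 1 for good (g) and 0 for bad (b).

```python
import numpy as np


def em_annotator_accuracy(labels, n_iter=50, eps=1e-6):
    """EM for binary crowdsourced labels (simplified sketch).

    labels: (n_items, n_annotators) array of 0/1 votes.
    Assumption: annotator t reports the true label with a fixed
    probability acc[t], independent of the input features.
    Returns the posterior P(true label = 1) per item, the estimated
    accuracy per annotator, and the estimated class prior.
    """
    labels = np.asarray(labels, dtype=float)
    # Initialise the posterior with a soft majority vote.
    q = labels.mean(axis=1)
    for _ in range(n_iter):
        # M-step: re-estimate the class prior and annotator accuracies
        # from the current posterior over the true labels.
        prior = np.clip(q.mean(), eps, 1 - eps)
        agree = q[:, None] * labels + (1 - q[:, None]) * (1 - labels)
        acc = np.clip(agree.mean(axis=0), eps, 1 - eps)
        # E-step: posterior of the true label given all votes (log space).
        log_p1 = np.log(prior) + (
            labels * np.log(acc) + (1 - labels) * np.log(1 - acc)
        ).sum(axis=1)
        log_p0 = np.log(1 - prior) + (
            (1 - labels) * np.log(acc) + labels * np.log(1 - acc)
        ).sum(axis=1)
        diff = np.clip(log_p0 - log_p1, -500, 500)  # avoid overflow in exp
        q = 1.0 / (1.0 + np.exp(diff))
    return q, acc, prior


# Synthetic demo data (hypothetical, not the radar dataset).
rng = np.random.default_rng(0)
true_z = rng.integers(0, 2, size=300)
true_acc = np.array([0.9, 0.8, 0.7, 0.6, 0.55])
correct = rng.random((300, 5)) < true_acc
votes = np.where(correct, true_z[:, None], 1 - true_z[:, None])

q, acc, prior = em_annotator_accuracy(votes)
predicted = (q > 0.5).astype(int)
print("estimated annotator accuracies:", np.round(acc, 2))
print("agreement with true labels:", (predicted == true_z).mean())
```

Moving from this sketch toward the input-dependent setting described above would mean replacing the constant `acc[t]` with a function of each item's features, for example a per-annotator logistic model fitted in the M-step.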
Findings
The project presents a framework for handling noisy and inconsistent annotations in crowdsourced data, which makes it relevant to applications such as medical image classification, NLP sentiment analysis, and social science surveys.
It demonstrates how regularization and probabilistic modeling can effectively manage variability in annotator expertise.