A Semi-Supervised Active Learning Approach for Block-Status Classification.
Abstract Type: Paper Abstract
Authors:
Atul Rawal, xD, Census Bureau
James McCoy, Census Bureau
Andrew Duvall, Census Bureau
Elvis Martinez, Census Bureau
Abstract
The Census Bureau, as part of its decennial census, must maintain and update all addresses within the United States and its territories. For the 2020 Census, in-office staff manually canvassed address coverage in every block. While this process was effective, it was also costly and time-consuming. To aid the Census Bureau in labeling and classifying coverage at the block level, we propose a machine learning approach based on semi-supervised learning. We present a robust machine learning solution that improves both the labeling and the classification of block data, enabling new data-driven insight while reducing the cost and effort of data assessment. Toward this goal, we employ an active-learning scheme to make accurate and precise classifications using fewer than 50,000 labeled blocks out of the more than 8,000,000 in the country. We train multiple machine learning and deep learning models on the smaller set of labeled data and use them to make predictions on the unlabeled data. Predictions from all the models are then compared to pinpoint the blocks on which the models disagree. Upon validation, the predicted data is added to the training data before predictions are made on the next subset of the data. Additionally, we present the use of explainable AI (XAI) as an added resource for identifying potential biases in the data and the predictions. Finally, we discuss the challenges of working with real-world data at this scale, such as class imbalance, data completeness, and data integrity.
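The loop described in the abstract (train several models on the labeled set, predict the next subset, compare predictions, route disagreements to review, then fold the results back into the training data) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the authors' implementation: the data are synthetic two-feature points, the labels "stable" and "changed" are hypothetical, the two committee members are a nearest-centroid and a k-nearest-neighbor classifier standing in for the unspecified models, and agreed predictions are accepted without the validation step the abstract mentions.

```python
import random
from collections import Counter


def sq_dist(a, b):
    """Squared Euclidean distance between two feature tuples."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def fit_centroids(train):
    """Compute the per-class mean of the feature vectors."""
    grouped = {}
    for features, label in train:
        grouped.setdefault(label, []).append(features)
    return {
        label: tuple(sum(col) / len(pts) for col in zip(*pts))
        for label, pts in grouped.items()
    }


def predict_centroid(centroids, x):
    """Committee member 1: class whose centroid is closest to x."""
    return min(sorted(centroids), key=lambda lab: sq_dist(centroids[lab], x))


def predict_knn(train, x, k=3):
    """Committee member 2: majority vote among the k nearest labeled points."""
    nearest = sorted(train, key=lambda row: sq_dist(row[0], x))[:k]
    votes = Counter(label for _, label in nearest)
    return max(sorted(votes), key=lambda lab: votes[lab])


def active_learning_loop(labeled, unlabeled, oracle, batch_size=50):
    """Label the pool batch by batch, growing the training set each round.

    `oracle` is a hypothetical stand-in for manual review: it is called
    only for points on which the two models disagree.
    """
    train = list(labeled)
    flagged = 0
    for start in range(0, len(unlabeled), batch_size):
        batch = unlabeled[start:start + batch_size]
        snapshot = list(train)  # models see only data from earlier rounds
        centroids = fit_centroids(snapshot)
        for x in batch:
            preds = {predict_centroid(centroids, x), predict_knn(snapshot, x)}
            if len(preds) == 1:
                label = preds.pop()  # models agree: accept the prediction
            else:
                label = oracle(x)    # models disagree: send to review
                flagged += 1
            train.append((x, label))
    return train, flagged


def make_synthetic_blocks(n, seed):
    """Generate n synthetic (features, label) rows; not real block data."""
    rng = random.Random(seed)
    rows = []
    for i in range(n):
        label = "changed" if i % 2 else "stable"
        center = 4.0 if label == "changed" else 0.0
        rows.append(((rng.gauss(center, 1.0), rng.gauss(center, 1.0)), label))
    return rows


rows = make_synthetic_blocks(220, seed=0)
labeled, pool = rows[:20], rows[20:]
truth = dict(pool)  # plays the role of the human reviewer in this demo
train, flagged = active_learning_loop(
    labeled, [x for x, _ in pool], truth.__getitem__, batch_size=50
)
print(f"{flagged} of {len(pool)} blocks flagged for manual review")
```

In this sketch the review effort is concentrated on the points where the committee disagrees, which is the cost-saving idea the abstract describes. A production version would also need the validation step and some handling of class imbalance.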
Submitted by:
Atul Rawal
atul.rawal@census.gov