Understanding consensus in labeled UAS images of waterfowl
Topics:
Keywords: UAS, waterfowl, deep learning, machine learning, citizen science
Abstract Type: Paper Abstract
Authors:
Rowan Converse, University of New Mexico
Christopher Lippitt, University of New Mexico
Abstract
Crowdsourcing is a common method for sourcing the large pools of image labels required to train deep learning pipelines for image-based wildlife monitoring; however, aerial imagery presents unique challenges to image interpretation, and the ability of volunteers to accurately label wildlife in aerial images has not been assessed. We evaluate consensus among image labels in two pools of labeled UAS imagery of waterfowl to understand the underlying variability within crowdsourced datasets. Per-image classification and count consensus are compared between a pool of >1,400,000 crowdsourced image classifications generated by 5,503 volunteers and a pool of >19,000 image classifications generated by fifteen professional wildlife biologists. These pools of labels were refined into an expert consensus set of 2,375 labels and a crowdsourced consensus set of 150,307 labels. Average individual agreement with group consensus for three morphological classes (duck/goose/crane) was 0.82 for experts and 0.74 for volunteers. In the expert label set, average agreement on identification of twelve duck species ranged from 0.83 for the most common species to 0.43 for rare species classes. Agreement between experts and the crowd on a twelve-image benchmark was 0.91 for overall counts and ranged from 0.80 to 0.91 for counts of the three morphological classes of interest. These results indicate that both expert and crowdsourced labels are likely reliable for generating counts and identifications of general morphological classes of waterfowl from UAS imagery, but that species-level counts, even when performed by experts, are likely not.
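For readers unfamiliar with the agreement metric, individual agreement with group consensus can be computed roughly as sketched below. This is a minimal illustrative example only, assuming a simple majority-vote consensus and a hypothetical (image_id, rater_id, label) record format; it is not the study's actual analysis pipeline.

```python
# Illustrative sketch, not the authors' code: computes each rater's average
# agreement with the per-image majority-vote consensus label.
from collections import Counter, defaultdict

def agreement_with_consensus(records):
    """records: iterable of (image_id, rater_id, label) tuples (hypothetical format)."""
    # Group labels by image so a consensus label can be formed per image.
    labels_by_image = defaultdict(list)
    for image_id, rater_id, label in records:
        labels_by_image[image_id].append((rater_id, label))

    # Majority-vote consensus label for each image.
    consensus = {
        image_id: Counter(lbl for _, lbl in votes).most_common(1)[0][0]
        for image_id, votes in labels_by_image.items()
    }

    # Tally, for each rater, how often their label matches the consensus.
    hits, totals = defaultdict(int), defaultdict(int)
    for image_id, votes in labels_by_image.items():
        for rater_id, label in votes:
            totals[rater_id] += 1
            hits[rater_id] += int(label == consensus[image_id])

    return {rater: hits[rater] / totals[rater] for rater in totals}

# Toy usage with made-up labels (not data from the study):
demo = [
    ("img1", "a", "duck"),  ("img1", "b", "duck"),  ("img1", "c", "goose"),
    ("img2", "a", "crane"), ("img2", "b", "crane"), ("img2", "c", "crane"),
]
print(agreement_with_consensus(demo))  # e.g. {'a': 1.0, 'b': 1.0, 'c': 0.5}
```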