Day of Data 2021: Scaling and Integrating Pipelines for Bioacoustics Big Data Analysis
From Sarah Wright
Effective monitoring of biodiversity provides key information on the status of wildlife populations and is critical for conservationists and decision-makers worldwide. Bioacoustic monitoring is one of the most cost-effective and non-invasive approaches for monitoring wildlife populations. There are currently millions of hours of recordings worldwide, yet the immense volume of biological information captured in these bioacoustic datasets remains largely untapped. The challenges of analyzing big acoustic datasets are shared well beyond acoustics, which creates opportunities for exchange and learning across domains.

Bioacoustic analysis pipelines generally include multiple steps. Each step must be scaled to handle massive datasets and connected to the other steps to enable end-to-end analysis. In practice, the steps that make up these pipelines are often developed by different researchers and research groups, which makes explicit attention to data structure and conventions even more critical. The objective of this panel discussion is to bring together groups at Cornell who are working on different aspects of the bioacoustic analysis pipeline and to discuss how best to align and integrate their efforts. We will discuss multiple data problems related to bioacoustics, including interfacing human intelligence and artificial intelligence, managing overlapping sounds, and creating active learning data loops. Panelists will come from the Center for Conservation Bioacoustics and the Macaulay Library of Natural Sounds, both based at the Cornell Lab of Ornithology.