AIhub.org
 

Long-term monitoring of bird flocks in the wild – interview with Kshitiz


08 February 2024




In work presented at the 32nd International Joint Conference on Artificial Intelligence (IJCAI 2023), Kshitiz, Sonu Shreshtha, Ramy Mounir, Mayank Vatsa, Richa Singh, Saket Anand, Sudeep Sarkar and Sevaram Mali Parihar investigate using computer vision techniques to monitor large flocks of birds. In this interview, Kshitiz tells us more about this research.

What is the topic of the research in your paper?

In our work, Long-term Monitoring of Bird Flocks in the Wild, published at IJCAI 2023, we develop and apply computer vision techniques and datasets tailored to non-invasive monitoring and analysis of migratory bird flocks in their natural habitats. The aim is to understand the behavior and ecology of migratory birds through automated video analysis with minimal human intervention, thereby supporting conservation initiatives.

The core technical challenges in wildlife monitoring arise from the uncontrolled, outdoor nature of the imagery (both images and videos), which captures large flocks of migratory birds over several months. The footage reflects true monitoring conditions, with inherent variability in illumination, weather, complex motion, and the birds’ poses on the ground and in flight. To address these challenges, we set out to collect extensive high-resolution video data at field sites, primarily Khichan, which hosts thousands of Demoiselle Cranes over the winter months. Additional sample data featuring diverse bird species have been collected from the bird sanctuary at Keoladeo National Park, a UNESCO World Heritage site.

The research aims to curate bird samples labeled under human supervision to aid researchers in this area. To analyze the annotated imagery, the paper benchmarks, and seeks to improve, computer vision techniques such as crowd counting, segmentation, detection, and tracking. Our preliminary work produced a new annotated image and video dataset that poses unique challenges compared to existing ones. Experiments revealed the limitations of contemporary methods, especially on densely populated flocks, which highlights the need for specialized techniques suited to real-world wildlife monitoring. We plan to expand the video dataset annotations and develop new algorithms inspired by self-supervised learning, active learning, and cognitive science, with the aim of better understanding bird behavior and interactions over time.

Could you tell us about the implications of your research and why it is an interesting area for study?

The research aims to develop techniques that automatically analyze bird behaviors from imagery collected in natural environments. This capability can enable large-scale ethograming of wildlife without human intervention, providing unbiased insights into intricate ecological behaviors. However, creating comprehensive ethograms requires tracking the birds’ movements without relying on manual labeling, which is challenging given their complex motion, postures, occlusions, cluttered backgrounds, and lighting variability. Further, acquiring labeled datasets is time-intensive and demands considerable human effort, making the process inefficient and often impractical for large-scale applications. Understanding specific behaviors through automated monitoring helps with:

  • Exposing evolutionary adaptations: Observing animals in their natural habitats can reveal behaviors that evolved in response to environmental pressures.
  • Assessing environmental impacts: Automated monitoring can determine the effects of environmental changes on animal behaviors.
  • Investigating year-round life cycles: Understanding behaviors across different geographies and seasons is crucial for conservation, especially for migratory animals.

This research is crucial given the alarming worldwide decline in bird populations, driven by threats like habitat loss, climate change, and urbanization. However, building such monitoring systems requires advanced vision techniques that can extract nuanced contextual information from complex natural environments.

Could you explain your methodology?

Our methodology combines data collection, advanced computer vision techniques, and collaboration with local experts to monitor and analyze the behavior of migratory birds in their natural habitats. The study aims to address the existing challenges of non-invasive wildlife monitoring and provide critical insights for developing informed conservation and mitigation policies. A substantial part of the methodology involved collecting extensive data representing true monitoring conditions across several months. The objective was to compile footage (images and videos) that could subsequently be analyzed to derive meaningful conclusions.

A key consideration in developing the algorithms is that occlusion poses a significant challenge that must be mitigated to obtain better results. We therefore curated a unique high-resolution image and video dataset of the migratory cranes that travel to western India every winter, with images of up to 4K resolution showcasing flock density under diverse real-world conditions such as variable lighting and perspectives. The research also introduces an end-to-end pipeline that accepts images as input and analyzes them through several tasks, including crowd counting/density estimation and segmentation, to obtain cues about the collective behavior of avian flocks. Additionally, to overcome manual annotation challenges, we aim to leverage active and self-supervised learning techniques for accurate flock estimation.

What were your main findings?

One major outcome of our research was the curation of a novel bird monitoring dataset comprising high-resolution images and videos with point annotations. The dataset contains highly dense flocks of birds in their natural habitat. We found that existing tools, such as the MegaDetector toolkit, which was trained on the largest diverse public and private wildlife datasets, struggled to detect all birds in the images from the new dataset. This underscores the need to develop specialized computer vision techniques tailored for wildlife datasets, especially those with high-density subjects like flocks of birds. In experiments on several vision tasks using the proposed dataset, we observed that pre-training the models improved performance in specific tasks such as bird counting, as the models could learn relevant features that were then refined during fine-tuning. The research also highlighted the shortcomings of several state-of-the-art algorithms when applied to the proposed dataset. In the case of segmentation, results from the recent Segment Anything Model also showed limitations on our data, reflecting the dataset’s challenging nature and the inherent difficulties of wildlife datasets.
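To make the link between point annotations and bird counting concrete, here is a minimal sketch of a common approach in crowd counting generally: each annotated point is spread into a small Gaussian blob, and the blobs are summed into a density map whose total equals the number of annotated birds. This is an illustration only, not the pipeline from the paper. The function name `density_map_from_points`, the `(x, y)` point format, the image size, and the value of `sigma` are all assumptions made for the example.

```python
import numpy as np


def density_map_from_points(points, height, width, sigma=4.0):
    """Turn point annotations into a density map that sums to the bird count.

    points: iterable of (x, y) pixel coordinates, one per bird
            (a hypothetical annotation format chosen for this sketch).
    sigma:  Gaussian spread in pixels (illustrative value, not from the paper).
    """
    # Pixel coordinate grids: ys holds row indices, xs holds column indices.
    ys, xs = np.mgrid[0:height, 0:width]
    density = np.zeros((height, width), dtype=np.float64)
    for x, y in points:
        blob = np.exp(-((xs - x) ** 2 + (ys - y) ** 2) / (2.0 * sigma ** 2))
        total = blob.sum()
        if total > 0:
            # Normalize so each bird contributes exactly 1 to the map's sum.
            density += blob / total
    return density


# Three hypothetical annotations; two of them are close together.
points = [(20, 30), (22, 31), (60, 10)]
density = density_map_from_points(points, height=64, width=96)
print(round(density.sum(), 6))  # the total matches the number of points
```

A counting model trained to regress such a map can estimate flock size by summing its predicted density, which is one reason point annotations are attractive for dense flocks where drawing a box around every bird is impractical.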

The research is part of a broader project on Video Analytics for Wildlife Conservation under the Indo-US collaboration. By collaborating with local experts, especially those familiar with the birds at the Khichan sites, the team gains insights that are invaluable for understanding these birds’ ecological and evolutionary processes. The insights derived from this research can be pivotal in predicting future impacts on bird species and in formulating informed conservation and mitigation policies.

What further work are you planning in this area?

Our research team is exploring multiple approaches to designing novel algorithms that address the unique challenges of non-invasive wildlife monitoring. We are keen to improve crowd counting and density estimation techniques, and to enhance semantic segmentation and species identification with fewer samples. To overcome the manual annotation challenges, especially with dense flocks, we are also using synthetic data generation and leveraging unlabeled data through self-supervised learning. Overall, the future work aims to push current boundaries, enabling a better understanding of bird behavior and ecology.

About Kshitiz

Kshitiz is a final-year Computer Science undergraduate student at IIT Jodhpur. His areas of interest include machine learning, deep learning, and computer vision.

Read the research in full

Long-term Monitoring of Bird Flocks in the Wild, Kshitiz, Sonu Shreshtha, Ramy Mounir, Mayank Vatsa, Richa Singh, Saket Anand, Sudeep Sarkar, Sevaram Mali Parihar.





Lucy Smith is Senior Managing Editor for AIhub.

©2026.02 - Association for the Understanding of Artificial Intelligence