- Computerized image analysis aids wildlife conservationists
- Virtual supercomputers from Microsoft streamline image processing
- Combination of algorithms, machine learning, and HPC aims to preserve giraffe populations
Giraffe numbers have declined precipitously across Africa due to habitat loss and illegal killing for meat. To counter this trend, scientists are enlisting supercomputers.
Using advanced computational methods to study the births, deaths, and movements of giraffes, Project GIRAFFE aims to reverse declining populations.
Focusing their research across the 4,000-square-kilometer Tarangire ecosystem, scientists comb through thousands of digital photographs. These photos show each giraffe’s unique and unchanging spots, patterns that identify individuals throughout their lives.
Analyzing these photos is an exacting and time-consuming process: each one must be manually cropped so that only the giraffe’s torso is passed to the pattern recognition software.
Microsoft scientists have helped automate the image processing through machine learning technology deployed on the Microsoft Azure cloud. The Microsoft team uses a computer vision object detection algorithm and has trained a program on existing annotated giraffe photos to recognize giraffe torsos.
The program improves through an Active Learning process: the system predicts cropping squares for new images and displays them to a human, who quickly verifies or corrects the results. The verified images are then fed back into the training algorithm to further improve the program.
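The Active Learning loop described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Microsoft team's implementation: `TorsoDetector`, `active_learning_round`, and the `review` callback are hypothetical names, and the "model" is a stub that averages the boxes it has seen rather than a real object detector.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels


@dataclass
class TorsoDetector:
    """Hypothetical stand-in for the object detection model."""
    training_set: List[Tuple[str, Box]] = field(default_factory=list)

    def fit(self, examples: List[Tuple[str, Box]]) -> None:
        # A real detector would update its weights here;
        # this stub only stores the labeled examples.
        self.training_set.extend(examples)

    def predict(self, image_id: str) -> Box:
        # Placeholder prediction: the average of all boxes seen so far,
        # or a default box if nothing has been learned yet.
        if not self.training_set:
            return (0, 0, 100, 100)
        n = len(self.training_set)
        sums = [sum(box[i] for _, box in self.training_set) for i in range(4)]
        return (sums[0] // n, sums[1] // n, sums[2] // n, sums[3] // n)


def active_learning_round(
    detector: TorsoDetector,
    new_images: List[str],
    review: Callable[[str, Box], Box],
) -> List[Tuple[str, Box]]:
    """Predict crops, let a human verify or correct them, then retrain."""
    verified = []
    for image_id in new_images:
        proposed = detector.predict(image_id)
        # The reviewer returns the proposed box unchanged or a corrected one.
        final = review(image_id, proposed)
        verified.append((image_id, final))
    # Feed the human-verified crops back into training.
    detector.fit(verified)
    return verified
```

In this sketch the human step is just a function, so it could be a review screen in practice or a lookup table of corrections in a test. The key point is that every verified image, whether accepted or corrected, becomes new training data for the next round.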
“The new system dramatically speeds up the important research being performed by Wild Nature Institute (WNI) scientists,” says Derek Lee, quantitative ecologist, population biologist, and principal scientist at the WNI. “It used to take us a week to process our new images after a survey; now it is done in minutes.”
To perform the millions of calculations required to match his giraffe photos, Lee received a grant from Microsoft Azure to build virtual supercomputers on its cloud computing service. Physical machines of comparable power would be very expensive to build, so the computing power was instead rented from Microsoft Azure for the few weeks necessary to run a year’s batch of giraffe matching.
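A rough calculation shows why the matching runs into the millions. The article does not describe the matching scheme, so the sketch below assumes the simplest case, in which every photo in a year's batch is compared once with every other photo. The batch size of about 9,000 is the project's own annual figure; the function name is hypothetical.

```python
from math import comb


def pairwise_comparisons(n_photos: int) -> int:
    """Number of distinct photo pairs if every photo is compared with every other."""
    return comb(n_photos, 2)


photos_per_year = 9000  # approximate annual figure reported by the project
print(pairwise_comparisons(photos_per_year))  # 40495500
```

Under that assumption, a single year's batch already requires about 40 million pairwise comparisons, before counting any comparisons against photos from earlier years.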
The 3 Vs
Big data programs are often described by the three Vs: Velocity, Volume, and Variety. Project GIRAFFE is growing in all three.
“The Volume (amount of data) and Variety (type and nature of data) are already staggering,” says Lee. “We take about 9,000 high-resolution photos every year, and these photos contain hundreds of gigabytes of data on giraffe identity, location, social relationships, age, size, reproductive status, disease, space and habitat use, and much more.”
“Velocity recently increased in a huge step function as our computer science collaborators developed amazing new tools to speed up data processing,” continues Lee. “We are always looking for more partners from the computer science, informatics, and cybernetics worlds to improve our information processing systems.”
The variety of data streams is also increasing as WNI invites participation from village game scouts, tourism operators, and ecotourists on safari. Involving stakeholders is integral to monitoring how giraffes respond to community conservation zones known as Wildlife Management Areas.
WNI partners at the PAMS Foundation are supporting intelligence-based anti-poaching activities to protect elephants and giraffes in the Tarangire ecosystem, and the institute is documenting how the wildlife populations respond to their efforts.
Armed with data-driven conservation knowledge products, WNI scientists and partners are working to protect and connect the areas that are most important to giraffe survival and reproduction.
As Lee exclaims, “We are turning big data into tall data to save giraffes!”
Derek Lee's original blog entries cover Project GIRAFFE in more detail.