Birds of a Feather

Capturing Avian Shape Models from Images

Yufu Wang, Nikos Kolotouros, Kostas Daniilidis, Marc Badger
We capture species-specific shape models from image collections. The new shape models not only articulate but can also deform according to species-specific shape deformation modes. We combine models from diverse species to learn a multi-species model.

Abstract

Animals are diverse in shape, but building a deformable shape model for a new species is not always possible due to the lack of 3D data. We present a method to capture new species using an articulated template and images of that species. In this work, we focus mainly on birds. Although there are almost twice as many bird species as mammal species, no accurate avian shape model is available. To capture a novel species, we first fit the articulated template to each training sample. By disentangling pose and shape, we learn a shape space that captures variation both among species and within each species from image evidence. We learn models of multiple species from the CUB dataset, and contribute new species-specific and multi-species shape models that are useful for downstream reconstruction tasks. Using a low-dimensional embedding, we show that our learned 3D shape space better reflects the phylogenetic relationships among birds than learned perceptual features.


Overview

Starting from a collection of images for a given species (e.g. from the CUB Dataset), we first align a generic articulated model to the keypoints and silhouette for each image (A). We then deform the model to capture the mean species shape by fitting a single shape to all samples (B). Finally, we express identity-specific offsets from the mean shape as linear combinations of a learned shape basis (C).
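
Steps (B) and (C) can be illustrated with a minimal sketch. It assumes the per-image fits from step (A) have already been converted to template vertices in a shared rest pose, so that pose no longer contributes to the variation. It also swaps in plain PCA on those vertices for the image-based fitting described above, which optimizes against keypoints and silhouettes. The function names `learn_shape_basis` and `reconstruct` are hypothetical and not taken from the project code.

```python
import numpy as np

def learn_shape_basis(vertices, n_components):
    """Estimate a mean shape and a linear shape basis.

    vertices: (N, V, 3) array, one rest-pose mesh per training sample
              (assumed input; obtaining it is not shown here).
    Returns the flattened mean shape (3V,), the basis (K, 3V),
    and the per-sample coefficients (N, K).
    """
    n_samples = vertices.shape[0]
    flat = vertices.reshape(n_samples, -1)
    mean_shape = flat.mean(axis=0)       # stand-in for the species mean of step (B)
    offsets = flat - mean_shape          # identity-specific offsets of step (C)
    # The right singular vectors of the centered data are the PCA directions.
    _, _, vt = np.linalg.svd(offsets, full_matrices=False)
    basis = vt[:n_components]
    coeffs = offsets @ basis.T           # project each offset onto the basis
    return mean_shape, basis, coeffs

def reconstruct(mean_shape, basis, coeffs):
    """Rebuild (N, V, 3) meshes as mean shape plus a linear combination of the basis."""
    flat = mean_shape + coeffs @ basis
    return flat.reshape(coeffs.shape[0], -1, 3)
```

Keeping only a few components gives each individual a short coefficient vector, which is the form of shape code that a low-dimensional embedding can later consume.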


Results

We create AVES, a new multi-species avian model and PCA shape space. By applying our method to 17 representative species, we learn a shape space that captures characteristics such as the length of the bill, tail, and legs, the torso aspect ratio, and even dynamic shape changes not explicitly captured by the model's articulation such as elevation of a head crest in some species.


Recovered shapes are correlated with the phylogeny. We visualize a UMAP embedding of species shape coefficients along with their phylogeny (thin black lines), revealing several examples of convergent evolution for long tails, waterbird body shape, and head crests (A). Shape variation across species has high phylogenetic signal and is correlated with the phylogeny. On the other hand, visual features extracted using a ResNet50 embedding network trained on CUB are not correlated with the phylogeny, despite clustering well (B). Avian shape captured by our shape space is therefore a more reliable phylogenetic trait than learned perceptual features.
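
One simple way to check whether a set of species shape coefficients tracks a phylogeny is a Mantel-style permutation test between pairwise shape distances and pairwise phylogenetic distances. The sketch below is illustrative only: the page does not state which statistic was used, and the phylogenetic distance matrix is an assumed input that the user must supply (for example, patristic distances from a tree). The function names are hypothetical.

```python
import numpy as np

def pairwise_distances(x):
    """Euclidean distance matrix for the rows of x, shape (n, d) -> (n, n)."""
    diff = x[:, None, :] - x[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def mantel_test(dist_a, dist_b, n_permutations=999, seed=0):
    """Correlate two symmetric distance matrices and estimate a one-sided p-value.

    Species labels of dist_a are shuffled (rows and columns together) to build
    the null distribution of the correlation.
    """
    n = dist_a.shape[0]
    upper = np.triu_indices(n, k=1)      # each species pair counted once
    observed = np.corrcoef(dist_a[upper], dist_b[upper])[0, 1]
    rng = np.random.default_rng(seed)
    at_least_as_large = 0
    for _ in range(n_permutations):
        perm = rng.permutation(n)
        shuffled = dist_a[np.ix_(perm, perm)]
        r = np.corrcoef(shuffled[upper], dist_b[upper])[0, 1]
        if r >= observed:
            at_least_as_large += 1
    p_value = (at_least_as_large + 1) / (n_permutations + 1)
    return observed, p_value
```

With `shape_coeffs` as an (n_species, K) array and `phylo_dist` as an (n_species, n_species) matrix, a call would look like `mantel_test(pairwise_distances(shape_coeffs), phylo_dist)`. The same test could be run on perceptual feature vectors in place of `shape_coeffs` to compare the two representations.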


Species-specific Models

Laysan Albatross
California Gull
Mallard
Green Kingfisher
Horned Puffin
Boat-tailed Grackle
Cedar Waxwing
Geococcyx
Cardinal
Blue Jay
Northern Flicker
Pileated Woodpecker
Painted Bunting
White-breasted Kingfisher
American Crow
Scissor-tailed Flycatcher
Evening Grosbeak

Application to Dogs

Our method can also be applied to other animals such as dogs in the StanfordExtra Dogs Dataset.

Ibizan Hound

Acknowledgements

We gratefully acknowledge support through the following grants: NSF IIS 1703319, NSF MRI 1626008, NSF TRIPODS 1934960, and NSF CPS 2038873.
