Birds of a Feather
Capturing Avian Shape Models from Images
Abstract
Animals are diverse in shape, but building a deformable shape model for a new species is not always possible due to the lack of 3D data. We present a method to capture new species using an articulated template and images of that species. In this work, we focus mainly on birds. Although there are almost twice as many species of birds as of mammals, no accurate avian shape model is available. To capture a novel species, we first fit the articulated template to each training sample. By disentangling pose and shape, we learn a shape space that captures variation both among species and within each species from image evidence. We learn models of multiple species from the CUB dataset, and contribute new species-specific and multi-species shape models that are useful for downstream reconstruction tasks. Using a low-dimensional embedding, we show that our learned 3D shape space better reflects the phylogenetic relationships among birds than learned perceptual features.
Overview
Starting from a collection of images of a given species (e.g., from the CUB dataset), we first align a generic articulated model to the keypoints and silhouette in each image (A). We then deform the model to capture the mean species shape by fitting a single shape to all samples (B). Finally, we express identity-specific offsets from the mean shape as linear combinations of a learned shape basis (C).
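As a rough illustration of the keypoint part of step (A), the sketch below scores how well a posed model matches annotated keypoints. It is not the fitting objective used in this work: the scaled-orthographic camera, the function name `keypoint_reprojection_error`, and all parameter names are assumptions made for the example, and the silhouette term is omitted entirely.

```python
import numpy as np

def keypoint_reprojection_error(joints_3d, keypoints_2d, visible, scale, translation):
    """Mean squared 2D distance between projected joints and annotated keypoints.

    joints_3d:    (K, 3) joint positions of the posed model (hypothetical input)
    keypoints_2d: (K, 2) annotated image keypoints
    visible:      (K,) boolean mask, False where a keypoint is not annotated
    scale, translation: scaled-orthographic camera (an assumption of this sketch)
    """
    # Drop the depth coordinate, then scale and shift into image space.
    projected = scale * joints_3d[:, :2] + translation
    sq_dist = np.sum((projected - keypoints_2d) ** 2, axis=1)
    n_visible = max(int(np.sum(visible)), 1)  # guard against an all-hidden sample
    # Only annotated keypoints contribute to the error.
    return float(np.sum(sq_dist * visible) / n_visible)
```

In a fitting loop, an optimizer would adjust the model's pose and the camera to reduce this value, together with a term that compares the rendered silhouette to the image mask.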
Results
We create AVES, a new multi-species avian model and PCA shape space. By applying our method to 17 representative species, we learn a shape space that captures characteristics such as the length of the bill, tail, and legs, the torso aspect ratio, and even dynamic shape changes that are not explicitly captured by the model's articulation, such as the elevation of a head crest in some species.
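To make the idea of a PCA shape space concrete, here is a minimal sketch of learning a linear shape basis from a stack of meshes and expressing a shape as mean plus a linear combination of basis vectors. It assumes the meshes share one topology and have already been brought into a common pose; the function names and array layout are hypothetical and do not describe the AVES code.

```python
import numpy as np

def learn_shape_basis(shapes, n_components):
    """PCA over vertex positions.

    shapes: (N, V, 3) array of N meshes with V vertices each (assumed pose-normalized).
    Returns the mean shape (V, 3), a basis (n_components, V, 3), and the
    standard deviation of the data along each basis direction.
    """
    n, v, _ = shapes.shape
    flat = shapes.reshape(n, v * 3)
    mean = flat.mean(axis=0)
    offsets = flat - mean  # per-sample offsets from the mean shape
    # Rows of vt are orthonormal principal directions, sorted by variance.
    _, singular_values, vt = np.linalg.svd(offsets, full_matrices=False)
    basis = vt[:n_components]
    stddev = singular_values[:n_components] / np.sqrt(max(n - 1, 1))
    return mean.reshape(v, 3), basis.reshape(n_components, v, 3), stddev

def project_to_coefficients(shape, mean, basis):
    """Shape coefficients of one mesh: dot products of its offset with each basis vector."""
    k = basis.shape[0]
    return basis.reshape(k, -1) @ (shape - mean).reshape(-1)

def shape_from_coefficients(mean, basis, coeffs):
    """Mean shape plus a linear combination of the basis vectors."""
    return mean + np.tensordot(coeffs, basis, axes=1)
```

Varying a single coefficient by a few multiples of its `stddev` while holding the others at zero is a common way to inspect what each direction of such a space controls.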
Recovered shapes are correlated with the phylogeny. We visualize a UMAP embedding of species shape coefficients along with their phylogeny (thin black lines), revealing several examples of convergent evolution for long tails, waterbird body shape, and head crests (A). Shape variation across species has high phylogenetic signal. In contrast, visual features extracted using a ResNet50 embedding network trained on CUB are not correlated with the phylogeny, despite clustering well (B). Avian shape as captured by our shape space is therefore a more reliable phylogenetic trait than learned perceptual features.
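This page does not say which measure of phylogenetic signal was used. As a simple stand-in, the sketch below runs a Mantel-style permutation test: it correlates pairwise distances between species in feature space (shape coefficients or network features) with pairwise distances on the tree. The phylogenetic distance matrix is assumed to be given (for example, path lengths between species on the tree), and all function names are hypothetical.

```python
import numpy as np

def pairwise_distances(points):
    """Euclidean distance matrix for an (N, D) array of per-species feature vectors."""
    diff = points[:, None, :] - points[None, :, :]
    return np.sqrt(np.sum(diff ** 2, axis=-1))

def distance_matrix_correlation(dist_a, dist_b):
    """Pearson correlation between the upper triangles of two distance matrices."""
    iu = np.triu_indices_from(dist_a, k=1)
    return float(np.corrcoef(dist_a[iu], dist_b[iu])[0, 1])

def mantel_test(dist_a, dist_b, n_permutations=999, seed=0):
    """Observed correlation and a one-sided permutation p-value.

    Species labels of dist_a are shuffled to build the null distribution.
    """
    rng = np.random.default_rng(seed)
    observed = distance_matrix_correlation(dist_a, dist_b)
    n = dist_a.shape[0]
    count = 0
    for _ in range(n_permutations):
        perm = rng.permutation(n)
        permuted = dist_a[np.ix_(perm, perm)]  # relabel rows and columns together
        if distance_matrix_correlation(permuted, dist_b) >= observed:
            count += 1
    p_value = (count + 1) / (n_permutations + 1)
    return observed, p_value
```

Running the same test once with shape-coefficient distances and once with perceptual-feature distances against the same phylogenetic distance matrix would give a like-for-like comparison in the spirit of panels (A) and (B).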
Species-specific Models
Laysan Albatross
California Gull
Mallard
Green Kingfisher
Horned Puffin
Boat-tailed Grackle
Cedar Waxwing
Geococcyx
Cardinal
Blue Jay
Northern Flicker
Pileated Woodpecker
Painted Bunting
White-breasted Kingfisher
American Crow
Scissor-tailed Flycatcher
Evening Grosbeak
Application to Dogs
Our method can also be applied to other animals, such as dogs from the StanfordExtra Dogs Dataset.
Ibizan Hound
Acknowledgements
We gratefully acknowledge support through the following grants: NSF IIS 1703319, NSF MRI 1626008, NSF TRIPODS 1934960, and NSF CPS 2038873.