Algorithms for video-based human motion capture typically assume the body shape is known a priori and is represented coarsely (e.g. using cylinders to model body parts). These body models stand in sharp contrast to the richly detailed 3D body models used by animators. In this talk we describe a method for recovering detailed body shape models directly from images. Specifically, we represent the body using a recently proposed triangulated mesh model called SCAPE, which employs a low-dimensional, but realistic, parametric model of human shape and pose-dependent deformations. Shape variations of the human body were learned from a database of 2400 range scans of people. We show how the parameters of the SCAPE model can be estimated directly from image data. In particular, the talk will look at the image cues available from strong lighting, including cast shadows and shading. We found that, rather than causing problems, strong lighting improves 3D human pose and shape estimation, making it practical even in monocular images.
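To illustrate the idea of a low-dimensional shape space learned from registered scans, here is a minimal PCA-style sketch in NumPy. This is an assumption-laden toy (SCAPE's actual model factors pose-dependent and identity-dependent triangle deformations, not raw vertices); the function names and dimensions are hypothetical.

```python
import numpy as np

def learn_shape_space(scans, n_components=20):
    """Hypothetical sketch: learn a low-dimensional linear shape space
    from registered body scans, each flattened to a vector of 3D vertex
    coordinates. (SCAPE itself models triangle deformations, not vertices.)"""
    X = np.asarray(scans, dtype=float)         # (n_scans, 3 * n_vertices)
    mean = X.mean(axis=0)
    # SVD of the centered data yields the principal shape directions.
    _, s, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components], s[:n_components]

def reconstruct(mean, basis, coeffs):
    """Generate a body shape from low-dimensional shape coefficients."""
    return mean + coeffs @ basis

# Toy usage with random "scans" (a real model would use thousands of
# aligned meshes, like the 2400 range scans mentioned above).
rng = np.random.default_rng(0)
scans = rng.normal(size=(50, 30))              # 50 scans, 10 vertices x 3 coords
mean, basis, _ = learn_shape_space(scans, n_components=5)
shape = reconstruct(mean, basis, np.zeros(5))  # zero coefficients -> mean shape
```

Fitting such a model to images then amounts to searching over the shape coefficients (and pose parameters) so the projected mesh agrees with the observed silhouettes, shadows, and shading.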
Joint work with Alexandru Balan (Brown), Horst Haussecker (Intel) and James Davis (UCSC).