Human Body Measurement Estimation with Adversarial Augmentation

Nataniel Ruiz Miriam Bellver Timo Bolkart Ambuj Arora Ming C. Lin Javier Romero Raja Bala


Paper  |  Poster  |  Video  |  Dataset


We present a Body Measurement network (BMnet) for estimating 3D anthropomorphic measurements of the human body shape from silhouette images. Training of BMnet is performed on data from real human subjects, and augmented with a novel adversarial body simulator (ABS) that finds and synthesizes challenging body shapes. ABS is based on the skinned multiperson linear (SMPL) body model, and aims to maximize BMnet measurement prediction error with respect to latent SMPL shape parameters. ABS is fully differentiable with respect to these parameters, and trained end-to-end via backpropagation with BMnet in the loop. Experiments show that ABS effectively discovers adversarial examples, such as bodies with extreme body mass indices (BMI), consistent with the rarity of extreme-BMI bodies in BMnet's training set. Thus ABS is able to reveal gaps in training data and potential failures in predicting under-represented body shapes. Results show that training BMnet with ABS improves measurement prediction accuracy on real bodies by up to 10%, when compared to no augmentation or random body shape sampling. Furthermore, our method significantly outperforms SOTA measurement estimation methods by as much as 3x. Finally, we release BodyM, the first challenging, large-scale dataset of photo silhouettes and body measurements of real human subjects, to further promote research in this area.

BodyM Dataset

We present the first large public body measurement dataset including more than 8,000 frontal and lateral silhouettes from more than 2,500 real subjects, paired with height, weight and 14 body measurements. The figure above shows samples of silhouettes for a pair of subjects, including measurement bar charts with all 14 measurements and their respective height and weight. BodyM has been captured in a well-lit, indoor setup. Individuals are photographed in tight-fitting clothing and standing in A-Pose. 3D scans of each subject are acquired with a Treedy photogrammetric scanner, and registered to the SMPL mesh topology. The 14 body measurements are computed from these meshes. We break the dataset into 3 groups: Training, Test Set A, Test Set B. For the training and Test-A sets, subjects are photographed and 3D-scanned by in a lab by technicians. For the Test-B set, subjects are scanned in the lab, but photographed in a less-controlled environment with diverse camera orientations and lighting conditions, to simulate in-the-wild image capture. Some subjects have been photographed more than once with different clothing in order to test robustness of the dataset. The statistics of the 3 data subsets are shown in the table below, and breakdown of gender and body shape in terms of body-mass-index (BMI) is reported in Table 2 of the paper.

License: Creative Commons Attribution-Non Commercial 4.0 International Public License:

Simulated Adversarial Training

Our differentiable simulator is comprised of the SMPL model M that takes in shape β and pose θ. We generate height ϵ and weight ω using a trained regressor h, generate measurements using a deterministic measuring function g operating on vertices, and we render the silhouettes using a renderer R. In order to simulate adversarial bodies, we optimize the shape β by maximizing a loss between the ground-truth measurements and the measurements estimated by our model f. The resulting shape parameters give rise to an adversarial body, which are then used to train the model f.

Diagnosing Model Weaknesses

We show that adversarially simulated bodies are concentrated in certain regions of body shape space. Here we can see that adversarial bodies are concentrated in negative regions of the 1st and 2nd shape component and have higher loss than random bodies. We also observe that the negative directions of these components correspond to taller and wider bodies.

State-of-the-Art Results

SotA compared to off-the-shelf body reconstruction methods
We compare our method to SotA off-the-shelf body shape estimation methods such as SPIN, STRAPS and Sengupta et al. (ICCV 2021) and we find that our method outperforms them on body measurement estimation. Our method uses only one silhouette and no height and weight data for fairness.

SotA compared to other body measurement estimation methods
We compare against the SotA methods for body measurement estimation and find that we outperform them overall, and over a variety of specific measurements on two different datasets.

Simulated adversarial training improves body measurement estimation
We perform an ablation study over several setups (respectively: single-view, multi-view, and multi-view w/ 10x less real training data). We find that simulated adversarial body augmentation outperforms no augmentation and augmentation using randomly sampled simulated bodies.