POCO: 3D Pose and Shape Estimation using Confidence
Sai Kumar Dwivedi1, Cordelia Schmid2, Hongwei Yi1, Michael J. Black1, Dimitrios Tzionas3
1Max Planck Institute for Intelligent Systems, Tübingen
2Inria, École normale supérieure, CNRS, PSL Research University, France
3University of Amsterdam, the Netherlands
3DV 2024 (Oral)
Most existing methods that regress 3D Human Pose and Shape (HPS) do not report their confidence (or uncertainty). An estimate of confidence, however, is needed by methods that “consume” the results of HPS. Even the best HPS methods struggle when the image evidence is weak or ambiguous. Our framework, POCO, extends existing HPS regressors to also estimate uncertainty in a single forward pass. The confidence (color-coded on the body) is correlated with the pose quality.
Abstract
The regression of 3D Human Pose and Shape (HPS) from an image is becoming increasingly accurate. This makes the results useful for downstream tasks like human action recognition or 3D graphics. Yet, no regressor is perfect, and accuracy can be affected by ambiguous image evidence or by poses and appearance that are unseen during training. Most current HPS regressors, however, do not report the confidence of their outputs, meaning that downstream tasks cannot differentiate accurate estimates from inaccurate ones. To address this, we develop POCO, a novel framework for training HPS regressors to estimate not only a 3D human body, but also its confidence, in a single feed-forward pass. Specifically, POCO estimates both the 3D body pose and a per-sample variance. The key idea is to introduce a Dual Conditioning Strategy (DCS) for regressing uncertainty that is highly correlated with pose reconstruction quality. The POCO framework can be applied to any HPS regressor, and here we evaluate it by modifying HMR, PARE, and CLIFF. In all cases, training the network to reason about uncertainty helps it learn to estimate 3D pose more accurately; while this was not our goal, the improvement is modest but consistent. Our main motivation is to provide uncertainty estimates for downstream tasks; we demonstrate this in two ways: (1) We use the confidence estimates to bootstrap HPS training. Given unlabeled image data, we take the confident estimates of a POCO-trained regressor as pseudo ground truth. Retraining with this automatically-curated data improves accuracy. (2) We exploit uncertainty in video pose estimation by automatically identifying uncertain frames (e.g. due to occlusion) and inpainting these from confident frames.
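The abstract states that POCO regresses both the 3D body pose and a per-sample variance in one forward pass. A common way to train such a network is a heteroscedastic Gaussian negative log-likelihood, where predicting a larger variance down-weights the squared pose error but incurs a log-variance penalty. The sketch below is a minimal, generic illustration of that trade-off, not the exact POCO loss; the function name and the use of a plain pose vector are assumptions for clarity.

```python
import math

def nll_loss(pred_pose, gt_pose, log_var):
    """Heteroscedastic Gaussian NLL (generic sketch, not POCO's exact loss).

    The network predicts a per-sample log-variance `log_var` alongside the
    pose. A larger predicted variance shrinks the squared-error term but is
    penalized by the 0.5 * log_var term, so the network is only rewarded for
    reporting high uncertainty when the pose error really is large.
    """
    sq_err = sum((p - g) ** 2 for p, g in zip(pred_pose, gt_pose))
    return 0.5 * math.exp(-log_var) * sq_err + 0.5 * log_var
```

With a fixed pose error, raising `log_var` lowers the loss only when the squared error is large, which is what makes the predicted variance correlate with reconstruction quality.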
Results
POCO extends standard HPS regressors (e.g. CLIFF, ECCV 2022) to also output pose uncertainty.
Unlike previous methods (e.g. HuManiFlow, CVPR 2023 and ProHMR, ICCV 2021), it does so in a single forward pass.
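The bootstrapping use case from the abstract (keeping only the confident estimates of a POCO-trained regressor as pseudo ground truth for retraining) amounts to a simple threshold filter over the predicted uncertainties. The sketch below illustrates the idea; the function name, the `(pose, uncertainty)` pair representation, and the threshold value are illustrative assumptions, not details from the paper.

```python
def select_pseudo_labels(estimates, max_uncertainty=0.3):
    """Keep only samples whose predicted uncertainty is below a threshold.

    `estimates` is a list of (pose, uncertainty) pairs; the default
    threshold 0.3 is an illustrative placeholder, not a value from the
    paper. The surviving poses can serve as pseudo ground truth when
    retraining on unlabeled images.
    """
    return [(pose, u) for pose, u in estimates if u < max_uncertainty]
```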
Video
Code & Downloads
- Code: https://github.com/saidwivedi/POCO
- Downloads: Click here (or use "Downloads" in the top-right menu).
Note: The Downloads URL works only after sign-in; you need to register first and agree to our license.
Acknowledgements
Citation
@conference{dwivedi_3dv2023_poco,
  title = {{POCO}: {3D} Pose and Shape Estimation using Confidence},
  author = {Dwivedi, Sai Kumar and Schmid, Cordelia and Yi, Hongwei and Black, Michael J. and Tzionas, Dimitrios},
  booktitle = {International Conference on 3D Vision (3DV)},
  month = mar,
  year = {2024},
}
Contact
For questions, please contact poco@tue.mpg.de.
For commercial licensing, please contact ps-licensing@tue.mpg.de.