OmniPhotos: Casual 360° VR Photography

Tobias Bertel, Mingze Yuan, Reuben Lindroos, Christian Richardt

ACM Transactions on Graphics (TOG, SIGGRAPH Asia 2020), 2020

paper | project | abstract | bibtex

Until now, immersive 360° VR panoramas could not be captured casually and reliably at the same time, as state-of-the-art approaches involve time-consuming or expensive capture processes that prevent the casual capture of real-world VR environments. Existing approaches are also often limited in their supported range of head motion. We introduce OmniPhotos, a novel approach for casually and reliably capturing high-quality 360° VR panoramas. Our approach only requires a single sweep of a consumer 360° video camera as input, which takes less than 3 seconds with a rotating selfie stick. The captured video is transformed into a hybrid scene representation consisting of a coarse scene-specific proxy geometry and optical flow between consecutive video frames, enabling 5-DoF real-world VR experiences. The large capture radius and 360° field of view significantly expand the range of head motion compared to previous approaches. Among all competing methods, ours is the simplest, and fastest by an order of magnitude. We have captured more than 50 OmniPhotos and show video results for a large variety of scenes. We will make our code and datasets publicly available.
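
The flow-based blending between consecutive captured frames can be illustrated with a toy sketch. This is not the paper's implementation: the function names, the integer-offset flow, and the forward-splatting warp are simplifying assumptions made here for illustration only.

```python
import numpy as np

def warp_with_flow(image, flow, t):
    """Push each source pixel forward by a fraction t of its per-pixel flow.
    flow[y, x] = (dy, dx) as integer offsets; collisions keep the last write."""
    h, w = image.shape[:2]
    out = np.zeros_like(image)
    for y in range(h):
        for x in range(w):
            dy, dx = flow[y, x]
            ty = int(round(y + t * dy))
            tx = int(round(x + t * dx))
            if 0 <= ty < h and 0 <= tx < w:
                out[ty, tx] = image[y, x]
    return out

def blend_views(frame_a, frame_b, flow_ab, t):
    """Synthesize an in-between view from two neighbouring frames:
    warp each frame towards the intermediate position, then mix linearly.
    Using -flow_ab as the backward flow is a crude approximation."""
    warped_a = warp_with_flow(frame_a, flow_ab, t)
    warped_b = warp_with_flow(frame_b, -flow_ab, 1.0 - t)
    return (1.0 - t) * warped_a + t * warped_b
```

For identical frames and zero flow, the blend reproduces the input exactly; non-zero flow moves pixels proportionally to `t`.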

@article{Bertel2020OmniPhotos,
  author    = {Tobias Bertel and Mingze Yuan and Reuben Lindroos and Christian Richardt},
  title     = {OmniPhotos: Casual 360° {VR} Photography},
  journal   = {ACM Transactions on Graphics},
  year      = {2020},
  volume    = {39},
  number    = {6},
  pages     = {266:1--12},
  month     = dec,
  issn      = {0730-0301},
  doi       = {10.1145/3414685.3417770},
  url       = {},
}

Temporal Upsampling of Depth Maps Using a Hybrid Camera

Mingze Yuan, Lin Gao, Hongbo Fu, Shihong Xia

IEEE Transactions on Visualization and Computer Graphics (TVCG), 2019

paper | YouTube | abstract | bibtex

In recent years, consumer-level depth cameras have been adopted for various applications. However, they often produce depth maps at only a moderately high frame rate (approximately 30 frames per second), preventing them from being used for applications such as digitizing human performance involving fast motion. On the other hand, low-cost, high-frame-rate video cameras are available. This motivates us to develop a hybrid camera that consists of a high-frame-rate video camera and a low-frame-rate depth camera and to allow temporal interpolation of depth maps with the help of auxiliary color images. To achieve this, we develop a novel algorithm that reconstructs intermediate depth maps and estimates scene flow simultaneously. We test our algorithm on various examples involving fast, non-rigid motions of single or multiple objects. Our experiments show that our scene flow estimation method is more precise than tracking-based methods and other state-of-the-art techniques.
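
The core idea of reconstructing an intermediate depth map can be sketched as follows. This is a toy illustration, not the paper's algorithm: the function name, integer pixel motion, and the z-buffer resolution of collisions are assumptions made here.

```python
import numpy as np

def interpolate_depth(depth0, scene_flow, t):
    """Predict a depth map at fraction t in (0, 1) between two depth frames
    by moving each pixel of depth0 along its estimated 3-D scene flow.
    scene_flow[y, x] = (dx, dy, dz): image-plane motion in pixels, dz in
    depth units. When two pixels land on the same target, keep the nearer."""
    h, w = depth0.shape
    depth_t = np.full((h, w), np.inf)        # empty z-buffer
    for y in range(h):
        for x in range(w):
            dx, dy, dz = scene_flow[y, x]
            tx = int(round(x + t * dx))
            ty = int(round(y + t * dy))
            if 0 <= tx < w and 0 <= ty < h:
                z = depth0[y, x] + t * dz
                depth_t[ty, tx] = min(depth_t[ty, tx], z)
    return depth_t
```

For a flat scene receding at constant speed, the interpolated depth at t = 0.5 lies halfway between the two endpoint depths.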

@article{Yuan2019TemporalUpsampling,
  title={Temporal Upsampling of Depth Maps Using a Hybrid Camera},
  author={Yuan, Mingze and Gao, Lin and Fu, Hongbo and Xia, Shihong},
  journal={IEEE Transactions on Visualization and Computer Graphics},
  year={2019},
}

SF-Net: Learning Scene Flow from RGB-D Images with CNNs

Yi-Ling Qiao, Lin Gao, Yukun Lai, Fang-Lue Zhang, Mingze Yuan, Shihong Xia

29th British Machine Vision Conference (BMVC), 2018

paper | abstract | bibtex

With the rapid development of depth sensors, RGB-D data has become much more accessible. Scene flow is one of the fundamental ways to understand the dynamic content in RGB-D image sequences. Traditional approaches estimate scene flow using registration and smoothness or local rigidity priors, which is slow and prone to errors when the priors are not fully satisfied. To address such challenges, learning based methods provide an attractive alternative. However, trivially applying CNN-based optical flow estimation methods does not produce satisfactory results. How to use deep learning to improve the estimation of scene flow from RGB-D images remains unexplored. In this work, we propose a novel learning based framework to estimate scene flow, which incorporates both brightness and scene flow losses. Given a pair of RGB-D images, the brightness loss is used to measure the disparity between the first RGB-D image and the deformed second RGB-D image using the scene flow, and the scene flow loss is used to learn from the ground truth of scene flow. We build a convolutional neural network to simultaneously optimize both losses. Extensive experiments on both synthetic and real-world datasets show that our method is significantly faster than existing methods and outperforms state-of-the-art real-time methods in accuracy.
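
The brightness (photometric) loss described above can be sketched in a minimal form: warp the second image backwards by the predicted flow and compare it with the first. The function name and the integer-flow simplification are assumptions made here for illustration; the paper's loss operates on full RGB-D tensors inside a CNN.

```python
import numpy as np

def brightness_loss(img1, img2, flow):
    """Mean squared brightness-constancy error: for each pixel of img1,
    look up where the predicted flow says it came from in img2.
    flow[y, x] = (dy, dx) as integer offsets; out-of-bounds pixels skipped."""
    h, w = img1.shape
    total, count = 0.0, 0
    for y in range(h):
        for x in range(w):
            dy, dx = flow[y, x]
            sy, sx = y + dy, x + dx      # source location in img2
            if 0 <= sy < h and 0 <= sx < w:
                total += (img1[y, x] - img2[sy, sx]) ** 2
                count += 1
    return total / max(count, 1)
```

With the correct flow the loss vanishes; a wrong (e.g. zero) flow on a shifted image yields a positive penalty, which is the gradient signal the network learns from.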

@inproceedings{Qiao2018SFNet,
  title={SF-Net: Learning scene flow from RGB-D images with CNNs},
  author={Qiao, Yi-Ling and Gao, Lin and Lai, Yukun and
    Zhang, Fang-Lue and Yuan, Mingze and Xia, Shihong},
  booktitle={Proceedings of the 29th British Machine Vision Conference (BMVC)},
  year={2018},
}

A Survey on Human Performance Capture and Animation

Shihong Xia, Lin Gao, Yukun Lai, Mingze Yuan, Jinxiang Chai

Journal of Computer Science and Technology (JCST), 2017

paper | abstract | bibtex

With the rapid development of computing technology, three-dimensional (3D) human body models and their dynamic motions are widely used in the digital entertainment industry. Human performance mainly involves human body shapes and motions. Key research problems in human performance animation include how to capture and analyze static geometric appearance and dynamic movement of human bodies, and how to simulate human body motions with physical effects. In this survey, according to the main research directions of human body performance capture and animation, we summarize recent advances in key research topics, namely human body surface reconstruction, motion capture and synthesis, as well as physics-based motion simulation, and further discuss future research problems and directions. We hope this survey helps readers gain a comprehensive understanding of human performance capture and animation.

@article{Xia2017Survey,
  title={A survey on human performance capture and animation},
  author={Xia, Shihong and Gao, Lin and Lai, Yu-Kun and Yuan, Mingze and Chai, Jinxiang},
  journal={Journal of Computer Science and Technology},
  year={2017},
}