A State of the Art Report on Kinect Sensor Setups in Computer Vision

Berger, Kai; Meister, Stephan; Nair, Rahul; Kondermann, Daniel

doi:10.1007/978-3-642-44964-2_12

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 8200)

Abstract

In the three years since the Microsoft Kinect® launched in the end-consumer market, we have witnessed a small revolution in computer vision research towards the use of a standardized consumer-grade RGBD sensor for scene content retrieval. Besides classical localization and motion capture tasks, the Kinect has been successfully employed for the reconstruction of opaque and transparent objects. This report gives a comprehensive overview of the main publications that use the Microsoft Kinect outside its original context as a decision-forest-based motion capture tool.

Author information

Authors and Affiliations

OeRC Oxford, University of Oxford, UK
Kai Berger
Heidelberg Collaboratory for Image Processing, University of Heidelberg, Germany
Stephan Meister, Rahul Nair & Daniel Kondermann

Editor information

Editors and Affiliations

Pattern Recognition Group, University of Siegen, Siegen, Germany
Marcin Grzegorzek
Max-Planck-Institute, Graphics, Vision & Video Group, Saarbrücken, Germany
Christian Theobalt
Multimedia Information Processing Group, University of Kiel, Kiel, Germany
Reinhard Koch
Computer Graphics and Multimedia Systems Group, University of Siegen, Siegen, Germany
Andreas Kolb

About this chapter

Cite this chapter

Berger, K., Meister, S., Nair, R., Kondermann, D. (2013). A State of the Art Report on Kinect Sensor Setups in Computer Vision. In: Grzegorzek, M., Theobalt, C., Koch, R., Kolb, A. (eds) Time-of-Flight and Depth Imaging. Sensors, Algorithms, and Applications. Lecture Notes in Computer Science, vol 8200. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-44964-2_12

DOI: https://doi.org/10.1007/978-3-642-44964-2_12
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-44963-5
Online ISBN: 978-3-642-44964-2
eBook Packages: Computer Science, Computer Science (R0)
