Part of the book series: Lecture Notes in Computer Science (LNIP, volume 8200)

Abstract

In the three years since the launch of the Microsoft Kinect® in the end-consumer market, we have witnessed a small revolution in computer vision research towards the use of a standardized consumer-grade RGBD sensor for scene content retrieval. Besides classical localization and motion capturing tasks, the Kinect has been successfully employed for the reconstruction of opaque and transparent objects. This report gives a comprehensive overview of the main publications that use the Microsoft Kinect outside its original context as a decision-forest-based motion-capturing tool.

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Berger, K., Meister, S., Nair, R., Kondermann, D. (2013). A State of the Art Report on Kinect Sensor Setups in Computer Vision. In: Grzegorzek, M., Theobalt, C., Koch, R., Kolb, A. (eds) Time-of-Flight and Depth Imaging. Sensors, Algorithms, and Applications. Lecture Notes in Computer Science, vol 8200. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-44964-2_12

  • DOI: https://doi.org/10.1007/978-3-642-44964-2_12

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-44963-5

  • Online ISBN: 978-3-642-44964-2

  • eBook Packages: Computer Science, Computer Science (R0)
