Learning Local Shape Descriptors from Part Correspondences with Multiview Convolutional Networks

Published: 16 November 2017

Abstract

We present a new local descriptor for 3D shapes, directly applicable to a wide range of shape analysis problems such as point correspondences, semantic segmentation, affordance prediction, and shape-to-scan matching. The descriptor is produced by a convolutional network that is trained to embed geometrically and semantically similar points close to one another in descriptor space. The network processes surface neighborhoods around points on a shape that are captured at multiple scales by a succession of progressively zoomed-out views, taken from carefully selected camera positions. We leverage two extremely large sources of data to train our network. First, since our network processes rendered views in the form of 2D images, we repurpose architectures pretrained on massive image datasets. Second, we automatically generate a synthetic dense point correspondence dataset by nonrigid alignment of corresponding shape parts in a large collection of segmented 3D models. As a result of these design choices, our network effectively encodes multiscale local context and fine-grained surface detail. Our network can be trained to produce either category-specific descriptors or, by learning from multiple shape categories, more generic descriptors. At test time, the trained network extracts local descriptors for shapes without requiring any part segmentation as input. Our method can produce effective local descriptors even for shapes whose category is unknown or differs from the categories used during training. We demonstrate through several experiments that our learned local descriptors are more discriminative than state-of-the-art alternatives and are effective in a variety of shape analysis applications.
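To make the idea of embedding similar points close together in descriptor space more concrete, here is a minimal sketch. It assumes a contrastive margin loss on pairs of descriptors and an element-wise max over per-view features; the abstract states neither choice, and the function names `contrastive_loss` and `max_view_pool` are hypothetical, not taken from the article.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two descriptors given as equal-length lists."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def contrastive_loss(desc_a, desc_b, is_match, margin=1.0):
    """Contrastive margin loss for one pair of point descriptors (assumed loss).

    Corresponding points (is_match=True) are penalized by their squared
    distance, which pulls them together. Non-corresponding points are
    penalized only while they are closer than `margin`, which pushes
    them apart.
    """
    d = euclidean(desc_a, desc_b)
    if is_match:
        return d ** 2
    return max(0.0, margin - d) ** 2


def max_view_pool(view_features):
    """Combine per-view feature vectors by an element-wise maximum (assumed).

    `view_features` holds one feature vector per rendered view of the same
    surface point; the result is a single vector of the same length.
    """
    return [max(values) for values in zip(*view_features)]
```

In this sketch, a training pair would come from the automatically generated correspondence dataset: two points matched by the nonrigid part alignment form a positive pair, and any two unrelated points form a negative pair.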

Supplementary Material

MP4 File (tog37-1-a6-huang.mp4)

      Published In

      ACM Transactions on Graphics  Volume 37, Issue 1
      February 2018
      167 pages
      ISSN:0730-0301
      EISSN:1557-7368
      DOI:10.1145/3151031

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      Published: 16 November 2017
      Accepted: 01 September 2017
      Revised: 01 August 2017
      Received: 01 May 2017
      Published in TOG Volume 37, Issue 1

      Author Tags

      1. Local 3D shape descriptors
      2. convolutional networks
      3. shape matching

      Qualifiers

      • Research-article
      • Research
      • Refereed
