DOI: 10.1145/3194658.3194671
Short paper

Learning Image-based Representations for Heart Sound Classification

Published: 23 April 2018

ABSTRACT

Machine-learning-based heart sound classification represents an efficient technology that can help reduce the burden of manual auscultation through the automatic detection of abnormal heart sounds. In this regard, we investigate the efficacy of using Convolutional Neural Networks (CNNs) pre-trained on large-scale image data for the classification of Phonocardiogram (PCG) signals by learning deep PCG representations. First, the PCG files are segmented into chunks of equal length. Then, we extract a scalogram image from each chunk using a wavelet transformation. Next, the scalogram images are fed into either a pre-trained CNN, or the same network fine-tuned on heart sound data. Deep representations are then extracted from a fully connected layer of each network, and classification is achieved by a static classifier. Alternatively, the scalogram images are fed into an end-to-end CNN formed by adapting a pre-trained network via transfer learning. Key results indicate that our deep PCG representations extracted from a fine-tuned CNN perform strongest on our heart sound classification task, achieving 56.2% mean accuracy. Compared to a baseline accuracy of 46.9%, obtained using conventional audio processing features and a support vector machine, this is a significant relative improvement of 19.8% (p < .001 in a one-tailed z-test).
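To make the pipeline described above concrete, the following is a minimal Python sketch of one plausible realisation: equal-length PCG chunks are converted into wavelet scalogram images, a pre-trained ImageNet CNN supplies activations from a fully connected layer as the deep representation, and a support vector machine acts as the static classifier. The chunk length, wavelet choice (a Morlet wavelet), scale range, VGG16 backbone, and fully-connected-layer index are illustrative assumptions, not the authors' exact configuration.

```python
# Hedged sketch of the abstract's pipeline: chunk a PCG recording, render each
# chunk as a scalogram image, extract deep representations from a pre-trained
# CNN, and classify them with an SVM. All hyperparameters below are assumptions.
import numpy as np
import pywt                               # PyWavelets, for the continuous wavelet transform
import torch
import torchvision.models as models
import torchvision.transforms as T
from PIL import Image
from sklearn.svm import SVC


def chunk_signal(pcg, sr, seconds=4.0):
    """Split a 1-D PCG signal into equal-length chunks (4 s is an assumed length)."""
    n = int(seconds * sr)
    return [pcg[i:i + n] for i in range(0, len(pcg) - n + 1, n)]


def scalogram_image(chunk, sr, scales=np.arange(1, 128)):
    """Continuous wavelet transform of one chunk, rendered as an RGB scalogram image."""
    coefs, _ = pywt.cwt(chunk, scales, "morl", sampling_period=1.0 / sr)
    mag = np.abs(coefs)
    mag = (mag - mag.min()) / (mag.max() - mag.min() + 1e-12)   # normalise to [0, 1]
    return Image.fromarray((255 * mag).astype(np.uint8)).convert("RGB")


# Pre-trained ImageNet CNN used as a fixed feature extractor (VGG16 is an assumption).
cnn = models.vgg16(pretrained=True).eval()
preprocess = T.Compose([
    T.Resize((224, 224)),
    T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])


def deep_representation(img):
    """4096-dimensional activations from the second fully connected layer of VGG16."""
    x = preprocess(img).unsqueeze(0)
    with torch.no_grad():
        feats = cnn.avgpool(cnn.features(x))
        fc2 = cnn.classifier[:4](torch.flatten(feats, 1))       # through the second Linear layer
    return fc2.squeeze(0).numpy()


def train_static_classifier(chunks, labels, sr):
    """Fit a linear SVM (the 'static classifier') on the deep PCG representations."""
    X = np.stack([deep_representation(scalogram_image(c, sr)) for c in chunks])
    clf = SVC(kernel="linear")
    clf.fit(X, labels)
    return clf
```

The fine-tuned and end-to-end variants mentioned in the abstract would replace the fixed extractor above with a network whose later layers, or all layers, are retrained on the scalogram images before features are taken or predictions are made directly.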


Published in: DH '18: Proceedings of the 2018 International Conference on Digital Health, April 2018, 172 pages. ISBN: 9781450364935. DOI: 10.1145/3194658.

Copyright © 2018 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher: Association for Computing Machinery, New York, NY, United States

