DOI: 10.1145/2939672.2939730
research-article
Public Access

Scalable Fast Rank-1 Dictionary Learning for fMRI Big Data Analysis

Published: 13 August 2016

ABSTRACT

Various functional neuroimaging studies have shown that sparsity-regularized dictionary learning can achieve superior performance in decomposing comprehensive and neuroscientifically meaningful functional networks from massive fMRI signals. However, solving the dictionary learning problem is known to be computationally very demanding, especially for large-scale data sets. In this work, we therefore propose a novel distributed rank-1 dictionary learning (D-r1DL) model and apply it to fMRI big data analysis. At each learning step, the model estimates one rank-1 basis vector, with a sparsity constraint on its loading coefficient, from the input data through alternating least squares updates. By iteratively learning the rank-1 basis and deflating the input data at each step, the model is capable of decomposing the whole set of functional networks. We implement and parallelize the rank-1 dictionary learning algorithm using the Spark engine and use the resilient distributed dataset (RDD) abstraction for data distribution and operations. Experimental results from applying the model to the Human Connectome Project (HCP) data show that the proposed D-r1DL model is efficient and scalable for fMRI big data analytics, thus enabling data-driven neuroscientific discovery from massive fMRI data in the future.
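
The learn-and-deflate loop described in the abstract can be illustrated with a small single-machine sketch. This is not the authors' Spark implementation. It assumes NumPy, a data matrix S with time points in rows and voxels in columns, a unit-norm basis vector u, and a loading vector v limited to at most r nonzero entries by hard thresholding. The abstract does not specify how the sparsity constraint is enforced, so that choice and the names rank1_step, r1dl, S, u, v and r are all illustrative assumptions.

```python
import numpy as np


def sparse_loading(S, u, r):
    """Least-squares loading for a fixed unit-norm u, keeping only the r largest entries.

    Hard thresholding is an assumed way to impose the sparsity constraint.
    """
    v = S.T @ u
    if 1 <= r < v.size:
        # Indices of the (v.size - r) smallest magnitudes, which are zeroed out.
        small = np.argpartition(np.abs(v), v.size - r)[: v.size - r]
        v[small] = 0.0
    return v


def rank1_step(S, r, n_iter=50, tol=1e-6, seed=0):
    """Alternate least-squares updates of u and v until u stops changing."""
    rng = np.random.default_rng(seed)
    u = rng.standard_normal(S.shape[0])
    u /= np.linalg.norm(u)
    for _ in range(n_iter):
        v = sparse_loading(S, u, r)
        u_new = S @ v  # least-squares direction of u for fixed v
        norm = np.linalg.norm(u_new)
        if norm == 0.0:
            break
        u_new /= norm
        converged = np.linalg.norm(u_new - u) < tol
        u = u_new
        if converged:
            break
    return u, sparse_loading(S, u, r)


def r1dl(S, n_components, r):
    """Learn n_components rank-1 pairs, deflating the data after each one."""
    R = np.array(S, dtype=float)  # working copy that gets deflated
    D = np.zeros((R.shape[0], n_components))  # basis vectors as columns
    A = np.zeros((n_components, R.shape[1]))  # sparse loadings as rows
    for k in range(n_components):
        u, v = rank1_step(R, r, seed=k)
        D[:, k], A[k] = u, v
        R -= np.outer(u, v)  # deflation: remove the learned rank-1 part
    return D, A, R
```

Calling r1dl(S, n_components, r) returns the bases D, the sparse loadings A and the residual R. Each step needs only the products S.T @ u and S @ v, which are the kind of operation that can be spread over row partitions of S in a distributed setting.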

      • Published in

        KDD '16: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
        August 2016
        2176 pages
        ISBN: 9781450342322
        DOI: 10.1145/2939672

        Copyright © 2016 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Acceptance Rates

        KDD '16 Paper Acceptance Rate: 66 of 1,115 submissions, 6%
        Overall Acceptance Rate: 1,133 of 8,635 submissions, 13%
