ABSTRACT
Various functional neuroimaging studies have shown that sparsity-regularized dictionary learning can achieve superior performance in decomposing comprehensive and neuroscientifically meaningful functional networks from massive fMRI signals. However, solving the dictionary learning problem is computationally very demanding, especially for large-scale data sets. In this work, we therefore propose a novel distributed rank-1 dictionary learning (D-r1DL) model and apply it to fMRI big data analysis. At each learning step, the model estimates one rank-1 basis vector from the input data, with a sparsity constraint on its loading coefficient, through alternating least squares updates. By iteratively learning a rank-1 basis and deflating the input data at each step, the model can decompose the whole set of functional networks. We implement and parallelize the rank-1 dictionary learning algorithm using the Spark engine and use its resilient distributed dataset (RDD) abstraction for data distribution and operations. Experimental results from applying the model to Human Connectome Project (HCP) data show that the proposed D-r1DL model is efficient and scalable for fMRI big data analytics, thus enabling data-driven neuroscientific discovery from massive fMRI data in the future.
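The learn-then-deflate loop described in the abstract can be illustrated with a minimal single-machine sketch in NumPy. It is not the authors' implementation: the abstract does not give the exact form of the sparsity constraint, so this sketch assumes a hard limit of r nonzero loading coefficients per component, and the names rank1_sparse, r1dl, r, n_iter and tol are hypothetical. No Spark code is shown.

```python
import numpy as np

def rank1_sparse(S, r, n_iter=100, tol=1e-6, seed=0):
    """Estimate one rank-1 pair (u, v) with S ~ outer(u, v).

    u is a unit-norm basis vector; v is its loading vector, limited to
    at most r nonzero entries (assumed form of the sparsity constraint).
    """
    rng = np.random.default_rng(seed)
    T, P = S.shape
    u = rng.standard_normal(T)
    u /= np.linalg.norm(u)
    v = np.zeros(P)
    for _ in range(n_iter):
        # Least-squares update of v for fixed unit-norm u.
        v = S.T @ u
        if r < P:
            # Zero out the P - r smallest-magnitude loadings.
            drop = np.argpartition(np.abs(v), P - r)[:P - r]
            v[drop] = 0.0
        # Least-squares update of u for fixed v, then renormalize.
        u_new = S @ v
        norm = np.linalg.norm(u_new)
        if norm == 0.0:
            break
        u_new /= norm
        converged = np.linalg.norm(u_new - u) < tol
        u = u_new
        if converged:
            break
    return u, v

def r1dl(S, n_components, r):
    """Learn n_components rank-1 pairs, deflating the data after each."""
    R = np.array(S, dtype=float)          # working copy (the residual)
    T, P = R.shape
    D = np.empty((T, n_components))       # basis vectors, one per column
    A = np.empty((n_components, P))       # sparse loadings, one per row
    for k in range(n_components):
        u, v = rank1_sparse(R, r, seed=k)
        D[:, k] = u
        A[k] = v
        R -= np.outer(u, v)               # deflation step
    return D, A, R
```

Both updates inside the loop are plain matrix-vector products, so each can be computed as a combination of per-partition partial results. How D-r1DL maps these products onto RDD operations is not specified in the abstract.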
Index Terms
- Scalable Fast Rank-1 Dictionary Learning for fMRI Big Data Analysis