Abstract
Feature selection, as a data preprocessing strategy, has proven effective and efficient in preparing data (especially high-dimensional data) for various data-mining and machine-learning problems. The objectives of feature selection include building simpler and more comprehensible models, improving data-mining performance, and preparing clean, understandable data. The recent proliferation of big data has presented substantial challenges and opportunities for feature selection. In this survey, we provide a comprehensive and structured overview of recent advances in feature selection research. Motivated by current challenges and opportunities in the era of big data, we revisit feature selection research from a data perspective and review representative feature selection algorithms for conventional data, structured data, heterogeneous data, and streaming data. Methodologically, to emphasize the differences and similarities among most existing feature selection algorithms for conventional data, we categorize them into four main groups: similarity-based, information-theoretical-based, sparse-learning-based, and statistical-based methods. To facilitate and promote research in this community, we also present an open-source feature selection repository that contains most of the popular feature selection algorithms (http://featureselection.asu.edu/), and we use it as an example to show how to evaluate feature selection algorithms. At the end of the survey, we discuss open problems and challenges that require more attention in future research.