Wrappers for web access logs feature selection

Authors:
Maria Muntean

University of Alba Iulia, Alba-Iulia, Romania

Honoriu Vălean

Technical University of Cluj Napoca, Cluj-Napoca, Romania


WIMS '12: Proceedings of the 2nd International Conference on Web Intelligence, Mining and Semantics, June 2012, Article No. 21, Pages 1–7. https://doi.org/10.1145/2254129.2254156

Published: 13 June 2012

ABSTRACT

Web Usage Mining (WUM), a relatively recent research field, is the process of knowledge discovery in databases (KDD) applied to Web usage data. The main problems in WUM are the quantity of Web usage data to be analyzed and its poor quality, in particular the abundance of features to be analyzed.

Considering the characteristics of Web log data and the function of each phase of data preprocessing, this paper establishes a Web log data preprocessing algorithm based on feature selection. The implemented Wrapper Evaluation feature selection method uses a Best First Search and a Greedy Stepwise Search, and evaluates each attribute subset with a Support Vector Machine learning scheme.
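
The wrapper idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it shows only a greedy stepwise (forward) search, omits Best First Search, and scores subsets with a nearest-centroid classifier as a dependency-free stand-in for the Support Vector Machine. The function names, the toy data, and the 3-fold split are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def centroid_accuracy(train_X, train_y, test_X, test_y) -> float:
    """Stand-in learner (NOT an SVM): accuracy of a nearest-centroid classifier."""
    sums, counts = {}, {}
    for x, label in zip(train_X, train_y):
        acc = sums.setdefault(label, [0.0] * len(x))
        for i, value in enumerate(x):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    centroids = {c: [s / counts[c] for s in sums[c]] for c in sums}

    def predict(x):
        # Pick the class whose centroid is closest in squared Euclidean distance.
        return min(centroids,
                   key=lambda c: sum((a - b) ** 2 for a, b in zip(x, centroids[c])))

    hits = sum(predict(x) == label for x, label in zip(test_X, test_y))
    return hits / len(test_y)


def cv_score(X, y, features: List[int], k: int = 3) -> float:
    """Mean k-fold accuracy of the learner using only the given feature columns."""
    projected = [[row[f] for f in features] for row in X]
    scores = []
    for fold in range(k):
        test_idx = list(range(fold, len(X), k))  # every k-th row forms one fold
        held_out = set(test_idx)
        train_idx = [i for i in range(len(X)) if i not in held_out]
        scores.append(centroid_accuracy(
            [projected[i] for i in train_idx], [y[i] for i in train_idx],
            [projected[i] for i in test_idx], [y[i] for i in test_idx]))
    return sum(scores) / k


def greedy_stepwise(X: Sequence[Sequence[float]], y: Sequence[int],
                    k: int = 3) -> Tuple[List[int], float]:
    """Forward wrapper search: add the best feature until the score stops improving."""
    n_features = len(X[0])
    selected: List[int] = []
    best = 0.0
    while len(selected) < n_features:
        candidates = [(cv_score(X, y, selected + [f], k), f)
                      for f in range(n_features) if f not in selected]
        score, feature = max(candidates, key=lambda c: c[0])
        if score <= best:  # no candidate improves the subset: stop
            break
        selected.append(feature)
        best = score
    return selected, best


# Toy data (hypothetical): column 0 separates the classes, column 1 is noise.
X = [[0.0, 5.0], [0.1, -4.0], [0.2, 3.0], [1.0, -5.0], [1.1, 4.0], [1.2, -3.0]]
y = [0, 0, 0, 1, 1, 1]
selected, best = greedy_stepwise(X, y)
print(selected, best)
```

On this toy data the search keeps only column 0, since adding the noisy column cannot raise the cross-validated accuracy. Swapping `centroid_accuracy` for an SVM-based scorer would bring the sketch closer to the scheme the abstract names, at the cost of an external dependency.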

Index Terms

Wrappers for web access logs feature selection
1. Information systems
  1. World Wide Web
    1. Web applications
    2. Web services

Published in
WIMS '12: Proceedings of the 2nd International Conference on Web Intelligence, Mining and Semantics
June 2012
571 pages
ISBN:9781450309158
DOI:10.1145/2254129
Conference Chair:
Dumitru Dan Burdescu, University of Craiova, Romania
Program Chairs:
Rajendra Akerkar, Western Norway Research Institute, Norway
Costin Bădică, University of Craiova, Romania
Copyright © 2012 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
accuracy
classification
feature selection
improvement
Qualifiers
- research-article
Acceptance Rates
Overall Acceptance Rate: 140 of 278 submissions, 50%