| Sequential cost-sensitive decision making with reinforcement learning |
| Full text |
Pdf
(1.06 MB)
|
| Source
|
International Conference on Knowledge Discovery and Data Mining
archive
Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining
table of contents
Edmonton, Alberta, Canada
SESSION: Learning methods
table of contents
Pages: 259 - 268
Year of Publication: 2002
ISBN:1-58113-567-X
|
|
Authors
|
|
| Sponsors |
|
| Publisher |
|
| Bibliometrics |
Downloads (6 Weeks): 14, Downloads (12 Months): 60, Citation Count: 5
|
|
|
ABSTRACT
Recently, there has been increasing interest in the issues of cost-sensitive learning and decision making in a variety of applications of data mining. A number of approaches have been developed that are effective at optimizing cost-sensitive decisions when each decision is considered in isolation. However, the issue of sequential decision making, with the goal of maximizing total benefits accrued over a period of time instead of immediate benefits, has rarely been addressed. In the present paper, we propose a novel approach to sequential decision making based on the reinforcement learning framework. Our approach attempts to learn decision rules that optimize a sequence of cost-sensitive decisions so as to maximize the total benefits accrued over time. We use the domain of targeted' marketing as a testbed for empirical evaluation of the proposed method. We conducted experiments using approximately two years of monthly promotion data derived from the well-known KDD Cup 1998 donation data set. The experimental results show that the proposed method for optimizing total accrued benefits out performs the usual targeted-marketing methodology of optimizing each promotion in isolation. We also analyze the behavior of the targeting rules that were obtained and discuss their appropriateness to the application domain.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
 |
1
|
C. Apte , E. Bibelnieks , R. Natarajan , E. Pednault , F. Tipu , D. Campbell , B. Nelson, Segmentation-based modeling for advanced targeted marketing, Proceedings of the seventh ACM SIGKDD international conference on Knowledge discovery and data mining, p.408-413, August 26-29, 2001, San Francisco, California
[doi> 10.1145/502512.502573]
|
| |
2
|
S. D. Bay. UCI KDD archive. Department of Information and Computer Sciences, University of California, Irvine, 2000. http://kdd.ics.uci.edu/.
|
| |
3
|
|
 |
4
|
|
| |
5
|
C. Elkan. The foundations of cost-sensitive learning. In Proceedings of the Seventeenth International Joint Conference on Artificial Intelligence, Aug. 2001.
|
| |
6
|
|
| |
7
|
L. P. Kaelbling, M. L. Littman, and A. W. Moore. Reinforcement learning: A survey. Journal of Artificial Intelligence Research, 4, 1996.
|
| |
8
|
|
| |
9
|
R. Natarajan and E. Pednault. Segmented regression estimators for massive data sets. In Second SIAM International Conference on Data Mining, Arlington, Virginia, 2002. to appear.
|
| |
10
|
G. A. Rummery and M. Niranjan. On-line q-learning using connectionist systems. Technical Report CUED/F-INFENG/TR 166, Cambridge University Engineering Departement, 1994. Ph.D. thesis.
|
| |
11
|
|
| |
12
|
J. N. Tsitsiklis and B. V. Roy. An analysis of temporal difference learning with function approximation. IEEE Transactions on Automatic Control, 42(5):674--690, 1997.
|
| |
13
|
P. Turney. Cost-sensitive learning bibliography. Institute for Information Technology, National Research Council, Ottawa, Canada, 2000. http://extractor.iit.nrc.ca/bibliographies/cost-sensitive.html.
|
| |
14
|
X. Wang and T. Dietterich. Efficient value function approximation using regression trees. In Proceedings of the IJCAI Workshop on Statistical Machine Learning for Large-Scale Optimization, 1999.
|
| |
15
|
C. J. C. H. Watkins. Learning from Delayed Rewards. PhD thesis, Cambridge University, Cambridge, 1989.
|
| |
16
|
|
 |
17
|
|
CITED BY 7
|
Naoki Abe , Naval Verma , Chid Apte , Robert Schroko, Cross channel optimized marketing by reinforcement learning, Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining, August 22-25, 2004, Seattle, WA, USA
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Peer to Peer - Readers of this Article have also read:
-
Data structures for quadtree approximation and compression
Communications of the ACM
28, 9
Hanan Samet
-
A hierarchical single-key-lock access control using the Chinese remainder theorem
Proceedings of the 1992 ACM/SIGAPP Symposium on Applied computing
Kim S. Lee
, Huizhu Lu
, D. D. Fisher
-
An intelligent component database for behavioral synthesis
Proceedings of the 27th ACM/IEEE conference on Design automation
Gwo-Dong Chen
, Daniel D. Gajski
-
The GemStone object database management system
Communications of the ACM
34, 10
Paul Butterworth
, Allen Otis
, Jacob Stein
-
Putting innovation to work: adoption strategies for multimedia communication systems
Communications of the ACM
34, 12
Ellen Francik
, Susan Ehrlich Rudman
, Donna Cooper
, Stephen Levine
|