Efficient model learning for dialog management
Source ACM SIGCHI/SIGART Human-Robot Interaction archive
Proceedings of the ACM/IEEE International Conference on Human-Robot Interaction
Arlington, Virginia, USA
SESSION: Full papers
Pages: 65-72
Year of Publication: 2007
ISBN: 978-1-59593-617-2
Authors
Finale Doshi  CSAIL MIT, Cambridge, MA
Nicholas Roy  CSAIL MIT, Cambridge, MA
Sponsors
ACM: Association for Computing Machinery
SIGART: ACM Special Interest Group on Artificial Intelligence
SIGCHI: ACM Special Interest Group on Computer-Human Interaction
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1228716.1228726

ABSTRACT

Intelligent planning algorithms such as the Partially Observable Markov Decision Process (POMDP) have succeeded in dialog management applications [10, 11, 12] because they are robust to the inherent uncertainty of human interaction. Like all dialog planning systems, however, POMDPs require an accurate model of the user (e.g., what the user might say or want). POMDPs are generally specified using a large probabilistic model with many parameters. These parameters are difficult to specify from domain knowledge, and gathering enough data to estimate the parameters accurately a priori is expensive.

In this paper, we take a Bayesian approach to learning the user model simultaneously with the dialog manager policy. At the heart of our approach is an efficient incremental update algorithm that allows the dialog manager to replan just long enough to improve the current dialog policy given data from recent interactions. The update process has a relatively small computational cost, preventing long delays in the interaction. We demonstrate a robust dialog manager that learns from interaction data, outperforming a hand-coded model in simulation and in a robotic wheelchair application.
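
The abstract does not spell out the update algorithm, so the following is only a minimal sketch of one common way to keep a Bayesian user model: Dirichlet pseudo-counts over what the user says given what the user wants. The class name `DirichletUserModel`, the goal and utterance labels, and the uniform prior are all assumptions made for illustration, not the authors' method, and the replanning step is not shown.

```python
class DirichletUserModel:
    """Hypothetical user model: P(utterance | goal) with a Dirichlet prior per goal."""

    def __init__(self, goals, utterances, prior_count=1.0):
        # counts[goal][utterance] holds Dirichlet pseudo-counts.
        # prior_count=1.0 is an assumed uniform prior, not a value from the paper.
        self.counts = {
            g: {u: float(prior_count) for u in utterances} for g in goals
        }

    def update(self, goal, utterance, weight=1.0):
        """Fold one observed (goal, utterance) pair into the posterior counts."""
        self.counts[goal][utterance] += weight

    def expected_prob(self, goal, utterance):
        """Posterior mean of P(utterance | goal) under the current counts."""
        total = sum(self.counts[goal].values())
        return self.counts[goal][utterance] / total


# Usage: three dialogs in which the user wanted the kitchen and was heard correctly.
model = DirichletUserModel(
    goals=["kitchen", "office"],
    utterances=["heard_kitchen", "heard_office"],
)
prior = model.expected_prob("kitchen", "heard_kitchen")
for _ in range(3):
    model.update("kitchen", "heard_kitchen")
posterior = model.expected_prob("kitchen", "heard_kitchen")
```

Each update only increments a count, which is one reason an incremental scheme of this kind can stay cheap between interactions; a dialog manager would then replan against the posterior mean or against samples drawn from the posterior.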


REFERENCES

[1] R. Dearden, N. Friedman, and D. Andre. Model based Bayesian exploration. Pages 150-159, 1999.
[2] G. J. Gordon. Stable function approximation in dynamic programming. In Proceedings of the Twelfth International Conference on Machine Learning, San Francisco, CA, 1995. Morgan Kaufmann.
[3] R. Jaulmes, J. Pineau, and D. Precup. Learning in non-stationary partially observable Markov decision processes. Workshop on Non-Stationarity in Reinforcement Learning at the ECML, 2005.
[4]
[5] A. Nilim and L. Ghaoui. Robustness in Markov decision problems with uncertain transition matrices, 2004.
[6] J. Pineau, G. Gordon, and S. Thrun. Point-based value iteration: An anytime algorithm for POMDPs, 2003.
[7] J. Pineau, N. Roy, and S. Thrun. A hierarchical approach to POMDP planning and execution. In Workshop on Hierarchy and Memory in Reinforcement Learning (ICML), June 2001.
[8] L. R. Rabiner. A tutorial on hidden Markov models and selected applications in speech recognition. Proceedings of the IEEE, 77(2):257-286, 1989.
[9] M. Ravishankar. Efficient Algorithms for Speech Recognition. PhD thesis, Carnegie Mellon, 1996.
[10]
[11] J. Williams and S. Young. Scaling up POMDPs for dialogue management: The "summary POMDP" method. In Proceedings of the IEEE ASRU Workshop, 2005.
[12] J. D. Williams, P. Poupart, and S. Young. Partially observable Markov decision processes with continuous observations for dialogue management. In Proceedings of SIGdial Workshop on Discourse and Dialogue 2005, 2005.
