Integrating Expert Knowledge into POMDP Optimization for Spoken Dialog Systems

Jason D. Williams

A common problem for real-world POMDP applications is how to incorporate expert knowledge and constraints, such as business rules, into the optimization process. This paper describes a simple approach created in the course of developing a spoken dialog system. A POMDP and a conventional hand-crafted dialog controller run in parallel; the conventional dialog controller nominates a set of one or more actions, and the POMDP chooses the optimal action from that set. This allows designers to express real-world constraints in a familiar manner, and also prunes the search space of policies. The method naturally admits compression, and the POMDP value function can draw on features from both the POMDP belief state and the hand-crafted dialog controller. The method has been used to build a full-scale dialog system which is currently running at AT&T Labs. An evaluation shows that this unified architecture yields better performance than using a conventional dialog manager alone, and also demonstrates an improvement in optimization speed and reliability compared with a pure POMDP.
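
The division of labor described above can be illustrated with a minimal sketch. It is not the paper's implementation: the controller states, action names, rule table, and the toy scoring function `q_value` are all hypothetical stand-ins, and a real system would use a value function learned by POMDP optimization. The sketch only shows the control flow, in which the hand-crafted controller nominates a set of allowed actions and the POMDP side picks the highest-valued one using features from both the belief state and the controller state.

```python
from typing import Dict, List

# Hypothetical belief state: a probability distribution over user goals.
Belief = Dict[str, float]

# Hypothetical hand-crafted rules: controller state -> allowed actions.
RULES: Dict[str, List[str]] = {
    "start": ["ask_name"],
    "have_name": ["confirm_name", "dial"],
    "confirmed": ["confirm_name", "dial"],
}


def nominate_actions(controller_state: str) -> List[str]:
    """Hand-crafted controller: return the actions its rules allow."""
    return RULES[controller_state]


def q_value(belief: Belief, controller_state: str, action: str) -> float:
    """Toy stand-in for a learned value function (an assumption).

    It uses one feature from the belief state (probability of the top
    hypothesis) and one from the hand-crafted controller (its state).
    """
    top = max(belief.values())
    if action == "dial":
        # Expected payoff: +10 if the top hypothesis is right, -10 if not.
        return 10.0 * top - 10.0 * (1.0 - top)
    if action == "confirm_name":
        # Confirming again after a confirmation is assumed to be unhelpful.
        return -1.0 if controller_state == "confirmed" else 3.0
    return 0.0


def choose_action(belief: Belief, controller_state: str) -> str:
    """Pick the highest-valued action among the nominated ones only."""
    candidates = nominate_actions(controller_state)
    return max(candidates, key=lambda a: q_value(belief, controller_state, a))
```

Because `choose_action` only ever scores nominated actions, a rule such as "never dial before a name has been collected" holds regardless of what the value function prefers, and the set of policies to search shrinks accordingly.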

Subjects: 12.1 Reinforcement Learning; 6. Computer-Human Interaction

Submitted: Apr 29, 2008
