Statistical Methods for Spoken Dialogue Management (2013), by Blaise Thomson
Contents
1 Introduction
1.1 Thesis Outline and Contributions
References
2 Dialogue System Theory
2.1 Components of a Spoken Dialogue System
2.1.1 Speech Recognition
2.1.2 Spoken Language Understanding
2.1.3 Decision Making
2.1.4 Response Generation
2.1.5 Extensions to the Dialogue Cycle
2.2 User Simulation
2.3 The Dialogue Manager
2.3.1 Hand-Crafted Dialogue Management
2.3.2 Partial Observability
2.3.3 Policy Learning
2.3.4 Partially Observable Markov Decision Processes
References
3 Maintaining State
3.1 Bayesian Networks
3.2 Bayesian Networks for Dialogue State
3.2.1 Goal Dependencies
3.2.2 The History Nodes
3.2.3 Sub-Components for the User Act Nodes
3.2.4 Remaining Nodes
3.3 TOWNINFO States: An Example
3.4 Factor Graphs
3.5 Belief Propagation
3.5.1 The Approximation
3.5.2 The Aim
3.5.3 The Calculation
3.5.4 The Belief Propagation Algorithm
3.6 Comparison to Previous Work
3.7 The Loss of the Markov Property
3.8 Limiting the Time-Slices
3.9 Conclusion
References
4 Maintaining State: Optimisations
4.1 Expectation Propagation
4.2 k-Best Belief Propagation
4.2.1 The Idea
4.2.2 The New Update Equation
4.2.3 Reducing Complexity
4.2.4 Choosing the k-Best
4.2.5 Related Work
4.3 Grouped Belief Propagation
4.4 Mostly Constant Factors
4.5 Experimental Comparison of Inference Algorithms
4.6 Conclusion
References
5 Policy Design
5.1 Policy Learning Theory
5.2 Summary Actions
5.3 TOWNINFO Summary Acts: An Example
5.4 Function Approximations for Dialogue Management
5.5 TOWNINFO Function Approximations: An Example
5.5.1 Grid-Based Features with No Parameter Tying
5.5.2 Grid-Based Features with Parameter Tying
5.6 Natural Actor Critic
5.6.1 Sampling Methods
5.7 Simulation
5.8 TOWNINFO Learning: An Example
5.9 Conclusion
References
6 Evaluation
6.1 TOWNINFO Systems
6.1.1 Hand-Crafted State Transitions
6.1.2 Partially Observable State Transitions
6.2 Simulated Comparison
6.3 User Trial
6.4 Evaluating the Effects of Semantic Errors
6.5 Conclusion
References
7 Parameter Learning
7.1 An Extended TOWNINFO System
7.1.1 Extended TOWNINFO Dialogue State
7.1.2 Extended TOWNINFO Summary Acts
7.1.3 Extended TOWNINFO Summary Features
7.2 Specialised User Act Factors
7.3 Learning Dirichlet Distributions
7.3.1 The Approximation and Target Functions
7.3.2 Matching the Target Function
7.3.3 The Algorithm
7.4 Tied Dirichlet Distributions
7.5 Parameter Learning for Re-Scoring Semantics
7.5.1 Simulated Evaluation on TOWNINFO
7.5.2 User Data Evaluation on TOWNINFO
7.6 Parameter Learning for Improving the Dialogue Manager
7.7 Conclusion
References
8 Conclusion
Appendix A: Dialogue Acts Formats
Appendix B: Proof of Grouped Loopy Belief Propagation
Appendix C: Experimental Model for Testing Belief Updating Optimisations
Appendix D: The Simulated Confidence Scorer
Appendix E: Matching the Dirichlet Distribution
Appendix F: Confidence Score Quality
Author Biography
Index