Context-specific Consistencies in Information Extraction

Context-specific Consistencies in Information Extraction: Rule-based and Probabilistic Approaches (2015), by Peter Klügl (@pkluegl)

Contents

1 Introduction … 1
1.1 Motivation … 2
1.2 Goal … 4
1.3 Contributions … 4
1.4 Structure of this Work … 9

2 Information Extraction … 11
2.1 Foundations … 12
2.1.1 Definition … 12
2.1.2 Historical Development … 13
2.1.3 Evaluation Measures … 15
2.1.4 Architectures … 17
2.1.4.1 UIMA … 17
2.1.4.2 Other Architectures … 18
2.2 Rule-based Information Extraction … 19
2.2.1 Rule Languages … 19
2.2.1.1 CPSL … 20
2.2.1.2 JAPE … 21
2.2.1.3 SProUT – XTDL … 24
2.2.1.4 AFST … 27
2.2.1.5 SystemT – AQL … 29
2.2.1.6 Other Languages … 30
2.2.2 Development Support … 31
2.2.3 Rule Induction … 31
2.2.3.1 BWI … 31
2.2.3.2 CRYSTAL … 32
2.2.3.3 LP2 … 32
2.2.3.4 RAPIER … 32
2.2.3.5 SRV … 32
2.2.3.6 WHISK … 32
2.2.3.7 WIEN … 33
2.3 Machine Learning for Information Extraction … 33
2.3.1 Essentials of Machine Learning … 33
2.3.2 Representation as a Machine Learning Task … 35
2.3.2.1 Classify Candidates … 35
2.3.2.2 Sliding Window … 36
2.3.2.3 Boundary Models … 36
2.3.2.4 Finite State Machines … 36
2.3.2.5 Wrapper Induction … 37
2.3.3 Conditional Random Fields … 37
2.3.3.1 Modeling … 37
2.3.3.2 Inference … 39
2.3.3.3 Parameter Estimation … 41

3 Context-specific Consistencies … 43
3.1 Characteristics … 44
3.2 Domains … 46
3.2.1 Reference Sections … 46
3.2.1.1 Information Extraction Task … 47
3.2.1.2 Applications … 48
3.2.1.3 Related work … 48
3.2.1.4 Aspects of Context-specific Consistencies … 50
3.2.2 Curricula Vitae … 51
3.2.2.1 Information Extraction Task … 52
3.2.2.2 Applications … 53
3.2.2.3 Related work … 54
3.2.2.4 Aspects of Context-specific Consistencies … 55
3.2.3 Clinical Discharge Letters … 55
3.2.3.1 Information Extraction Task … 58
3.2.3.2 Applications … 58
3.2.3.3 Related work … 59
3.2.3.4 Aspects of Context-specific Consistencies … 60
3.2.4 Other Domains … 61
3.3 Exploiting Context-specific Consistencies … 62
3.4 Related Work … 63
3.4.1 Context-specific Consistencies … 63
3.4.1.1 Learning with Scope … 63
3.4.1.2 The RefParse Algorithm … 66
3.4.1.3 Properties-based Collective Inference … 67
3.4.1.4 Exploiting Content Redundancy … 70
3.4.1.5 Other publications … 71
3.4.2 Collective Information Extraction … 71

4 UIMA Ruta … 73
4.1 Introduction … 73
4.1.1 History and Current State … 74
4.2 The Rule-based Scripting Language … 75
4.2.1 Provided Annotation Types … 75
4.2.2 Syntax and Semantics … 75
4.2.2.1 Script Definition … 77
4.2.2.2 Rule Definition … 78
4.2.2.3 Extensible Language Definition … 82
4.2.3 Inference … 82
4.2.3.1 Rule Execution … 82
4.2.3.2 Rule Matching … 84
4.2.3.3 Beyond Sequential Matching … 88
4.2.4 Visibility and Filtering … 89
4.2.5 Blocks and Inlined Rules … 90
4.2.6 Engineering Approaches … 92
4.2.6.1 Classical Approaches … 92
4.2.6.2 Transformation-based Rules … 93
4.2.6.3 Scoring Rules … 93
4.2.7 Exemplary Script … 95
4.3 Development Environment and Tooling … 99
4.3.1 Basic Development Support … 100
4.3.2 Explanation of Rule Execution … 102
4.3.3 Introspection by Querying … 103
4.3.4 Automatic Validation … 104
4.3.5 Constraint-driven Evaluation … 106
4.3.6 Supervised Rule Induction … 109
4.3.7 Semi-automatic Creation of Gold Documents … 110
4.4 Comparison to Related Systems … 111

5 Knowledge Engineering Approaches … 117
5.1 Improving Recall in Precision-driven Prototyping … 118
5.1.1 Rule Sets … 118
5.1.2 Experimental Results … 120
5.2 Stacked Transformations … 121
5.2.1 Rule Sets … 122
5.2.2 Experimental Results … 123
5.3 Usage in a Complete Application … 125
5.3.1 Rule Sets … 126
5.3.1.1 Generating Candidates … 126
5.3.1.2 Properties of Headlines … 128
5.3.1.3 Score-based Approach … 128
5.3.1.4 Keyword-based Approach … 129
5.3.1.5 Consistency-based Approach … 130
5.3.1.6 Correction-based Approach … 130
5.3.2 Experimental Results … 130
5.4 Discussion … 133

6 Machine Learning Approaches … 135
6.1 Learning Context-specific Consistencies … 135
6.1.1 Modeling Consistencies with Classifiers … 136
6.1.1.1 Determine Type of Description for Consistencies … 136
6.1.1.2 Select Classifier for Learning Consistencies … 138
6.1.1.3 Provide Prediction of Entities … 140
6.1.1.4 Create Dataset for Classifier … 141
6.1.1.5 Learn Classifiers on Dataset … 142
6.1.1.6 Apply Classifiers on Dataset … 142
6.1.2 Example … 143
6.1.3 Experimental Results … 145
6.1.3.1 Random Synthetic Errors … 146
6.1.3.2 Realistic Prediction … 153
6.2 Stacked Conditional Random Fields … 154
6.2.1 Stacked Inference with Consistencies … 155
6.2.2 Parameter Estimation … 157
6.2.3 Experimental Results … 158
6.2.3.1 Datasets … 158
6.2.3.2 Implementation Details … 159
6.2.3.3 Results … 159
6.3 Towards Higher-order Models … 161
6.3.1 Comb-chain CRFs … 162
6.3.2 Skyp-chain CRFs … 163
6.3.3 Parameter Estimation and Inference … 165
6.3.4 Experimental Results … 165
6.3.4.1 Datasets … 165
6.3.4.2 Settings … 166
6.3.4.3 Results … 166
6.4 Discussion … 166

7 Conclusion … 169
7.1 Summary … 169
7.2 Outlook … 173

Bibliography … 177