Formal Grammar (Draft)


Notes:

According to Microsoft, a grammar is a structured list of words and phrases governed by rules, and a vocabulary is the set of words used in the grammar.  A speech grammar is a grammar used to recognize speech input.

Grammar as we know it is a vague abstraction of the form that governs language.  In computational linguistics, however, grammars are things: encoded templates that attempt, in myriad ways, to dissect natural language.  In effect, grammars represent different techniques, or algorithms, for the dissection of natural language.  Grammars as things are known as formal grammars.  Computational linguistics is an interdisciplinary field dealing with the computational modeling of natural language.  Psycholinguistics is the psychological study of natural language, in terms of cognitive processes and grammatical structures.  Cognitive linguistics attempts to interpret natural language in terms of concepts, sometimes universal, sometimes specific to a particular tongue.  There is probably no perfect natural language grammar, because natural language is dynamic and relative; so the study of grammars appears chaotic, akin to the mythical Tower of Babel, which perhaps explains why *natural* is always emphasized in the study of “natural language”.  By contrast, controlled language is a subset of natural language that restricts grammar and vocabulary in order to reduce or eliminate ambiguity and complexity.

So, there is an imperfect knowledge of natural language.  There is an imperfect knowledge of psychology.  And there is an incomplete knowledge of cognition, in particular of how cognition may arise from language and the psyche.  Somehow, somewhere, grammar fits into this picture.  The encoding of formal grammar templates is therefore, to date, a very imprecise science.  Traditionally there have been two main ways of attacking natural language: manually constructed rule-based models and automated statistical language models.  Today there are also combinations of the two, as in IBM Watson.  Watson actually uses many different models, or algorithms, and then scores them on probability, perhaps more like how the human mind works.  IBM Watson uses logical form (LF) alignment to score grammatical relationships, deep semantic relationships, or both.  The LF alignment algorithm first establishes tentative lexical correspondences between nodes in the source and target LFs, using translation pairs from a bilingual lexicon.  This is in contrast to the 1954 precursor of IBM Watson, the “Georgetown-IBM experiment”, which used six grammar rules and had 250 items in its vocabulary.

In fact, the very meaning of words lies in their relationships.  This is called semantics: the study of meaning, or significance, in the relationships of words.  Computational linguistics is largely concerned with the analysis and modeling of natural language.  In natural language processing, modeling generally consists of rule-based and/or statistical language models.  Generally, but not always, rule-based language models are manually constructed, or coded by hand, while statistical language models are built automatically.  Rule-based language modeling of grammar is one of the major components of natural language processing.  However, rule-based language models learned from a collection of transcriptions have, in some domains, been shown to outperform traditional n-gram language models.  Statistical language modeling provides a much easier way of dealing with the complexities of natural language in a computer.  The n-gram language model currently dominates statistical language modeling.  Because the complexity of a rule-based language model, and thus the effort to construct it manually, increases significantly with vocabulary size, statistical language models are used in large-vocabulary systems.
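
For reference, a minimal statement of the n-gram idea mentioned above: the probability of a word sequence is factored by the chain rule and then approximated by conditioning each word only on the preceding n-1 words (in LaTeX notation):

    P(w_1, \ldots, w_m) \;=\; \prod_{i=1}^{m} P(w_i \mid w_1, \ldots, w_{i-1})
                        \;\approx\; \prod_{i=1}^{m} P(w_i \mid w_{i-n+1}, \ldots, w_{i-1})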

Units of Language

In linguistics, the lexicon (or wordstock) of a language is its vocabulary, including its words and expressions; the word lexicon is also sometimes used as a synonym of thesaurus.  More formally, a lexicon is a language’s inventory of lexemes.  Grammar is the set of structural rules that govern the composition of clauses, phrases, and words.  Morphology is the identification, analysis, and description of the structure of a given language’s morphemes and other linguistic units.  A morpheme is the smallest semantically meaningful unit in a language.  In 1961, the logician Haskell Curry made a distinction between two levels of grammar, which he called “tectogrammatics” and “phenogrammatics”: the first concerns the study of grammatical structure in itself, and the second concerns how grammatical structure is represented in terms of expressions.

Rule-based Language Modeling (constituency relation?)

In natural language processing, an unordered collection of words, disregarding grammar, is referred to as the “bag of words” model.  Grammatical rules need to perform non-trivial semantic operations.  For instance, a “precision grammar” is a formal grammar designed to distinguish ungrammatical from grammatical sentences.  AIML is an example of a grammar in XML form.  Limitations of current human-computer interactive applications are mostly due to the use of constrained grammars.  Grammatical Framework is a programming language for writing grammars of natural languages; Multimodality Grammatical Framework can be used to write multi-modal grammars.  Grammatical Framework is a type-theoretical grammar formalism; constructive type theory derives from constructive mathematics and is based on the concepts of proof object and context.
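
As a concrete illustration of the bag-of-words idea (not any particular system’s implementation), a minimal Python sketch that discards order and grammar and keeps only word counts:

    from collections import Counter

    def bag_of_words(text):
        """Reduce a text to a 'bag of words': word counts, order and grammar discarded."""
        tokens = text.lower().split()   # naive whitespace tokenization
        return Counter(tokens)

    print(bag_of_words("the cat sat on the mat"))
    # Counter({'the': 2, 'cat': 1, 'sat': 1, 'on': 1, 'mat': 1})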

Context-free grammars provide a mechanism for describing the methods by which phrases are built.  Context-free phrase-structure grammars (PSGs) have long been popular.  Mixed-initiative grammars allow the caller to fill in multiple voice recognition fields with a single utterance.  The Speech Recognition Grammar Specification (SRGS) is a standard for how speech recognition grammars are specified.  Speech recognition grammar formats include JSGF, the Java Speech Grammar Format (aka JSpeech Grammar Format), and GSL, Nuance’s Grammar Specification Language.  Grammars in Regulus are specified with features requiring unification, but are compiled into context-free grammars for use with the Nuance speech recognizer.  The Nuance Adaptive Grammar Engine (AGE) uses SmartListener technology to convert SRGS grammars into adaptive grammars that improve out-of-grammar recognition capabilities.  So-called template grammars are used for “defined grammar recognition”; the template grammar offers several advantages over the n-gram approach, including fewer inference steps (which results in fewer database queries).
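
To make the context-free idea concrete, here is a toy grammar and parse, a sketch assuming the NLTK toolkit rather than any of the speech grammar formats named above (SRGS and JSGF are surface formats, but the underlying phrase-structure rules look much the same):

    import nltk

    # A toy context-free grammar: each production rewrites one phrase type.
    grammar = nltk.CFG.fromstring("""
    S  -> NP VP
    NP -> Det N
    VP -> V NP
    Det -> 'the' | 'a'
    N  -> 'caller' | 'flight'
    V  -> 'books'
    """)

    parser = nltk.ChartParser(grammar)
    for tree in parser.parse("the caller books a flight".split()):
        print(tree)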

Statistical Language Modeling (dependency relation?)

A statistical language model assigns probabilities to sequences of words, for instance sequential bi-grams or tri-grams, called n-grams.  N-grams are probably the most common statistical language models.  There are also concgrams, which capture the relations among any n words in a given corpus, whether or not they are sequential; concgrams make it possible to find all the words associated grammatically and semantically with a keyword.  Construction grammar models string construction, which is simpler than conventional syntax.  Semantic grammar contrasts with conventional grammars in that it relies predominantly on semantic rather than syntactic categories.  Dependency grammars are distinct from phrase structure grammars in their lack of phrasal nodes.  Link grammar is a theory of syntax which builds relations between pairs of words.
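
A minimal sketch of how bigram probabilities are estimated from counts (maximum likelihood, no smoothing; the tiny corpus is made up for illustration):

    from collections import Counter

    corpus = "the cat sat on the mat . the dog sat on the rug .".split()

    # Count single words and adjacent word pairs (bigrams).
    unigrams = Counter(corpus)
    bigrams = Counter(zip(corpus, corpus[1:]))

    def bigram_prob(w1, w2):
        """Maximum-likelihood estimate of P(w2 | w1) = count(w1 w2) / count(w1)."""
        return bigrams[(w1, w2)] / unigrams[w1] if unigrams[w1] else 0.0

    print(bigram_prob("sat", "on"))   # 1.0 in this tiny corpus
    print(bigram_prob("the", "cat"))  # 0.25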

[ ConcGrams | N-gram Dialog Systems | N-gram Grammars | N-gram Transducers (NGT) | N-grams & Tag Clouds ]

Combined Rule-based & Statistical Language Models

Grammar induction is the machine learning process of learning a formal grammar from a set of observations.  A stochastic context-free grammar (SCFG), also called a probabilistic context-free grammar (PCFG), is a context-free grammar in which each production is augmented with a probability.
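
A sketch of a stochastic (probabilistic) context-free grammar, assuming the NLTK toolkit; each production carries a probability, and the probabilities of productions sharing a left-hand side sum to one:

    import nltk

    # Toy PCFG: the bracketed numbers are production probabilities.
    pcfg = nltk.PCFG.fromstring("""
    S   -> NP VP   [1.0]
    NP  -> Det N   [0.7] | N [0.3]
    VP  -> V NP    [1.0]
    Det -> 'the'   [1.0]
    N   -> 'dogs'  [0.5] | 'mailman' [0.5]
    V   -> 'chase' [1.0]
    """)

    # The Viterbi parser returns the single most probable parse tree.
    parser = nltk.ViterbiParser(pcfg)
    for tree in parser.parse("dogs chase the mailman".split()):
        print(tree)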

Parsing

Parsing, otherwise known as syntactic analysis, comes before annotation.  Syntactic parsers are based on grammatical rules, or grammars.  Parsing expression grammar is closer to how string recognition tends to be done in practice.  Grammar splitting makes partial parsing possible because each constituent can be parsed independently.  GLR (generalized LR) parsers are an extension of the LR parser algorithm for handling nondeterministic and ambiguous grammars.  Serialization is the process of converting a data structure or object state into a format that can be stored.
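
On the serialization point: a parse result is just a data structure, so it can be stored or transmitted in any serializable form.  A minimal sketch, assuming a parse tree represented as nested (label, children) pairs and JSON as the storage format:

    import json

    # A parse of "the caller books a flight" as nested (label, children) pairs.
    tree = ("S",
            [("NP", [("Det", ["the"]), ("N", ["caller"])]),
             ("VP", [("V", ["books"]),
                     ("NP", [("Det", ["a"]), ("N", ["flight"])])])])

    serialized = json.dumps(tree)       # convert to a storable string
    restored = json.loads(serialized)   # tuples come back as JSON arrays/lists
    print(serialized)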

[ APP (Apple Pie Parser) | CCG Parsers 2011 | Grammar Parsers & Dialog Systems | HPSG Parsers | LALR Parser | MaltParser | Ontology Parsers | Sentence Parsers & Dialog Systems ]

Annotation

Annotation may consist of part-of-speech (POS) tagging or tokenization.  The first step in generating the domain-specific part of a grammar is to enrich the underlying ontology with linguistic information.  Grammatical, syntactic, and semantic tags are forms of annotation, or metadata.  Named-entity recognition (NER) begins with entity identification and annotation (named entity annotation) and culminates with entity extraction.  Part-of-speech tagging is often seen as the first stage of a more comprehensive syntactic annotation; the resulting parsed corpora are known as treebanks.  There are two main types of treebanks: those that annotate phrase structure, and those that annotate dependency structure, called dependency treebanks.
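
A minimal tokenization-plus-POS-tagging sketch, assuming the NLTK toolkit and its default English tagger (the named data packages must be downloaded once):

    import nltk

    # One-time downloads (exact names may vary slightly between NLTK versions):
    # nltk.download('punkt'); nltk.download('averaged_perceptron_tagger')

    tokens = nltk.word_tokenize("The parser annotates each word with a tag.")
    print(nltk.pos_tag(tokens))
    # e.g. [('The', 'DT'), ('parser', 'NN'), ('annotates', 'VBZ'), ...]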

Grammars in Dialog Systems

Finite-state transducers (FSTs) use grammars (for instance FSTGrammar) to control applications in computational linguistics.  A dialog system is, most basically, a transducer or interpreter.  The grammars in a directed dialog system will normally be small, with little complexity.  The differences between spoken and written media make most available written corpora inappropriate for speech recognition in spoken dialogue systems.  Grammar-based linguistic realizers are systems like SRI’s Gemini parser and generator, Exemplars and RealPro from CoGenTex, or the OpenCCG realizer.  The NASA Clarissa voice-operated procedure browser was built using the Regulus Grammar Compiler; the Clarissa system uses a rule-based language modeling approach for speech recognition.  Experience with rule-based spoken dialogue systems shows that there is usually substantial overlap between the structures of grammars for different domains.  Multilingual grammars formalize the idea that the same grammatical categories (such as noun phrases and verb phrases) and syntax rules (such as predication) can appear in different languages.  The multimodal dialog definition system is based on a hybrid multi-modal grammar that puts formal grammars and logical calculus together.  In 2000, Hagen and Popowich described a dialogue system based on a grammar of dialogue acts.  A dialog tree is one form of rule-based language modeling used to derive utterances in dialog systems.  Tree-adjoining grammars have rules for rewriting the nodes of trees as other trees.
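
A minimal sketch of a directed dialog driven by a dialog tree; the states, prompts, and the tiny per-state “grammars” of accepted answers are all hypothetical:

    # Each state has a prompt and a small "grammar" of accepted answers mapping
    # to the next state; out-of-grammar input leaves the state unchanged.
    DIALOG_TREE = {
        "start":   {"prompt": "Flights or hotels?",
                    "grammar": {"flights": "flights", "hotels": "hotels"}},
        "flights": {"prompt": "Which city are you flying to?", "grammar": {}},
        "hotels":  {"prompt": "Which city are you staying in?", "grammar": {}},
    }

    def step(state, utterance):
        """Match the caller's utterance against the current state's grammar."""
        options = DIALOG_TREE[state]["grammar"]
        return options.get(utterance.strip().lower(), state)

    state = "start"
    print(DIALOG_TREE[state]["prompt"])
    state = step(state, "Flights")
    print(DIALOG_TREE[state]["prompt"])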

Monrai Cypher is a dialog system which translates natural language input into RDF and SeRQL (Sesame RDF Query Language) according to grammars.  Matching grammatical structures with OWL individuals is done by translating the natural language utterance into a SPARQL query.  The real meaning of Web 3.0 is: words in XML .. grammar in RDF Schema and OWL .. rules in SWRL (Semantic Web Rule Language) .. search with SPARQL.
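
The utterance-to-SPARQL step can be pictured as template filling over a parsed question; this is only a hypothetical sketch (the ontology IRIs and the question pattern are made up, and it is not Monrai Cypher’s actual pipeline):

    # Hypothetical: map a parsed "who <verb> <object>" question onto a SPARQL query.
    def question_to_sparql(verb, obj):
        return f"""
        SELECT ?person WHERE {{
            ?person <http://example.org/onto#{verb}> <http://example.org/onto#{obj}> .
        }}"""

    print(question_to_sparql("wrote", "Hamlet"))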

[ Conversation Trees | Dialog Trees 2011 | Grammar Trees & Dialog Systems ]

Conclusion

Firstly, I’m looking for a simple filter: a yes/no grammar filter for feeds, stop for incorrect and go for correct.  Secondly, I’d like a sentence extractor for feeds, simply to extract correct English sentences.  The problem is that most simple sentence extractors work off punctuation; what is needed here is a robust grammar-checking sentence extractor (which may not exist yet).
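
One way such a yes/no filter could be approximated, sketched with the NLTK toolkit and a toy grammar: a sentence goes through only if it parses.  A real feed filter would need a wide-coverage precision grammar, which is exactly the hard part:

    import nltk

    # Toy "go/stop" grammaticality filter: pass only sentences the grammar accepts.
    toy = nltk.CFG.fromstring("""
    S -> NP VP
    NP -> 'alice' | 'bob'
    VP -> V NP
    V -> 'likes'
    """)
    parser = nltk.ChartParser(toy)

    def is_grammatical(sentence):
        tokens = sentence.lower().split()
        try:
            return any(True for _ in parser.parse(tokens))
        except ValueError:              # a word the grammar does not cover
            return False

    print(is_grammatical("Alice likes Bob"))   # True  (go)
    print(is_grammatical("likes Alice Bob"))   # False (stop)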

[ Sentence Boundary Disambiguation & Dialog Systems | Sentence Extractor | Sentence Grammaticality | Sentence Parsers & Dialog Systems | Sentence Recognizer | Sentence Splitter 2011 ]


Resources:

  • ABL (Alignment-Based Learning) .. a grammatical inference system that learns structure from plain sequences by comparing them
  • AGG (Automated grammar generator) .. Patent US20050154580 .. a phrase chunking module operable to generate automatically at least one phrase
  • ALE (Attribute-Logic Engine) .. freeware logic programming and grammar parsing and generation system, written in Prolog
  • AMALGAM .. Automatic Mapping Among Lexico-Grammatical Annotation Models, part-of-speech (POS) tagger
  • Apoema LangBot .. defunct grammar checking API .. by @brnsantanna
  • ssw.jku.at/coco .. a compiler generator, which takes an attributed grammar of a source language and generates a scanner and a parser
  • eeggi.com .. (engineered, encyclopedic, global and grammatical identities) is “the world’s first information engine” .. (private Beta)
  • gingersoftware.com .. grammar and spell checker .. by @gingersoftware
  • ghotit.com .. grammar & spell checker .. by @ghotit
  • Grammar analyzer based on a part-of-speech tagged (POST) parser .. Patent US20030154066 .. the final grammar tree is then used to select the grammar tree with largest probability
  • wintertree-software.com/app/gramxp .. Grammar Expert Plus grammar checker
  • GRM Library .. software library for creating and modifying statistical language models
  • KPML .. system offers a robust, mature platform for large-scale grammar engineering for natural language generation
  • Link Grammar Parser .. assigns a set of labelled links connecting pairs of words, now maintained by AbiWord (formerly the Carnegie Mellon link grammar parser) .. (see RelEx extension)
  • Machinese Syntax .. provides a full analysis of texts by showing how words and concepts relate to each other in sentences (Functional Dependency Grammar parser)
  • nugram.nuecho.com .. @nugram platform is “the first – and only – complete speech grammar solution”
  • openccg.sourceforge.net .. open source natural language processing library written in Java, based on Combinatory Categorial Grammar
  • paperrater.com .. (wrote asking for grammar checker API)
  • Pâté .. a visual and interactive tool for parsing and transforming grammars
  • proofreadbot.com/api .. free online grammar and style checker with a REST server that returns reports in several formats
  • QTag .. a freely available, language independent POS-Tagger, implemented in Java
  • Regulus Grammar Compiler .. an open source platform for compiling unification grammars into Nuance compatible context free GSL grammars
  • whitesmoke.com .. English grammar checker software .. by @whitesmokeapp
  • ucrel.lancs.ac.uk/wmatrix .. a software tool for corpus analysis and comparison, providing a web interface to the UCREL USAS and CLAWS corpus annotation tools
  • XLE-Web .. interface for parsing and generating Lexical Functional Grammars, a rich GUI for writing and debugging
  • XTAG .. on-going project to develop a wide-coverage grammar for English using a lexicalized Tree Adjoining Grammar (TAG) formalism

Wikipedia:

  • Adaptive grammar .. a formal grammar that explicitly provides mechanisms within the formalism to allow its own production rules to be manipulated
  • Attribute grammar .. a formal way to define attributes for the productions of a formal grammar, associating these attributes with values
  • Bottom-up parsing .. attempts to identify the most fundamental units first, and then to infer higher-order structures
  • Case grammar .. a system of linguistic analysis, focusing on the link between the valence, or number of subjects, objects, etc., of a verb and the grammatical context it requires
  • Category:Grammar .. listing of all topics under grammar
  • Category:Grammar frameworks .. theories of grammar used to describe natural languages
  • Category:Parsing algorithms .. list of known parser types
  • Chomsky hierarchy .. a containment hierarchy of classes of formal grammars
  • Cognitive linguistics .. interprets language in terms of the concepts, sometimes universal, sometimes specific to a particular tongue
  • Combinatory categorial grammar .. generates constituency-based structures (as opposed to dependency-based ones) and is therefore a type of phrase structure grammar
  • Computational linguistics .. an interdisciplinary field dealing with the statistical or rule-based modeling of natural language
  • Constraint Grammar .. context dependent rules are compiled into a grammar that assigns grammatical tags
  • Construction grammar .. based on the idea that the primary unit of grammar is the grammatical construction rather than the atomic syntactic unit
  • Context-free grammar .. important in linguistics for describing the structure of sentences and words in natural language
  • Context-free grammars and parsers .. addresses difference between LR parsing and LL parsing within the history of compiler construction
  • Context-sensitive grammar .. a formal grammar in which the left-hand sides and right-hand sides of any production rules may be surrounded by a context
  • Controlled natural language .. subsets of natural languages, obtained by restricting the grammar and vocabulary in order to reduce or eliminate ambiguity
  • Definite clause grammar .. way of expressing grammar, either for natural or formal languages, in a logic programming language such as Prolog
  • Dependency grammar .. distinct from phrase structure grammars, since dependency grammars lack phrasal nodes
  • Formal grammar .. a set of formation rules for strings in a formal language
  • Functional discourse grammar .. top-level unit of analysis in functional discourse grammar is the discourse move, not the sentence or the clause
  • Generalized phrase structure grammar .. augments syntactic descriptions with semantic annotations that can be used to compute the compositional meaning
  • GLR parser .. an extension of an LR parser algorithm to handle nondeterministic and ambiguous grammars
  • Grammar checker .. a program, or part of a program, that attempts to verify written text for grammatical correctness
  • Grammar induction .. also known as grammatical inference or syntactic pattern recognition, refers to the process in machine learning of learning a formal grammar
  • Grammatical Framework .. a programming language for writing grammars of natural languages
  • Head-driven phrase structure grammar .. the basic type HPSG deals with is the “sign” .. words and phrases are two different subtypes of “sign”
  • Higher order grammar .. HOG maintains Haskell Curry’s distinction between tectogrammatical structure (deep or abstract syntax) and phenogrammatical structure (concrete syntax)
  • JSGF .. a textual representation of grammars for use in speech recognition
  • Language model .. a statistical language model assigns a probability to a sequence of m words
  • Lexical functional grammar .. mainly focuses on syntax, including its relation with morphology and semantics
  • Lexicon .. the vocabulary of a language, including its words and expressions
  • Link grammar .. builds relations between pairs of words, rather than constructing constituents in a tree-like hierarchy
  • Mental space .. building mental spaces and mappings between those mental spaces are two main processes involved in construction of meaning
  • Operator-precedence grammar .. allows precedence relations to be defined between the terminals of the grammar
  • Optimality theory .. models grammars as systems that provide mappings from inputs to outputs
  • Parsing .. the process of analyzing a text to determine its grammatical structure with respect to a given formal grammar
  • Parsing expression grammar .. a set of rules for recognizing strings in the language
  • Phrase structure grammar .. adherence to the constituency relation, as opposed to the dependency relation
  • Production rule .. a rewrite rule specifying a symbol substitution that can be recursively performed to generate new symbol sequences
  • Psycholinguistics .. makes use of biology, neuroscience, cognitive science, linguistics, and information theory to study how the brain processes language
  • Serialization .. the process of converting a data structure or object state into a format that can be stored
  • Simple precedence grammar .. a context-free formal grammar that can be parsed with a simple precedence parser
  • Speech Recognition Grammar Specification .. standard for how speech recognition grammars are specified
  • Stochastic context-free grammar .. a context-free grammar in which each production is augmented with a probability
  • Syntax .. the study of the principles and rules for constructing phrases and sentences
  • Systemic functional grammar .. refers to language as a network of systems for making meaning, and functional refers to the view that language is as it is because of what it evolved to do
  • Tokenization .. process of breaking a stream of text up into words, phrases, symbols, or other meaningful elements
  • Tree-adjoining grammar .. rules for rewriting the nodes of trees as other trees

References:

2008

  • Reusable grammatical resources for spatial language processing (2008) .. by Robert J. Ross

2002

  • Embedded grammar tags: advancing natural language interaction on the Web (2002) .. by Yaser Yacoob et al.

See also:

AIML & Grammars | CFG (Context-free Grammar) Parsers | Construction Grammar & Dialog Systems | FSG (Finite State Grammar) & Dialog Systems | Grammar Checkers & Dialog Systems | Grammar Compilers & Dialog Systems | Grammar Parsers & Dialog Systems | Grammar Parsing & Natural Language Generation | Grammar Trees & Dialog Systems | HPSG (Head-Driven Phrase Structure Grammar) & Dialog Systems | Link Grammar & Dialog Systems | Modular Grammar & Dialog Systems | PCFG (Probabilistic Context Free Grammar) & Dialog Systems | Semantic Grammars & Dialog Systems | Speech Grammars 2011 | SpeechBuilder Grammars | Syntactic Grammars & Dialog Systems | Template Grammars & Dialog Systems | XML Grammar & Dialog Systems