How do I design a system to query the database based on natural language input?
This may be referred to as NLDB (Natural Language Database) or NLIDB (Natural Language Interface to Databases). Search in general is more or less based on natural language processing. Most, if not all, dialog systems are more or less a form of search. Most dialog systems are either based on one or another markup language (so-called pattern matching), statistically/probabilist
In pattern matching dialog systems based on a markup language (such as the prototypical AIML), the annotations function as triggers for the search algorithm. For instance, patterns can be combined in different ways, such as with wildcards, to create the illusion of natural language dialog. In this way, a pattern matching wizard can pull the wool over someone’s eyes in order to win a Turing test, such as the Loebner Prize. (The very reason “natural language” is “natural” is that the human unconscious is ruled by “mind games”, theater, and mythology….)
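To make the wildcard idea concrete, here is a minimal Python sketch. It is not AIML and does not use any AIML library; the pattern list, the `count:`/`list:` response templates, and the function names are all made up for illustration. It only shows how a pattern with `*` wildcards can act as a trigger and capture the words it matched.

```python
import re

# Hypothetical AIML-like patterns (not real AIML syntax).
# "*" is a wildcard that captures one or more words.
PATTERNS = [
    ("HOW MANY * ARE THERE", "count:{0}"),
    ("SHOW ME * FROM *", "list:{0}|{1}"),
]

def normalize(text):
    """Uppercase the input and replace punctuation with spaces."""
    return " ".join(re.sub(r"[^\w\s]", " ", text).upper().split())

def compile_pattern(pattern):
    """Turn a wildcard pattern into an anchored regular expression."""
    parts = [r"(.+?)" if tok == "*" else re.escape(tok) for tok in pattern.split()]
    return re.compile(r"^" + r"\s+".join(parts) + r"$")

def match(text):
    """Return the filled-in template of the first pattern that fires, else None."""
    norm = normalize(text)
    for pattern, template in PATTERNS:
        m = compile_pattern(pattern).match(norm)
        if m:
            return template.format(*m.groups())
    return None
```

For example, `match("How many customers are there?")` fires the first pattern and captures `CUSTOMERS`. Anything that fits no pattern returns `None`, which is exactly where such systems stop looking "natural".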
In short, at this point I don’t know of any practical way of getting around annotating your data, or corpus. It is the annotation of data, or metadata, that turns a “database” into a “knowledgebase”. (I have yet to see, much less test, any functional dialog system based primarily on theoretical deep learning, or layered neural networks.)
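As a rough illustration of what such annotation can look like for a database, the sketch below maps everyday words onto table and column names and builds a parameterized SQL query from them. Everything here is an assumption made for the example: the `customer` table, the synonym dictionaries, the "whose X is Y" phrasing, and the function names. A real system would need far richer metadata.

```python
import sqlite3

# Hypothetical annotations: everyday words mapped onto schema names.
TABLE_SYNONYMS = {"customers": "customer", "clients": "customer"}
COLUMN_SYNONYMS = {"city": "city", "town": "city", "name": "name"}

def build_query(question):
    """Translate a question like 'List clients whose town is Berlin?' into SQL.

    Returns (sql, params), or None if no annotated table word is found.
    """
    words = question.strip(" ?!.").split()
    lowered = [w.lower() for w in words]
    table = next((TABLE_SYNONYMS[w] for w in lowered if w in TABLE_SYNONYMS), None)
    column = next((COLUMN_SYNONYMS[w] for w in lowered if w in COLUMN_SYNONYMS), None)
    if table is None:
        return None
    if column is None or "is" not in lowered:
        return f"SELECT name FROM {table} ORDER BY name", ()
    # Table and column names come only from the annotation dictionaries;
    # the user-supplied value is passed as a bound parameter.
    value = " ".join(words[lowered.index("is") + 1:])
    return f"SELECT name FROM {table} WHERE {column} = ? ORDER BY name", (value,)

def demo():
    """Run one annotated question against a small in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customer (name TEXT, city TEXT)")
    conn.executemany(
        "INSERT INTO customer VALUES (?, ?)",
        [("Ada", "Berlin"), ("Bob", "Paris"), ("Cy", "Berlin")],
    )
    sql, params = build_query("List clients whose town is Berlin?")
    names = [row[0] for row in conn.execute(sql, params)]
    conn.close()
    return names
```

The point of the sketch is that the "understanding" lives entirely in the two dictionaries: without the annotation that "clients" means the `customer` table and "town" means the `city` column, the question cannot be mapped to a query at all.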
See the answers to my Quora question:
See also my quick and dirty webpages: