What are the key challenges in designing/developing an effective question answering system?


You first have to define what kind of question answering system you mean. Many people refer to the software behind QA boards and forums, such as Quora, as a “question answering system”. QA boards and forums are generally a form of crowd-sourced QA, which falls under the rubric of collective intelligence. Then there are fully automated question answering systems, such as Siri, which are generally classed as artificial intelligence. Often, question-answer pairs are originally crowd-sourced and then subsequently fed into an AI system.

Increasingly, at the enterprise level, CRM support systems start with an in-house “knowledgebase” that has previously been “crowd-sourced”, for instance something like a “FAQ”, which is essentially a set of question-answer pairs. Hybrid AI may then be applied to that knowledgebase in the form of a question answering system. On top of this, live support agents (people) may be integrated *transparently* to the end user to handle escalations from the AI, in the manner of a “mechanical turk”. Thus, question-answer “corpora” serve as a kind of crowd-sourced “training” for the AI.
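To make the hybrid setup above concrete, here is a minimal sketch in Python. It assumes a tiny FAQ of question-answer pairs, scores an incoming question by simple word overlap (Jaccard similarity), and hands off to a person when no entry matches well enough. The FAQ entries, the `escalate_to_agent` helper, and the 0.5 threshold are all hypothetical, chosen only for illustration; a real system would use much stronger matching than word overlap.

```python
import re
from dataclasses import dataclass


@dataclass
class Answer:
    text: str
    source: str   # "faq" if answered automatically, "human" if escalated
    score: float  # best similarity score found in the FAQ


# Hypothetical in-house knowledgebase: question -> answer pairs.
FAQ = {
    "How do I reset my password?": "Use the 'Forgot password' link on the sign-in page.",
    "How do I cancel my subscription?": "Open Settings, then Billing, then choose Cancel.",
}


def tokens(text):
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Jaccard similarity: shared words divided by all distinct words."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def escalate_to_agent(question):
    """Placeholder for handing the question to a live support agent."""
    return f"[queued for a support agent] {question}"


def answer(question, threshold=0.5):
    """Answer from the FAQ when the match is good enough, else escalate."""
    q = tokens(question)
    best_question, best_score = None, 0.0
    for known in FAQ:
        score = jaccard(q, tokens(known))
        if score > best_score:
            best_question, best_score = known, score
    if best_question is not None and best_score >= threshold:
        return Answer(FAQ[best_question], "faq", best_score)
    # Low confidence: route to a person, transparently to the end user.
    return Answer(escalate_to_agent(question), "human", best_score)
```

Calling `answer("how do I reset my password")` is served from the FAQ, while an unrelated question such as `answer("Why is my invoice wrong?")` falls below the threshold and is escalated. Answers written by the human agents could then be added back to the FAQ, which is the sense in which the corpus "trains" the system.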

Make no mistake, Quora co-founder Adam D’Angelo is widely reported to have invested in Vicarious Systems, which aims to build “software that thinks and learns like a human”. The center of gravity in QA [1] research has been the ongoing series of workshops known as TREC (Text REtrieval Conference) [2], sponsored by US government agencies; for instance, IBM Watson [3] stems largely from IBM’s involvement in TREC.

[1] http://en.wikipedia.org/wiki/Question_answering

[2] http://en.wikipedia.org/wiki/Text_Retrieval_Conference

[3] http://www.mendicott.com/2011/01/how-many-playstations-make-watson.html