1.Introduction
Watson is a question answering (QA) system developed by IBM to compete against human opponents in Jeopardy, a well-known television quiz show with a twist: contestants are given an answer from a selected category and must phrase the corresponding question. Jeopardy is considered a hard game for computers because its questions are unstructured and span a vast number of domains and types.
2.History
Before the start of the Watson project, the team at IBM worked on PIQUANT, a static QA system with a predetermined set of answers in a limited domain. Another project that helped the team work on Watson is UIMA, which stands for Unstructured Information Management Architecture.
PIQUANT provided a baseline measure during the project, with an attempt rate of 70% and 16% correctness. To win against Jeopardy grand champions, the team estimated that Watson had to attempt 70% of questions with 85% correctness.
3.DeepQA
DeepQA, which constitutes the major portion of the Watson project, is responsible for understanding a question, decomposing it, finding candidate answers, and scoring them based on the evidence found for each candidate.
3.1.Question Analysis - How Watson Reads a Clue
DeepQA never assumes that any component perfectly understands the question. It defers commitment to an answer while more evidence is gathered. DeepQA applies hundreds of algorithms that analyze evidence along different measures, called features, and each produces a score indicating the degree to which a piece of evidence supports an answer according to that feature. The features are then combined to produce a ranked list of candidate answers, each with a score corresponding to its probability of being correct. Watson then determines whether to attempt an answer or pass the round. This decision is based on whether the top candidate answer has a confidence probability above a certain threshold.
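The combine, rank, and decide steps described above can be sketched roughly as follows. This is a minimal illustration, not Watson's actual implementation: the feature names, the weights, the bias, the logistic combination, and the threshold value are all assumptions made for this example.

```python
import math

# Hypothetical feature weights and bias; a real system would learn these from data.
WEIGHTS = {"passage_support": 2.0, "type_match": 1.5, "popularity": 0.5}
BIAS = -2.0
THRESHOLD = 0.5  # hypothetical confidence threshold for attempting an answer


def confidence(features):
    """Combine per-feature evidence scores into one probability-like confidence."""
    z = BIAS + sum(WEIGHTS[name] * features.get(name, 0.0) for name in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-z))  # logistic function maps z into (0, 1)


def rank_candidates(candidates):
    """Return (answer, confidence) pairs sorted from most to least confident."""
    scored = [(answer, confidence(feats)) for answer, feats in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def decide(candidates, threshold=THRESHOLD):
    """Return the top answer if its confidence clears the threshold, else None (pass)."""
    ranked = rank_candidates(candidates)
    if not ranked:
        return None
    answer, conf = ranked[0]
    return answer if conf >= threshold else None


# Made-up candidate answers with made-up feature scores.
candidates = {
    "Chicago": {"passage_support": 0.9, "type_match": 1.0, "popularity": 0.6},
    "Toronto": {"passage_support": 0.3, "type_match": 0.0, "popularity": 0.7},
}
ranked = rank_candidates(candidates)
print(ranked)
print(decide(candidates))
```

Raising the threshold makes the system pass more often but answer more accurately when it does attempt, which is the trade-off between attempt rate and correctness mentioned in the History section.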
This is done with a novel method the authors call “Supporting Evidence Retrieval”, in which different queries are performed for each candidate answer.
The retrieved passages are then scored using different algorithms.
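The idea of running a separate retrieval for each candidate and then scoring the retrieved passages can be sketched as below. This is a toy illustration under stated assumptions: the in-memory corpus, the function names, and the single term-overlap scorer are hypothetical stand-ins for the search engine and the many scoring algorithms a system like DeepQA would use.

```python
import re


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve_supporting_passages(candidate, corpus):
    """Candidate-specific retrieval: keep only passages that mention the candidate."""
    cand_terms = tokenize(candidate)
    return [p for p in corpus if cand_terms <= tokenize(p)]


def overlap_score(question, passage):
    """One toy scorer: the fraction of question terms that appear in the passage."""
    q_terms = tokenize(question)
    if not q_terms:
        return 0.0
    return len(q_terms & tokenize(passage)) / len(q_terms)


def evidence_score(question, candidate, corpus):
    """Score a candidate by its best supporting passage (0.0 if none is found)."""
    passages = retrieve_supporting_passages(candidate, corpus)
    return max((overlap_score(question, p) for p in passages), default=0.0)


# Illustrative toy corpus and clue, not real system data.
corpus = [
    "Chicago's largest airport, O'Hare, is named for a World War II hero.",
    "Toronto is the largest city in Canada.",
    "Midway airport in Chicago is named for a World War II battle.",
]
question = "Its largest airport is named for a World War II hero"
scores = {c: evidence_score(question, c, corpus) for c in ["Chicago", "Toronto"]}
print(scores)
```

In a fuller system, each scoring algorithm would contribute its own feature value per candidate, and those values would feed the feature-combination step rather than being used directly as a final score.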
3.8.Relation Extraction and Scoring in DeepQA
3.9.Structured Data and Inference in DeepQA
To be used by DeepQA, unstructured data requires a precise translation into a formal representation, and the underlying structured data must be encoded in a form suitable for answering questions. Together, these two issues make it impractical to use structured data as the only source, which is why DeepQA relies mostly on unstructured data. Answers are rarely found in structured data, but when they are, they are relatively more precise.
Structured data in DeepQA falls into four types: large online databases such as Wikipedia; large collections of data automatically extracted from unstructured sources; a small amount of manually added sources to account for differences between the task domain and the source; and a small amount of manually added formal knowledge targeting the most common questions and answers.
3.10.Special Questions and Techniques
3.11.Identifying Implicit