Wikipedia Corpora

To Do: Create a table comparing WikiQA and SQuAD, and say we are going to choose WikiQA (for the moment)



WikiQA: 

WikiQA is a set of question and sentence pairs, collected and annotated for research on open-domain QA. It includes questions for which there is no correct sentence, enabling researchers to work on answer triggering.
WikiQA introduced the answer triggering task and was, at its release, the only answer triggering dataset.
Question types
The questions are originally sampled from Bing query logs.
This corpus has 3047 questions (in raw data file WikiQA.tsv ). It contains both general and factoid questions. 
-HOW: eg. how did Athenians make money? how does interlibrary loan work? (NO WHY Questions)

-What, How many, Who, when, Where etc.
Date/Version
 Version 1.0: August 25, 2015
Answer triggering
- "We propose the answer triggering task, a new challenge for the question answering problem, which requires QA systems to: (1) detect whether there is at least one correct answer in the set of candidate sentences for the question; (2) if yes, select one of the correct answer sentences from the candidate sentence set."
- Answer triggering problem: given a question and a set of answer candidates, determine whether the candidate set contains any correct answer and, if so, select a correct answer as system output.
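The two-step decision above can be sketched as follows; the lexical-overlap scorer is a toy stand-in for a real QA model, and the names and threshold are illustrative, not from the WikiQA paper:

```python
def answer_trigger(question, candidates, score_fn, threshold):
    """Score every candidate sentence; return the best one only if it
    clears the threshold, otherwise abstain (return None)."""
    best_score, best_sentence = max(
        (score_fn(question, s), s) for s in candidates
    )
    return best_sentence if best_score >= threshold else None

# Toy scorer: fraction of question words appearing in the sentence.
def overlap_score(question, sentence):
    q = set(question.lower().split())
    s = set(sentence.lower().split())
    return len(q & s) / len(q)

candidates = [
    "Interlibrary loan is a service through which a library user "
    "can borrow material owned by another library",
    "Athens is the capital of Greece",
]
# Answerable question: the best sentence clears the threshold.
answer = answer_trigger(
    "how does interlibrary loan work", candidates, overlap_score, 0.3
)
# Unanswerable question: no candidate clears the threshold -> None.
no_answer = answer_trigger(
    "who painted the Mona Lisa", candidates, overlap_score, 0.3
)
```

The abstain branch (returning None) is what distinguishes triggering from plain answer sentence selection, which would always return the top-ranked sentence.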
Notes
"In the end, we included 3,047 questions and 29,258 sentences in
the dataset, where 1,473 sentences were labeled as answer sentences to their corresponding questions." This means that less than half of the questions have their relevant answers in the summary section of the Wikipedia articles. "Specifically, we find nearly two-thirds of questions contain no correct answers
in the candidate sentences". Some questions have more than one sentence as an answer eg. What is section eight housing?

"Although not used in the experiments, each of these answer sentence is associated with the answer phrase, which is defined as the shortest substring of the sentence that answers the question." Not clear.
Almost two thirds of the questions are factoid ones. I am not sure if we can consider description questions (HOW) as general ones.
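These per-question counts can be recomputed from the raw TSV. A minimal sketch, assuming the commonly distributed column layout (QuestionID, Question, DocumentID, DocumentTitle, SentenceID, Sentence, Label) and using a tiny in-memory excerpt in place of WikiQA.tsv:

```python
import csv
import io
from collections import defaultdict

# Toy two-question excerpt in the assumed WikiQA.tsv layout.
tsv = (
    "QuestionID\tQuestion\tDocumentID\tDocumentTitle\tSentenceID\tSentence\tLabel\n"
    "Q1\thow does interlibrary loan work\tD1\tInterlibrary loan\tD1-0\tInterlibrary loan is a service ...\t1\n"
    "Q1\thow does interlibrary loan work\tD1\tInterlibrary loan\tD1-1\tThe term document delivery ...\t0\n"
    "Q2\twhat is section eight housing\tD2\tSection 8\tD2-0\tSection 8 is a housing program ...\t0\n"
)

positives = defaultdict(int)  # question id -> number of correct sentences
for row in csv.DictReader(io.StringIO(tsv), delimiter="\t"):
    positives[row["QuestionID"]] += int(row["Label"])

n_questions = len(positives)
n_with_answer = sum(1 for n in positives.values() if n > 0)
# On the full corpus this yields 3,047 questions, 29,258 sentences,
# and roughly two-thirds of questions with no correct answer sentence.
```

Questions with `positives[...] == 0` are exactly the ones that make the answer triggering setting non-trivial.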


SQuAD: 


Stanford Question Answering Dataset (SQuAD) is a new reading comprehension dataset, consisting of questions posed by crowdworkers on a set of Wikipedia articles, where the answer to every question is a segment of text, or span, from the corresponding reading passage. With 100,000+ question-answer pairs on 500+ articles, SQuAD is significantly larger than previous reading comprehension datasets.

Question types

  • General Questions:

"If Roman numerals were used, what would Super Bowl 50 have been called?"
"Why are the small lakes in the parks emptied before winter?"
"How do centripetal forces act in relation to vectors of velocity?"

  • Factoid Questions:
How many, How old, What, Where, When, Who, etc.

Date/Version
2016
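Because every SQuAD answer is a span of the passage, each answer is stored with its character offset. A trimmed single-paragraph record in the style of the v1.1 JSON schema (the published file nests this deeper, under data → paragraphs → qas):

```python
import json

# One SQuAD-style record: the answer is a span of the context,
# located by its character offset `answer_start`.
record = json.loads("""
{
  "context": "Super Bowl 50 decided the NFL champion for the 2015 season.",
  "qas": [
    {
      "question": "Which season did Super Bowl 50 decide the champion for?",
      "answers": [{"text": "the 2015 season", "answer_start": 43}]
    }
  ]
}
""")

ans = record["qas"][0]["answers"][0]
start = ans["answer_start"]
span = record["context"][start:start + len(ans["text"])]
assert span == ans["text"]  # the span is recoverable from the offset
```

This span-based design is why SQuAD is usually treated as a reading comprehension (extraction) corpus rather than a sentence selection corpus like WikiQA.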


Why-question corpus


Dates from 2008; not available for download (download and university links are broken).

Questions:
What is the difference between a reading comprehension corpus and a question answering corpus?


SelQA - Answer Sentence Selection

"Our corpus is similar to WikiQA but covers more diverse topics, consists of a larger number of questions (about 6 times larger for answer sentence selection and 2.5 times larger for answer triggering), and makes use of more contexts by extracting contexts from the entire article instead of from only the abstract."

- Selection-based question answering is the task of selecting a segment of text, or interchangeably a context, from a provided set of contexts that best answers a posed question. It is subdivided into answer sentence selection and answer triggering.
- A context is a single document section, a group of contiguous sentences, or a single sentence.
- Answer sentence selection is defined as ranking the sentences that answer a question above the irrelevant ones, given that the provided set of candidate sentences contains at least one sentence answering the question.
- Answer triggering is defined as selecting any number (n >= 0) of sentences that answer a question from a set of candidate sentences, where the set may or may not contain such sentences.
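The contrast between the last two definitions can be made concrete in a few lines; the word-overlap scorer is again a toy stand-in for a trained model, and all names are illustrative:

```python
def select_answer(question, candidates, score_fn):
    # Answer sentence selection: at least one candidate is assumed
    # correct, so the system always returns the top-ranked sentence.
    return max(candidates, key=lambda s: score_fn(question, s))

def trigger_answers(question, candidates, score_fn, threshold):
    # Answer triggering: return every candidate scoring at or above
    # the threshold, which may legitimately be the empty list (n >= 0).
    return [s for s in candidates if score_fn(question, s) >= threshold]

# Toy scorer: number of shared lowercase words.
def count_overlap(question, sentence):
    return len(set(question.lower().split()) & set(sentence.lower().split()))

question = "what is the capital of France"
candidates = ["Paris is the capital of France", "Berlin is in Germany"]

best = select_answer(question, candidates, count_overlap)       # always one
hits = trigger_answers(question, candidates, count_overlap, 3)  # zero or more
misses = trigger_answers("who invented the telephone",
                         candidates, count_overlap, 3)          # empty list
```

Selection must commit to an answer even for an unanswerable question, while triggering is allowed to return nothing, which is exactly the extra challenge WikiQA and SelQA introduce.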
















