Question generation using state-of-the-art Natural Language Processing algorithms
Questgen AI is an open-source NLP library focused on developing easy-to-use question generation algorithms.
It is on a quest to build the world's most advanced question generation AI, leveraging state-of-the-art transformer models such as T5, BERT, and OpenAI GPT-2.
Our online course teaches how to build these models from scratch and deploy them.
1. Multiple Choice Questions (MCQs)
2. Boolean Questions (Yes/No)
3. General FAQs
4. Paraphrasing any Question
5. Question Answering
pip install git+https://github.com/ramsrigouthamg/Questgen.ai
pip install git+https://github.com/boudinfl/pke.git
python -m nltk.downloader universal_tagset
python -m spacy download en
wget https://github.com/explosion/sense2vec/releases/download/v1.0.0/s2v_reddit_2015_md.tar.gz
tar -xvf s2v_reddit_2015_md.tar.gz
from pprint import pprint
import nltk
nltk.download('stopwords')
from Questgen import main
qe = main.BoolQGen()
payload = {
"input_text": "Sachin Ramesh Tendulkar is a former international cricketer from India and a former captain of the Indian national team. He is widely regarded as one of the greatest batsmen in the history of cricket. He is the highest run scorer of all time in International cricket."
}
output = qe.predict_boolq(payload)
pprint(output)
'Boolean Questions': ['Is sachin ramesh tendulkar the highest run scorer in '
'cricket?',
'Is sachin ramesh tendulkar the highest run scorer in '
'cricket?',
'Is sachin tendulkar the highest run scorer in '
'cricket?']
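The sample output above contains the same Boolean question twice. A minimal sketch of removing such repeats before showing the questions to a user follows; `unique_questions` is a hypothetical helper, not part of Questgen, and it assumes the `'Boolean Questions'` value is a plain list of strings, as in the sample.

```python
def unique_questions(questions):
    """Drop repeated questions, ignoring case and extra whitespace, keeping first-seen order."""
    seen = set()
    unique = []
    for question in questions:
        key = " ".join(question.lower().split())  # normalize case and spacing
        if key not in seen:
            seen.add(key)
            unique.append(question)
    return unique


# Sample list copied from the output shown above.
sample = [
    "Is sachin ramesh tendulkar the highest run scorer in cricket?",
    "Is sachin ramesh tendulkar the highest run scorer in cricket?",
    "Is sachin tendulkar the highest run scorer in cricket?",
]
deduped = unique_questions(sample)
print(deduped)
```

With real output, the call would be `unique_questions(output['Boolean Questions'])`.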
qg = main.QGen()
output = qg.predict_mcq(payload)
pprint(output)
{'questions': [{'answer': 'cricketer',
'context': 'Sachin Ramesh Tendulkar is a former international '
'cricketer from India and a former captain of the '
'Indian national team.',
'extra_options': ['Mark Waugh',
'Sharma',
'Ricky Ponting',
'Afridi',
'Kohli',
'Dhoni'],
'id': 1,
'options': ['Brett Lee', 'Footballer', 'International Cricket'],
'options_algorithm': 'sense2vec',
'question_statement': "What is Sachin Ramesh Tendulkar's "
'career?',
'question_type': 'MCQ'},
{'answer': 'india',
'context': 'Sachin Ramesh Tendulkar is a former international '
'cricketer from India and a former captain of the '
'Indian national team.',
'extra_options': ['Pakistan',
'South Korea',
'Nepal',
'Philippines',
'Zimbabwe'],
'id': 2,
'options': ['Bangladesh', 'Indonesia', 'China'],
'options_algorithm': 'sense2vec',
'question_statement': 'Where is Sachin Ramesh Tendulkar from?',
'question_type': 'MCQ'},
{'answer': 'batsmen',
'context': 'He is widely regarded as one of the greatest '
'batsmen in the history of cricket.',
'extra_options': ['Ashwin', 'Dhoni', 'Afridi', 'Death Overs'],
'id': 3,
'options': ['Bowlers', 'Wickets', 'Mccullum'],
'options_algorithm': 'sense2vec',
'question_statement': 'What is the best cricketer?',
'question_type': 'MCQ'}]}
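In the MCQ output above, the correct answer and the distractors (`options`) arrive in separate fields. One way to turn an entry into a presentable quiz item is sketched below; `build_quiz_item` is a hypothetical helper, not part of Questgen, and it assumes each entry carries the `answer`, `options`, and `question_statement` keys shown in the sample.

```python
import random


def build_quiz_item(question, rng=None):
    """Merge the answer with its distractors and shuffle them into one choice list."""
    if rng is None:
        rng = random.Random()
    choices = [question["answer"]] + list(question["options"])
    rng.shuffle(choices)
    return {
        "prompt": question["question_statement"],
        "choices": choices,
        # Position of the correct answer after shuffling.
        "correct_index": choices.index(question["answer"]),
    }


# One entry copied from the sample output above (only the fields used here).
sample_question = {
    "answer": "india",
    "options": ["Bangladesh", "Indonesia", "China"],
    "question_statement": "Where is Sachin Ramesh Tendulkar from?",
}

item = build_quiz_item(sample_question, rng=random.Random(0))
print(item["prompt"])
for label, choice in zip("ABCD", item["choices"]):
    print(f"{label}. {choice}")
```

Note that the sample answer is lowercase (`'india'`) while the distractors are title-cased, so normalizing the casing before display may be worthwhile.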
output = qg.predict_shortq(payload)
pprint(output)
{'questions': [{'Answer': 'cricketer',
'Question': "What is Sachin Ramesh Tendulkar's career?",
'context': 'Sachin Ramesh Tendulkar is a former international '
'cricketer from India and a former captain of the '
'Indian national team.',
'id': 1},
{'Answer': 'india',
'Question': 'Where is Sachin Ramesh Tendulkar from?',
'context': 'Sachin Ramesh Tendulkar is a former international '
'cricketer from India and a former captain of the '
'Indian national team.',
'id': 2},
{'Answer': 'batsmen',
'Question': 'What is the best cricketer?',
'context': 'He is widely regarded as one of the greatest '
'batsmen in the history of cricket.',
'id': 3}]
}
payload2 = {
"input_text" : "What is Sachin Tendulkar profession?",
"max_questions": 5
}
output = qg.paraphrase(payload2)
pprint(output)
{'Paraphrased Questions': ["ParaphrasedTarget: What is Sachin Tendulkar's "
'profession?',
"ParaphrasedTarget: What is Sachin Tendulkar's "
'career?',
"ParaphrasedTarget: What is Sachin Tendulkar's job?",
'ParaphrasedTarget: What is Sachin Tendulkar?',
"ParaphrasedTarget: What is Sachin Tendulkar's "
'occupation?'],
'Question': 'What is Sachin Tendulkar profession?'}
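Each paraphrase in the output above carries a `ParaphrasedTarget: ` prefix. A small sketch of stripping it follows; `clean_paraphrases` is a hypothetical helper, not part of Questgen, and it assumes the output dictionary has the `'Paraphrased Questions'` key shown in the sample.

```python
PREFIX = "ParaphrasedTarget: "


def clean_paraphrases(output):
    """Return the paraphrased questions with the leading marker removed."""
    cleaned = []
    for text in output["Paraphrased Questions"]:
        if text.startswith(PREFIX):
            text = text[len(PREFIX):]
        cleaned.append(text.strip())
    return cleaned


# Shortened copy of the sample output shown above.
sample_output = {
    "Paraphrased Questions": [
        "ParaphrasedTarget: What is Sachin Tendulkar's profession?",
        "ParaphrasedTarget: What is Sachin Tendulkar's career?",
    ],
    "Question": "What is Sachin Tendulkar profession?",
}
print(clean_paraphrases(sample_output))
```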
answer = main.AnswerPredictor()
payload3 = {
"input_text" : '''Sachin Ramesh Tendulkar is a former international cricketer from
India and a former captain of the Indian national team. He is widely regarded
as one of the greatest batsmen in the history of cricket. He is the highest
run scorer of all time in International cricket.''',
"input_question" : "Who is Sachin tendulkar ? "
}
output = answer.predict_answer(payload3)
print(output)
Sachin ramesh tendulkar is a former international cricketer from india and a former captain of the indian national team.
payload4 = {
"input_text" : '''Sachin Ramesh Tendulkar is a former international cricketer from
India and a former captain of the Indian national team. He is widely regarded
as one of the greatest batsmen in the history of cricket. He is the highest
run scorer of all time in International cricket.''',
"input_question" : "Is Sachin tendulkar a former cricketer? "
}
output = answer.predict_answer(payload4)
print(output)
Yes, sachin tendulkar is a former cricketer.
To keep the generated questions meaningful, Questgen uses three T5 models: one for Boolean question generation, one for MCQs, FAQs, and paraphrasing, and one for answer generation.
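The split across models can be summarized as a lookup from task to the class and method used in the examples above. This is only a sketch: the class and method names come from the usage examples in this README, while the task labels and the `classes_needed` helper are hypothetical. That each class wraps exactly one of the three T5 models is an assumption.

```python
# Task label (hypothetical) -> (class in Questgen.main, method), as used in the examples above.
TASKS = {
    "boolean": ("BoolQGen", "predict_boolq"),
    "mcq": ("QGen", "predict_mcq"),
    "faq": ("QGen", "predict_shortq"),
    "paraphrase": ("QGen", "paraphrase"),
    "answer": ("AnswerPredictor", "predict_answer"),
}


def classes_needed(tasks):
    """Return the distinct classes required to cover the requested tasks."""
    return sorted({TASKS[task][0] for task in tasks})


# MCQs, FAQs, and paraphrasing are all served by the same class.
shared = classes_needed(["mcq", "faq", "paraphrase"])
print(shared)
```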