Partha Pakray, International Journal of Computer Science and Mobile Computing, Vol.3 Issue.6, June- 2014, pg. 528-534
© 2014, IJCSMC All Rights Reserved 528
Available Online at www.ijcsmc.com
International Journal of Computer Science and Mobile Computing
A Monthly Journal of Computer Science and Information Technology
ISSN 2320–088X
IJCSMC, Vol. 3, Issue. 6, June 2014, pg.528 – 534
RESEARCH ARTICLE
Text Grouping using Textual Entailment
Partha Pakray
Department of Computer & Information Science (IDI)
Norwegian University of Science and Technology (NTNU)
Trondheim, Norway
www.parthapakray.com
partha.pakray@idi.ntnu.no
Abstract: Textual Entailment is an important field in the Natural Language Processing domain. Given two
texts called T (Text) and H (Hypothesis), textual entailment recognition is the task of deciding whether
the meaning of H can be logically inferred from that of T. A Textual Entailment (TE) system has been developed
and tested on various standard entailment datasets. In this work, the TE system is applied to a collection of
texts in order to group them, assigning each text to a single group. A corpus of 3540 sentences in 10 groups
was created for this experiment. The TE system achieves an F-score of 61% and correctly detects 8 of the
10 groups.
Keywords: Natural Language Processing, Textual Entailment, ReVerb, Support Vector Machine
I. INTRODUCTION
The Natural Language Processing (NLP) community has devoted considerable effort to developing advanced
methodologies for Textual Entailment (TE), which is considered a core NLP task. Since 2005, various
international conferences and evaluation tracks on Textual Entailment have been held, notably at
PASCAL (Pattern Analysis, Statistical Modelling and Computational Learning)¹, the Text Analysis Conferences
(TAC)² organized by the United States National Institute of Standards and Technology (NIST), the Evaluation
Exercises on Semantic Evaluation (SemEval)³, and the National Institute of Informatics Test Collection for
Information Retrieval Systems (NTCIR)⁴. Textual entailment can be defined more formally [1] as follows:
A text T entails a hypothesis H, if H is true in every circumstance in which T is true.
A text T entails a hypothesis H if, typically, a human reading T would infer that H is most likely true.
For example, the text T = “John’s assassin is in jail” entails the hypothesis H = “John is dead”: if a
person has an assassin, then that person is dead. Similarly, T = “Mary lives in France” entails H = “Mary
lives in Europe”. On the other hand, T = “Mary lives in Europe” does not entail H = “Mary lives in the US”.
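The first, strict definition above can be illustrated with a toy sketch. It assumes each statement is represented by the set of circumstances in which it is true, so that T entails H exactly when the circumstances of T form a subset of those of H. The country lists and the names `circumstances` and `entails` are hypothetical and chosen only for illustration; this is not how the TE system used in this paper decides entailment.

```python
# Toy model (an assumption for illustration): a "circumstance" is the country
# in which Mary lives, and a statement is the set of circumstances where it holds.
EUROPE = {"France", "Germany", "Norway"}
ALL_COUNTRIES = EUROPE | {"US", "India"}


def circumstances(predicate):
    """Return every circumstance in which the statement (predicate) is true."""
    return {c for c in ALL_COUNTRIES if predicate(c)}


def entails(t_worlds, h_worlds):
    """T entails H if H is true in every circumstance in which T is true."""
    return t_worlds <= h_worlds  # subset test


lives_in_france = circumstances(lambda c: c == "France")
lives_in_europe = circumstances(lambda c: c in EUROPE)
lives_in_us = circumstances(lambda c: c == "US")

print(entails(lives_in_france, lives_in_europe))  # True
print(entails(lives_in_europe, lives_in_us))      # False
print(entails(lives_in_europe, lives_in_france))  # False
```

The last call shows that entailment is directional: living in France implies living in Europe, but not the reverse. The second, "most likely true" definition is weaker and cannot be captured by a strict subset test of this kind.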
The main focus of this experiment is to show that text grouping (i.e., clustering) can be performed using
Textual Entailment. The experiment uses a TE system developed previously by the author, which has participated
in various Recognising Textual Entailment (RTE) Challenges and has been tested on RTE datasets. This TE system
has been successfully applied to the Question Answering (QA) domain and has participated in the QA track
(QA4MRE) at the Conference and Labs of
¹ http://pascallin.ecs.soton.ac.uk/Challenges/
² http://www.nist.gov/tac/tracks/index.html
³ http://semeval2.fbk.eu/semeval2.php
⁴ http://research.nii.ac.jp/ntcir/ntcir-9/