International Journal of Soft Computing and Engineering (IJSCE)
ISSN: 2231-2307, Volume-5 Issue-1, March 2015
Retrieval Number: A2517035115/2015©BEIESP
Published By:
Blue Eyes Intelligence Engineering
& Sciences Publication
An Advanced Precision based Approach to String
Transformation
B. Sankara Babu, K. Rajasekhar Rao, P. Satheesh
Abstract: String transformation poses distinct challenges in
natural language processing, knowledge engineering, information
retrieval, genetic informatics, computational molecular biology
and data mining. Given an input string, the system automatically
produces the top k output strings relevant to that input. People
commonly make spelling errors, such as accidentally misspelling
words while surfing the web. To correct such errors, this paper
proposes an advanced precision based approach to string
transformation. In the proposed system, a unique precision value
is allocated to each letter of the alphabet, and these values are
aggregated to give the Total Precision of a particular word. Data
sets are trained with the precision based approach by validating
them against a dictionary, called the database. The precision of
a misspelled word is compared with the precisions in the data
sets, and the top k nearest neighbour output strings relevant to
the input string are retrieved. The approach corrects misspelled
words and sentences with high accuracy and has been validated
experimentally on large data sets.
Keywords: String Transformation, Precision based Approach,
Misspelled words, Total Precision.
I. INTRODUCTION
String transformation is the set of transformations required
to replace a source string with one or more destination
strings. In natural language processing, string transformation
appears as misspelled word correction, word metamorphosis and
word transition. In information retrieval, it deals with
retrieving the records relevant to a query. In genetic
informatics, it is used to detect disease in a person by using
standardized DNA patterns. In data mining, it is used for
mining metonyms and antonyms. In string transformation, the
input string is a string made up of words, groups of letters
or tokens. The top k corrected output strings are produced by
applying various transformations to a misspelled input string.
Output strings are the corrected strings generated by
referring to the dictionary. A transformation is the series of
operations that replaces the input string with different
output strings. Correction of a misspelled word is done in two
steps. The first step is keyword generation: the misspelled
word is corrected and distinct candidate keywords are
generated.
Manuscript Received on February 16, 2015.
B. Sankara Babu, Assoc. Prof., Department of CSE, Gokaraju
Rangaraju Institute of Engineering and Technology Hyderabad, India.
Dr. K. Rajasekhar Rao, Prof., Department of CSE, Koneru Laxmaiah
University Guntur, India.
Dr. P. Satheesh, Assoc. Prof., Department of CSE, Maharaj Vijayaram
Gajapathiraj College of Engineering Vizianagaram, India.
For example, consider the misspelled word “Thught”. The
original word is “Thought”, so the letter ‘o’ must be inserted
to turn the misspelled word into the correct word. Similarly,
many other words related to the misspelled word are generated.
The second step is best keyword selection: the best keyword is
selected from the distinct keywords based on priority.
Misspelled words are words in which one character or a group
of characters is wrongly typed, so that certain modifications
are needed to make them correct. Table 1 describes the
correction of misspelled words. The table lists wrong words
that are corrected into the original words by operations such
as insertion, deletion and substitution.
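The two steps described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the tiny dictionary, its priority scores, and the function names generate_keywords and select_best_keyword are all hypothetical, and candidates are limited to words one insertion, deletion or substitution away from the input.

```python
import string

# Hypothetical dictionary: word -> priority score (made-up values, not from the paper).
DICTIONARY = {"thought": 50, "though": 40, "taught": 30, "tight": 20}


def generate_keywords(word, dictionary):
    """Step 1: dictionary words one insertion, deletion or substitution away."""
    word = word.lower()
    letters = string.ascii_lowercase
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    inserts = {a + c + b for a, b in splits for c in letters}
    deletes = {a + b[1:] for a, b in splits if b}
    substitutes = {a + c + b[1:] for a, b in splits if b for c in letters}
    # Validate the generated strings against the dictionary.
    return (inserts | deletes | substitutes) & set(dictionary)


def select_best_keyword(candidates, dictionary):
    """Step 2: pick the candidate with the highest (assumed) priority."""
    if not candidates:
        return None
    return max(candidates, key=lambda w: dictionary[w])


candidates = generate_keywords("Thught", DICTIONARY)
best = select_best_keyword(candidates, DICTIONARY)  # "thought"
```

With this toy dictionary, “Thught” yields the candidates “thought” (one insertion) and “taught” (one substitution), and the higher assumed priority selects “thought”. How priority is actually assigned is not specified in this section, so the scores here are placeholders.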
Table 1. Correction of misspelled words.
Misspelled     Original      Wrongly typed   Correction   Action
word           word          character                    performed
Flowr          Flower        -               e            inserted
Mengo          Mango         e               a            substituted
Beauetiful     Beautiful     e               -            deleted
Technlgy       Technology    -               o, o         inserted
Innovatieve    Innovative    e               -            deleted
Approech       Approach      e               a            substituted
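The kind of analysis summarised in Table 1 (which characters were wrongly typed and which operation corrects them) can be derived mechanically by aligning the two words. The sketch below uses Python's standard difflib module for the alignment. This is an assumption made for illustration, not the method of the paper, and the function name describe_correction is hypothetical.

```python
from difflib import SequenceMatcher

# Map difflib opcode tags to the action names used in the table.
ACTIONS = {"insert": "inserted", "delete": "deleted", "replace": "substituted"}


def describe_correction(misspelled, original):
    """Return a list of (wrong characters, correction, action) tuples."""
    edits = []
    matcher = SequenceMatcher(None, misspelled, original, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue  # matching runs need no correction
        wrong = misspelled[i1:i2] or "-"  # "-" marks an empty cell
        fix = original[j1:j2] or "-"
        edits.append((wrong, fix, ACTIONS[tag]))
    return edits


print(describe_correction("Flowr", "Flower"))  # [('-', 'e', 'inserted')]
```

A word needing several edits, such as “Technlgy”, produces one tuple per edit, so the two missing letters appear as two separate insertions.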
Misspelled words in a Word document are corrected
syntactically and semantically by implicitly using the ABC
Spelling and Grammar feature.
Figure 1: Correction of a misspelled word in word pad.