www.kips.or.kr Copyright© 2017 KIPS
Extraction of ObjectProperty-UsageMethod Relation from Web Documents
Chaveevan Pechsiri*, Sumran Phainoun*, and Rapeepun Piriyakul**
Abstract
This paper aims to extract an ObjectProperty-UsageMethod relation, in particular the
HerbalMedicinalProperty-UsageMethod relation of the herb-plant object, from several web documents. This
is a semantic relation between two related sets: a herbal-medicinal-property concept set and a
usage-method concept set. The HerbalMedicinalProperty-UsageMethod relation benefits people by providing
knowledge of alternative treatments for health problems. The research addresses three main problems: how
to identify an elementary discourse unit (EDU, a simple sentence or clause) that carries a
medicinal-property or usage-method concept; how to determine the usage-method boundary; and how to
determine the HerbalMedicinalProperty-UsageMethod relation between the two related sets. We propose using
N-Word-Co on the verb phrase with the medicinal-property/usage-method concept to solve the first and
second problems, where the N-Word-Co size is determined by learning with maximum entropy, support vector
machine, and naïve Bayes. We also apply naïve Bayes, with N-Word-Co elements as features, to solve the
third problem of determining the HerbalMedicinalProperty-UsageMethod relation. The results indicate that
the proposed approach can extract the HerbalMedicinalProperty-UsageMethod relation with high precision.
Keywords
Medicinal Property, N-Word-Co, Semantic Relation, Usage-Method
1. Introduction
The objective of this research is to extract an ObjectProperty-UsageMethod relation, especially a
HerbalMedicinalProperty-UsageMethod relation of an herb-plant object, from documents downloaded
from several websites. The downloaded documents contain the object names (i.e., the herb plant
names) as topic names, explanations of several kinds of properties (i.e., physical properties,
chemical properties, and medicinal properties), and the usage methods of the objects. The
explanatory content for herb plants is indigenous knowledge about curing certain diseases effectively,
even though some treatments with medicinal plants are time consuming. However, searching websites
for herb plant knowledge on both medicinal properties and usage methods to solve health problems
returns only a list of documents that the user has to read in order to extract the
required knowledge. Therefore, it is necessary to automatically extract the HerbalMedicinalProperty-
※ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0/) which
permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
Manuscript received March 8, 2017; first revision June 27, 2017; accepted June 28, 2017.
Corresponding Author: Chaveevan Pechsiri (chaveevan.pec@dpu.ac.th)
* College of Innovative Technology and Engineering, Dhurakijpundit University, Bangkok, Thailand (chaveevan.pec@dpu.ac.th,
sumran.pha@dpu.ac.th)
** Dept. of Computer Science, Ramkhamhaeng University, Bangkok, Thailand (rapepunnight@yahoo.com)
J Inf Process Syst, Vol.13, No.5, pp.1103~1125, October 2017
https://doi.org/10.3745/JIPS.04.0046
ISSN 1976-913X (Print), ISSN 2092-805X (Electronic)