© May 2019 | IJIRT | Volume 5 Issue 12 | ISSN: 2349-6002
IJIRT 148026 INTERNATIONAL JOURNAL OF INNOVATIVE RESEARCH IN TECHNOLOGY 152
Python Libraries and Packages for Data Mining-A Survey
S. Sangeetha¹, Dr. S. Saradhambekai²
¹PG Student, Department of Information Technology, PSG College of Technology, Coimbatore-4, India
²Assistant Professor (Sr. Gr), Department of Information Technology, PSG College of Technology,
Coimbatore-4, India
Abstract- Python is a scripting language whose simple,
easy-to-learn syntax emphasizes readability and
therefore reduces the cost of program maintenance. It
is also an interpreted, object-oriented, high-level
programming language with dynamic semantics. Using
Python packages for data mining helps improve
security, customer acquisition, and planning. It also
helps analysts analyse the data of a particular
organisation.
In this paper, various papers that use Python modules
and libraries for data mining are surveyed and analyzed
using metrics such as performance, reliability and
stability.
Index Terms- scripting, platform independent, flexible,
troubleshooting, performance, reliability, stability,
secure, ease of use.
I. INTRODUCTION
This survey paper describes the various Python
packages that are used in data mining and that can
substantially improve the performance of data mining
tasks.
II. SIGNIFICANCE OF PYTHON
The significance of Python is described as follows:
a. Python is a portable language.
b. Python is a beginner-friendly language.
c. Python is an object-oriented scripting language.
d. Python is a high-level programming language.
e. Python provides interfaces to most major databases.
f. Python is an interactive language.
g. Python supports GUI programming.
h. Python is highly portable and cross-platform
compatible on UNIX, Windows and Macintosh.
III. SIGNIFICANCE OF DATA MINING
The significance of data mining is described as
follows:
a. Improves decision making.
b. Improves security risk posture.
c. Improves forecasting and planning.
d. Provides competitive advantage.
e. Supports customer acquisition.
f. Reduces cost.
g. Expands customer relationships.
h. Opens new revenue streams.
i. Supports development of new products.
IV. PYTHON LIBRARIES AND PACKAGES FOR
DATA MINING
a. NumPy
NumPy is a fundamental package for scientific
computing in Python. It extends the Python
programming language with support for large,
multi-dimensional arrays and matrices, along with a
large library of high-level mathematical functions
that operate on these arrays.
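As a minimal sketch of the array operations described
above, the following example builds a small
two-dimensional array and applies a few of NumPy's
mathematical functions to it. The sample values and
variable names are illustrative assumptions and are not
taken from the surveyed papers.

```python
import numpy as np

# Hypothetical 3x2 data set: rows are samples, columns are features.
data = np.array([[1.0, 2.0],
                 [3.0, 4.0],
                 [5.0, 6.0]])

# Column-wise mean and (population) standard deviation, no explicit loops.
col_mean = data.mean(axis=0)
col_std = data.std(axis=0)

# Broadcasting applies the per-column statistics to every row at once.
standardized = (data - col_mean) / col_std

# Matrix product of the transposed data with itself (a 2x2 matrix).
gram = data.T @ data
```

Standardizing features in this way is a common
preprocessing step before mining algorithms are applied.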
b. SciPy
SciPy is free and open-source software for
engineering, mathematics, and science. The SciPy
library is based on NumPy, which provides convenient
and fast N-dimensional array manipulation. The
SciPy library is built to work with NumPy arrays
and provides user-friendly and efficient numerical
routines, such as routines for numerical
integration and optimization. Together, they
run on all popular operating systems, are quick
to install, and are free of charge.
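As a minimal sketch of the kind of numerical routines
SciPy offers, the following example uses
scipy.integrate.quad for numerical integration and
scipy.optimize.minimize for optimization. The integrand,
the objective function, and the starting point are
illustrative assumptions chosen because their exact
answers are known.

```python
import numpy as np
from scipy import integrate, optimize

# Numerical integration: integrate sin(x) over [0, pi]; the exact value is 2.
area, abs_error = integrate.quad(np.sin, 0.0, np.pi)

# Optimization: minimize a simple quadratic whose minimum lies at x = 3.
def objective(x):
    return (x[0] - 3.0) ** 2 + 1.0

# x0 is an arbitrary starting guess; the default solver is used.
result = optimize.minimize(objective, x0=np.array([0.0]))
```

Here area holds the estimated integral and result.x
holds the location of the minimum found by the solver.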
c. Pandas
Pandas is a Python package that provides fast,
flexible, and expressive data structures designed to
make working with “relational” or “labeled” data both
easy and intuitive. It aims to be the fundamental
high-level building block for doing practical, real