operators. The entire flow can be viewed as a factory production line, with
raw data as input and model results as output. Each operator can be regarded as
a specific function with its own input and output characteristics.
•
KNIME (21.8 %): KNIME (Konstanz Information Miner) is a user-friendly,
open-source platform for data integration, data processing, data analysis,
and data mining [ 4 ]. It allows users to build data flows (pipelines) visually,
to selectively run some or all of the analytical steps, and to inspect the
resulting analyses, models, and interactive views. KNIME is written in
Java and built on Eclipse, and it offers additional functionality through
plug-ins. Through plug-ins, users can add processing modules for files,
images, and time series, and integrate with other open-source projects such as
R and Weka. KNIME covers data integration, cleansing, conversion, filtering,
statistics, mining, and finally data visualization. The entire development
process takes place in a visual environment. KNIME is designed as a modular,
extensible framework. Its processing units and data containers do not depend
on each other, which makes them suitable for distributed environments and
for independent development. KNIME is also easy to extend: developers
can add new nodes and views with little effort.
•
Weka/Pentaho (14.8 %): Weka, short for Waikato Environment for
Knowledge Analysis, is free, open-source machine learning and data mining
software written in Java. Weka provides data processing, feature selection,
classification, regression, clustering, association rule mining, and
visualization. Pentaho is one of the most popular open-source business
intelligence (BI) suites. It is a BI toolkit based on the Java platform and
includes a web server platform and several tools that support reporting,
analysis, charting, data integration, and data mining, covering all aspects of
BI. Weka's data processing algorithms are also integrated into Pentaho and can
be called directly.
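The idea that runs through these tools, a data flow built from operators (or nodes) that each take data in and pass data on, can be illustrated with a minimal sketch. This is not the API of KNIME, Weka, or any other tool named above; the function names (drop_missing, add_ratio, keep_large, run_flow) and the sample records are hypothetical, chosen only to show cleansing, conversion, and filtering steps chained like a production line.

```python
from typing import Callable, Dict, List

Record = Dict[str, object]
# Assumption: an "operator" is modeled as a plain function from records to records.
Operator = Callable[[List[Record]], List[Record]]


def drop_missing(records: List[Record]) -> List[Record]:
    """Cleansing step: discard records that contain a missing (None) value."""
    return [r for r in records if all(v is not None for v in r.values())]


def add_ratio(records: List[Record]) -> List[Record]:
    """Conversion step: derive a new feature from two existing ones."""
    return [{**r, "ratio": r["x"] / r["y"]} for r in records]


def keep_large(records: List[Record]) -> List[Record]:
    """Filtering step: keep records whose derived ratio exceeds 1."""
    return [r for r in records if r["ratio"] > 1.0]


def run_flow(records: List[Record], operators: List[Operator]) -> List[Record]:
    """Pass the data through each operator in order, like a production line."""
    for op in operators:
        records = op(records)
    return records


# Hypothetical input data for illustration only.
raw = [
    {"x": 4.0, "y": 2.0},
    {"x": 1.0, "y": 5.0},
    {"x": None, "y": 3.0},
]

result = run_flow(raw, [drop_missing, add_ratio, keep_large])
print(result)
```

Because each operator depends only on its input and output, any step can be swapped, removed, or run on its own by changing the list passed to run_flow, which mirrors the selective execution and modular design described above.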
References
1. Theodore Wilbur Anderson. An introduction to multivariate statistical analysis, volume 2.
Wiley, New York, 1958.
2. Xindong Wu, Vipin Kumar, J Ross Quinlan, Joydeep Ghosh, Qiang Yang, Hiroshi Motoda,
Geoffrey J McLachlan, Angus Ng, Bing Liu, Philip S Yu, et al. Top 10 algorithms in data mining.
Knowledge and Information Systems, 14(1):1-37, 2008.
3. What analytics, data mining, big data software you used in the past 12 months for a real project?
http://www.kdnuggets.com/polls/2012/analytics-data-mining-big-data-software.html, 2012.
4. Michael R Berthold, Nicolas Cebron, Fabian Dill, Thomas R Gabriel, Tobias Kötter, Thorsten
Meinl, Peter Ohl, Christoph Sieb, Kilian Thiel, and Bernd Wiswedel. KNIME: The Konstanz
information miner. Springer, 2008.