Biomedical Engineering Reference
In-Depth Information
Table 20.2 Comparison of open source BI frameworks

|  | Pentaho | Jasper | Other components |
|---|---|---|---|
| Reports and charts | JFreeReports, JFreeCharts | JasperReports, JFreeCharts |  |
| Ad-hoc analysis (drilldowns, etc.) | Pentaho Analyzer, CDF, Weka, Saiku | JPivot |  |
| Workflow | Shark | NA (Spring Web Flow for UI WF); Scheduling: Quartz |  |
| Dashboards | CDF | Spring Web Flow, SiteMesh, Spring Security |  |
| APIs | JBoss/WSDL/SOAP | AXIS | RServe, JRI |
| Data integration/ETL | Kettle | Talend |  |
| Data quality/MDM | None | Talend |  |
| Big Data integration | Pentaho Data Integrator for Hadoop | JasperSoft Connectors for Big Data |  |
| Cubing | Mondrian | Mondrian |  |
| Statistics | Univariate Statistics plug-in, Weka, OpenBI R Analytics plug-in | RevoConnectR for JasperReports Server | R |
| Machine-learning/AI | Weka, OpenBI R Analytics plug-in | RevoConnectR for JasperReports Server | R, Mahout |

There are several open source options for ETL. The two predominant
ones are Pentaho Data Integrator, which is built on Kettle
(sponsored by Pentaho Corporation), and Talend, which is used by
JasperSoft. Pentaho Data Integrator seems like a very good option
for this problem set. Not only has it proven to be a solid choice for
relational mapping, it also seems to lead the pack in support for Hadoop
and MapReduce integration. The framework allows new mappings or rule
sets to be produced and evaluated against raw data sources, and that
work to be scaled economically across a cloud-based compute cluster. We
would recommend the following staged process:
 