worldwide. With the proliferation of data islands (or spreadmarts), the need to
centralize the data is more pressing than ever.
As data needs grew, more scalable data warehousing solutions emerged. These
technologies enabled data to be managed centrally, providing benefits of security,
failover, and a single repository where users could rely on getting an “official”
source of data for financial reporting or other mission-critical tasks. This structure
also enabled the creation of OLAP cubes and BI analytical tools, which provided
quick access to a set of dimensions within an RDBMS. More advanced features
enabled performance of in-depth analytical techniques such as regressions and
neural networks. Enterprise Data Warehouses (EDWs) are critical for reporting
and BI tasks and solve many of the problems that proliferating spreadsheets
introduce, such as which of multiple versions of a spreadsheet is correct.
EDWs—and a good BI strategy—provide direct data feeds from sources that are
centrally managed, backed up, and secured.
Despite the benefits of EDWs and BI, these systems tend to restrict the flexibility
needed to perform robust or exploratory data analysis. With the EDW model, data
is managed and controlled by IT groups and database administrators (DBAs), and
data analysts must depend on IT for access and changes to the data schemas.
This imposes longer lead times for analysts to get data; most of the time is spent
waiting for approvals rather than starting meaningful work. Additionally, EDW
rules often restrict analysts from building datasets. Consequently, it is
common for additional systems to emerge containing critical data for constructing
analytic datasets, managed locally by power users. IT groups generally dislike the
existence of data sources outside of their control because, unlike an EDW, these
datasets are not managed, secured, or backed up. From an analyst's perspective,
EDW and BI solve problems related to data accuracy and availability. However,
EDW and BI introduce new problems related to flexibility and agility, which were
less pronounced when dealing with spreadsheets.
A solution to this problem is the analytic sandbox, which attempts to resolve the
conflict between the needs of analysts and data scientists and the constraints of
the EDW and more formally managed corporate data. In this model, the IT group may
still manage the analytic sandboxes, but the sandboxes are purposefully designed to
enable robust analytics while remaining centrally managed and secured. These
sandboxes, often referred to as
workspaces, are designed to enable teams to explore many datasets in a controlled
fashion and are not typically used for enterprise-level financial reporting and sales
dashboards.