The challenge here is that the data can come from anywhere. Users are
able to collate data from literally any source for their own reporting. The
opportunities presented by self-service models have led to a decentralized
view of data. This can make certain types of data analysis challenging.
Ideally, we want to leverage compute resources as close to the data as
possible and integrate these sources in a seamless way.
However, to achieve this we may need to take an increasingly logical view
of our data warehouse. We wouldn't, for example, want to move masses
of cloud-born web analytics and machine telemetry data down to an
on-premises warehouse to perform an in-depth path analysis. The time it
would take to continually stream this data may lead to unacceptable latency
in our reporting. This latency may erode the value of the insights presented.
Therefore, we might need to take a more practical approach.
24/7 Operations
Business is global, or at least that is what all the Hongkong and Shanghai
Banking Corporation (HSBC) adverts keep telling me. This means that there is always
someone, somewhere, who is awake and wanting to interact with our
services. At least we hope that is the case. This has put
considerable pressure on “operational” outages. Backup
windows, load windows, and patch management have all felt the force of a
24/7 business.
These days, huge emphasis is placed on loading data really quickly while
also having to balance the needs of the business for reporting. We need to
build processes that offer balance. It's no good taking the system offline to
load data. We need to target the mixed workload.
Near Real-time Analysis
Roll back the time transaction five years and data warehouses were deemed
up-to-date when they had yesterday's data loaded inside them. At first there
was resistance. Who really needed reporting for the last hour or even quarter
of an hour? Roll forward five years and now data warehouses are being
used for fraud detection, risk analytics, cross-selling, and recommendation
engines, to name but a few use cases that need near real-time data.
To get really close to what is happening in real time, we need to take the
analytics closer to the source. We may no longer be able to afford the
multiple repository hops we typically see with traditional architectures.