Database Reference
In-Depth Information
You'd  have to update every row in the flat reporting table for the orders that have that product
as opposed to one row in the Product table. The impact on performance is massive between
the two operations.
Figure 10-2: A flat reporting table.
Figure 10-3: Tables organized according to metrics and attributes.
Organize metrics according to granularity. Not all metrics have the same granularity.
When you're working through your design, consider the various metrics you need in your
analytics and take the time to organize them properly. An order can have many order items.
If you go the route of a single reporting table (refer to Figure 10-2), you have two choices:
Expand the grain of the table down to Order Item, or keep the table as is and add Order Item
Count to it. Both options have disadvantages. The first option forces you to allocate Order
Count among all the order items, and the second option forces you to lose the ability to see
the order items details. The recommended approach to this problem is to break out order
item into its own table, as shown in Figure 10-4.
Consider data volumes up front
A common pitfall when designing analytics systems is ignoring future data growth from the start. You
must consider the long term when designing your system, because it will typically be around for
many years. The primary principles to consider when handling large data volumes are
Follow an incremental update strategy: If you want your system to perform well, avoid
redundant work. If you run the system today and update all the data up to this point, you
don't want to repeat that step the next day if you can avoid it. Updating the system incre-
mentally involves only appending new data that came in since your last update. If done
properly, this should speed up data loads and allow you to handle increased data volumes
during a short processing window.
 
Search WWH ::




Custom Search