The IBM Big Data Analytic Accelerators - Harness the Power of Big Data


interaction with the various data stores in a telco company's transaction processing system.

In many jurisdictions, there are a number of regulatory requirements surrounding CDR data. As such, transaction processing systems require detailed tracking of when a CDR has been successfully processed and written to the CDR data store. The Streams TEDA application continuously monitors its directories to detect new CDRs, and writes records to maintain the state of each file. A parallelizing operator splits the CDRs into multiple parallel branches (or paths) to expedite processing and apply full parallelization techniques to the stream.
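To make the idea of parallel branches concrete, here is a minimal Python sketch of one way a splitter could route records. It is not the TEDA's actual operator: the hash-based routing and the names `branch_for` and `split_into_branches` are assumptions made for illustration, and a real splitter might just as well use round-robin or load-based routing.

```python
import zlib


def branch_for(cdr_id: str, num_branches: int) -> int:
    """Pick a branch deterministically, so a given CDR ID always takes the same path."""
    # crc32 is stable across runs, unlike Python's built-in hash() for strings.
    return zlib.crc32(cdr_id.encode("utf-8")) % num_branches


def split_into_branches(cdr_ids, num_branches):
    """Distribute CDR IDs across num_branches lists (the hypothetical parallel paths)."""
    branches = [[] for _ in range(num_branches)]
    for cdr_id in cdr_ids:
        branches[branch_for(cdr_id, num_branches)].append(cdr_id)
    return branches
```

Routing on a stable key has one useful side effect in this sketch: two copies of the same CDR always land on the same branch, so a later step that compares records does not need to look across branches.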

TEDA supports the Abstract Syntax Notation One (ASN.1) CDR format, as well as proprietary formats. For other CDR formats, the TEDA ingest rules can easily be customized to accommodate variations in those formats. The TEDA includes over 700 rules that represent expert patterns, created by folks at IBM who live and breathe telco, to facilitate the CDR analysis process. The TEDA also includes a series of in-memory and table look-ups to enrich the CDRs with information such as the customer's importance, the customer ID, and estimated revenue for the call.
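The enrichment step can be pictured as a dictionary look-up followed by a small calculation. The sketch below is only an illustration of that pattern: the field names, the lookup tables, and the per-minute rates are all invented for the example and do not come from the TEDA.

```python
# Hypothetical in-memory lookup tables; every name and value here is made up.
CUSTOMERS = {
    "15550100": {"customer_id": "C-001", "importance": "high"},
    "15550101": {"customer_id": "C-002", "importance": "standard"},
}
RATE_CENTS_PER_MINUTE = {"high": 4, "standard": 5}
DEFAULT_RATE_CENTS_PER_MINUTE = 5


def enrich(cdr: dict) -> dict:
    """Return a copy of the CDR with customer ID, importance, and estimated revenue added."""
    customer = CUSTOMERS.get(cdr["calling_number"], {})
    importance = customer.get("importance", "unknown")
    # Round the call duration up to whole minutes (ceiling division on integers).
    billed_minutes = -(-cdr["duration_seconds"] // 60)
    rate = RATE_CENTS_PER_MINUTE.get(importance, DEFAULT_RATE_CENTS_PER_MINUTE)
    return {
        **cdr,
        "customer_id": customer.get("customer_id"),
        "customer_importance": importance,
        "estimated_revenue_cents": billed_minutes * rate,
    }
```

Keeping amounts in integer cents avoids floating-point rounding in the revenue estimate, and an unknown caller still flows through with a `None` customer ID rather than being dropped.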

Streams also performs de-duplication of CDRs during enrichment. The telco switches always create two copies of each CDR to prevent data loss, and the duplicates must be deleted to ensure that customers are not billed twice. The TEDA uses a Bloom Filter algorithm to eliminate duplicates, which optimizes performance and memory consumption. Because of possible switch failures, duplicate CDRs can appear up to 15 days later. This means that each CDR must be compared against 15 days of data, potentially billions of CDRs. Normally, this processing is done in the CDR data warehouse. With the TEDA, however, it's now done simultaneously with CDR analysis, which reduces the workload in the warehouse, and enables the warehouse to focus on analytic and reporting applications, instead of de-duplicating records.
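For readers who haven't met a Bloom filter before, the following Python sketch shows the general technique. It is a textbook-style illustration, not the TEDA's implementation: the class, the sizing parameters, and the use of SHA-256 with double hashing are our own assumptions. Note that a Bloom filter never misses a key it has already seen, but it can occasionally report a new key as already seen; the source does not describe how the TEDA handles such false positives or how it ages out entries after the 15-day window, so neither is modeled here.

```python
import hashlib


class BloomFilter:
    """A fixed-size bit array probed at several hash-derived positions per key."""

    def __init__(self, num_bits: int, num_hashes: int):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key: str):
        # Derive num_hashes bit positions from one digest (double hashing).
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # force an odd, nonzero step
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, key: str) -> None:
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key: str) -> bool:
        # False means "definitely new"; True means "probably seen before".
        return all(self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(key))


def deduplicate(cdr_keys, bloom: BloomFilter):
    """Keep the first occurrence of each key and drop anything the filter has seen."""
    unique = []
    for key in cdr_keys:
        if bloom.might_contain(key):
            continue  # treated as a duplicate (could rarely be a false positive)
        bloom.add(key)
        unique.append(key)
    return unique
```

The memory advantage is the point: the filter stores a fixed number of bits regardless of key length, so checking a CDR against billions of earlier records costs a handful of bit probes instead of a warehouse query.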

The final group of operators writes CDRs to the CDR repository (the CDRs still need to be stored here for a myriad of reasons, such as regulatory compliance, data governance, and insight discovery). After the TEDA application receives confirmation that the CDRs have been written to the repository, control information is sent back to the source operators to update the CDR state information and to delete the relevant input files.
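The ordering described here (write first, then update state and delete the input only after confirmation) can be sketched in a few lines of Python. This is a simplified, single-process illustration under our own assumptions: `FileTracker`, `FileState`, `process_file`, and the `write_to_repository` callback are hypothetical names, and the real application passes control information between operators rather than calling a function.

```python
import os
from enum import Enum


class FileState(Enum):
    DETECTED = "detected"  # input file seen, not yet confirmed as written
    WRITTEN = "written"    # repository confirmed the write


class FileTracker:
    """Hypothetical per-file state record kept on the source side."""

    def __init__(self):
        self.state = {}

    def detect(self, path: str) -> None:
        self.state.setdefault(path, FileState.DETECTED)

    def on_write_confirmed(self, path: str) -> None:
        # Update state first, then delete the input file that is no longer needed.
        self.state[path] = FileState.WRITTEN
        if os.path.exists(path):
            os.remove(path)


def process_file(path: str, write_to_repository, tracker: FileTracker) -> bool:
    """Read one input file, write its records, and clean up only on confirmation."""
    tracker.detect(path)
    with open(path, encoding="utf-8") as f:
        records = [line.rstrip("\n") for line in f if line.strip()]
    confirmed = write_to_repository(records)  # assumed to return True on success
    if confirmed:
        tracker.on_write_confirmed(path)
    return confirmed
```

Because the input file survives until the write is confirmed, a failed write leaves the file in the `DETECTED` state and it can simply be processed again.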
