interaction with the various data stores in a telco company's transaction
processing system.
In many jurisdictions, there are a number of regulatory requirements sur-
rounding CDR data. As such, transaction processing systems require detailed
tracking of when a CDR has been successfully processed and written to the
CDR data store. The Streams TEDA application continuously monitors its
directories to detect new CDRs, and writes records to maintain the state of
each file. A parallelizing operator splits the CDRs into multiple parallel
branches (or paths) to expedite processing and apply full parallelization
techniques to the stream.
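
As a rough illustration of the splitting step, the Python sketch below assigns each CDR to one of several branches by hashing a record key. This is not the TEDA's actual operator; the `call_id` field, the function names, and the choice of CRC32 as the hash are assumptions made for the example.

```python
import zlib


def branch_for(record_key: str, num_branches: int) -> int:
    """Pick a branch with a stable hash so a given key always maps to the same branch."""
    return zlib.crc32(record_key.encode("utf-8")) % num_branches


def split_into_branches(cdrs, num_branches, key_field="call_id"):
    """Distribute CDRs (dicts) across num_branches lists; key_field is a hypothetical name."""
    branches = [[] for _ in range(num_branches)]
    for cdr in cdrs:
        branches[branch_for(cdr[key_field], num_branches)].append(cdr)
    return branches
```

Hashing on a key, rather than dealing records out round-robin, keeps records with the same key on the same branch, which matters if a later step needs to see them together.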
The TEDA supports the Abstract Syntax Notation One (ASN.1) CDR format,
as well as proprietary formats. For other CDR formats, the TEDA includes
ingest rules that can easily be customized to accommodate variations in
those formats.
The TEDA includes over 700 rules that represent expert patterns that were
created by folks at IBM who live and breathe telco in order to facilitate the
CDR analysis process. The TEDA also includes a series of in-memory and
table look-ups to enrich the CDRs with information such as the customer's
importance, the customer ID, and estimated revenue for the call.
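
The enrichment look-ups described above might look something like the following Python sketch. Every field name (`caller`, `duration_s`, `importance`), the in-memory dictionary, and the flat per-second rate are hypothetical stand-ins; the TEDA's real look-up tables and revenue rules are not shown in the text.

```python
def enrich(cdr, customers_by_number, rate_per_second):
    """Return a copy of the CDR with customer and revenue fields added.

    customers_by_number is an in-memory dict standing in for a look-up table.
    """
    enriched = dict(cdr)
    customer = customers_by_number.get(cdr["caller"])
    if customer is None:
        # Unknown caller: keep the record but mark the look-up as a miss.
        enriched.update(customer_id=None, importance="unknown")
    else:
        enriched.update(customer_id=customer["id"],
                        importance=customer["importance"])
    # Assumed flat rate; a real system would apply its own tariff rules.
    enriched["estimated_revenue"] = round(cdr["duration_s"] * rate_per_second, 2)
    return enriched
```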
Streams also performs de-duplication of CDRs during enrichment. The
telco switches always create two copies of each CDR to prevent data loss, and
the duplicates must be deleted to ensure that customers are not billed twice.
The TEDA uses a Bloom Filter algorithm to eliminate duplicates, which opti-
mizes performance and memory consumption. Because of possible switch
failures, duplicate CDRs can appear up to 15 days later. This means that each
CDR must be compared against 15 days of data—potentially billions
of CDRs. Normally, this processing is done in the CDR data warehouse. With
the TEDA, however, it's now done simultaneously with CDR analysis, which
reduces the workload in the warehouse, and enables the warehouse to focus
on analytic and reporting applications, instead of de-duplicating records.
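
To make the de-duplication idea concrete, here is a minimal Bloom filter sketch in Python. It is an illustration under stated assumptions, not the TEDA's implementation: the `call_id` key, the SHA-256-based double hashing, and the sizing parameters are all choices made for this example.

```python
import hashlib
import math


class BloomFilter:
    def __init__(self, expected_items: int, false_positive_rate: float = 0.001):
        # Standard sizing: m = -n * ln(p) / (ln 2)^2 bits, k = (m / n) * ln 2 hashes.
        self.num_bits = max(8, math.ceil(
            -expected_items * math.log(false_positive_rate) / (math.log(2) ** 2)))
        self.num_hashes = max(1, round(self.num_bits / expected_items * math.log(2)))
        self.bits = bytearray((self.num_bits + 7) // 8)

    def _positions(self, key: str):
        # Derive k bit positions from one digest (double hashing).
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def seen_before(self, key: str) -> bool:
        """Return True if key was probably added already; record it either way."""
        present = True
        for pos in self._positions(key):
            byte, bit = divmod(pos, 8)
            if not self.bits[byte] & (1 << bit):
                present = False
                self.bits[byte] |= 1 << bit
        return present


def deduplicate(cdrs, bloom, key_field="call_id"):
    """Keep only CDRs whose key the filter has not seen; key_field is hypothetical."""
    return [cdr for cdr in cdrs if not bloom.seen_before(cdr[key_field])]
```

A Bloom filter never misses a true duplicate, but it can occasionally flag a new record as already seen, so the false-positive rate has to be set low enough for billing purposes. This sketch also never forgets a key; honoring a 15-day window would need some way to age out old entries, for example one filter per day, which is not shown here.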
The final group of operators writes CDRs to the CDR repository (the
CDRs still need to be stored here for a myriad of reasons, such as regulatory
compliance, data governance, insight discovery, and more). After the TEDA
application receives confirmation that the CDRs have been written to the
repository, control information is sent back to the source operators to update
the CDR state information and to delete the relevant input files.
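
The confirm-then-clean-up sequence can be sketched in Python as follows. The state names, the `state_log` dictionary, and the `write_to_repository` callback are hypothetical; the sketch only shows the ordering described above, in which the input file is deleted only after the repository confirms the write.

```python
import os
from enum import Enum


class FileState(Enum):
    PROCESSING = "processing"
    COMMITTED = "committed"


def process_file(path, state_log, write_to_repository):
    """Process one input file; delete it only after the write is confirmed.

    write_to_repository is a caller-supplied function that returns True
    once the records are safely stored.
    """
    state_log[path] = FileState.PROCESSING
    with open(path, encoding="utf-8") as handle:
        records = [line.rstrip("\n") for line in handle if line.strip()]

    confirmed = write_to_repository(records)
    if confirmed:
        # Update state first, then remove the input file.
        state_log[path] = FileState.COMMITTED
        os.remove(path)
    return confirmed
```

If the write is not confirmed, the file stays on disk in the `PROCESSING` state, so a restart can pick it up again instead of losing records.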