Data Mining and Mobile Business Data

INTRODUCTION

Research and practice in mobile (m-) business have seen exponential growth in the last decade (CNN, 2002; Leisen, 2000; McDonough, 2002; Purba, 2002). M-businesses allow users to access information and perform transactions and other operations from anywhere at any time via wireless networks. Consequently, m-business applications are generating a large volume of complex data (Magic-sw, 2002). Monitoring and mining this data can help m-business operators make sound financial and organisational decisions.
Data mining (DM), or knowledge discovery in databases, is the extraction of interesting, meaningful, implicit, previously unknown, valid and actionable information from a pool of data sources (Dunham, 2003). This valuable, real-time information inferred from the data can be used for decision-making. For example, the common use of mobile phones and personal digital assistants (PDAs) has increased the number of service providers. DM technology can help providers to develop services and sales strategies for future benefit. An example of an existing application of data mining in m-business is MobiMine (Kargupta, Park, Pittie, Liu, Kushraj, & Sarkar, 2002), which enables a user to monitor stock prices from a handheld PDA.

BACKGROUND: PROCESS OF KNOWLEDGE DISCOVERY

Data mining, an interactive, iterative, non-trivial process, is usually divided into many subtasks (Figure 1). Prior to commencing the mining process, businesses should identify and define their goals, objectives and limitations. Data is then gathered and collated from multiple sources, since each source may send data in a different format. The next phase is to ensure the quality of the data by removing noise, handling missing information and transforming the data into an appropriate format. A reduced data set, representative of the overall processed data, is also derived by applying data reduction techniques.
Once the data is pre-processed, an appropriate data mining technique or a combination of techniques is applied for the type of knowledge to be discovered (Table 1). The discovered knowledge is then evaluated and interpreted, typically with the help of visualization techniques. When the mined results are judged insufficient, preprocessing and mining are repeated until adequate and useful information is obtained. Lastly, the information is presented to the user to incorporate into the company’s business strategies.


Figure 1. The data mining process


Table 1. Various data mining tasks

Mining Task | Goal | Approaches
Predictive Modelling | To predict future needs based on previous data | Decision tree, Neural networks
Clustering | To partition data into segments | Demographic, Neural networks
Link Analysis | To establish association among items | Counting occurrences of items such as Apriori Algorithms
Deviation Detection | To detect any anomalies, unusual activities | Summarization and Graphical representation

DATA MINING OPPORTUNITIES IN M-BUSINESS DOMAIN

Taking Advantage of Location Information

With Global Positioning System (GPS) mobile technology, it is possible to identify the location of users (Cousins & Varshney, 2001; Duri, Cole, Munson, & Christensen, 2001). Based on the locations that a person frequents most and the personal information given, data mining techniques can classify the user into a pre-defined category. For example, if a person is most often sighted in supermarkets, department stores and at home, and is seen shuttling between sales events, then this person can be classified as a possible homemaker interested in sales events. In terms of a business-to-consumer relation, such information allows businesses to provide appropriate marketing information to the specific category of users.
In terms of a business-to-business relation, the ability to track the location of employees helps to determine their work efficiency. Analysis of the time each employee spends on duty can indicate who is performing best and is most suited for the next pay increment and promotion. Businesses such as courier companies depend on information about the locations of the parcels they transport. Data mining techniques can analyse the various routes and the time spent in receiving parcels over a period. The outcome indicates the efficiency of the business processes and the factors behind their failure or success.

Personalization of M-Business Applications

Due to the limited screen space provided on mobile devices, it is difficult for mobile users to browse product or service catalogues on the devices. It is therefore important for vendors to offer only the products or services that match the needs of individual users. Short message service (SMS) is used primarily for simple person-to-person messaging. Information obtained from analysing data about users’ previous access to these services can be used to create personalized advertisements delivered to the customer by SMS (Mobilein.com, 2002).
Relevant services can be offered based not only on the personal profile of the device holder, but also on the device holder’s location and the time. For example, m-business applications used in the travel industry can help users to find attractions, hotels and restaurants of their preference for a requested location and time. The clustering data mining technique groups customers with similar preferences. When a new customer states his preferences, a recommendation can be made based on the choices of earlier customers with similar preferences. Associative data mining can be used to indicate which places a person is most likely to visit in a single trip or in two consecutive trips, given inputs such as the location and time of each user’s visits to attractions. This is convenient for users because these services can be used while driving. For example, a suggestion can be based on an association rule such as: if the user is at place A, then the user should visit place B, because 80% of previous visitors have done so.
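The "place A implies place B" rule can be made concrete with a minimal sketch. The trip histories below are hypothetical, invented only so that the rule reaches the 80% figure used in the text; `rule_confidence` is an illustrative helper, not part of any m-business product.

```python
def rule_confidence(trips, antecedent, consequent):
    """Return (support, confidence) for the rule antecedent -> consequent.

    support    = share of all trips that contain both places
    confidence = share of trips containing the antecedent that also
                 contain the consequent
    """
    with_antecedent = [t for t in trips if antecedent in t]
    with_both = [t for t in with_antecedent if consequent in t]
    support = len(with_both) / len(trips) if trips else 0.0
    confidence = len(with_both) / len(with_antecedent) if with_antecedent else 0.0
    return support, confidence


# Hypothetical data: each set holds the places one user visited in one trip.
trips = [
    {"A", "B"},
    {"A", "B", "C"},
    {"A", "B"},
    {"A", "C"},
    {"A", "B", "D"},
    {"C", "D"},
]

support, confidence = rule_confidence(trips, "A", "B")
print(f"A -> B: support {support:.2f}, confidence {confidence:.2f}")
```

A recommender would typically suggest place B to a user at place A only when both values exceed chosen thresholds, so that rules backed by very few trips are ignored.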

Predicting Customer Buying and Usage Patterns

Service providers can analyse consumer behaviour data (e.g., by analysing gateway log files and content server log files on WAP) to predict consumers’ buying and usage patterns, or to understand how mobile subscribers use their wireless services. Using the stored data, companies can apply data mining to identify customer segments using clustering techniques, to distinguish customers’ consumption patterns using deviation detection techniques, and to identify transaction trends using associative techniques. This information can then be used to provide better services to customers or to attract potential customers.
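As an illustration of customer segmentation, the sketch below applies a plain one-dimensional k-means to monthly usage figures. The text does not name a specific clustering algorithm, so k-means is an assumption here, and the usage values are hypothetical.

```python
def kmeans_1d(values, k, iterations=20):
    """Plain 1-D k-means with a deterministic start (needs at least one value)."""
    ordered = sorted(values)
    # Spread the initial centroids evenly across the sorted data.
    centroids = [ordered[i * (len(ordered) - 1) // max(k - 1, 1)] for i in range(k)]
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda c: abs(v - centroids[c]))
            clusters[nearest].append(v)
        # Recompute each centroid; keep the old one if a cluster is empty.
        centroids = [
            sum(c) / len(c) if c else centroids[i] for i, c in enumerate(clusters)
        ]
    return centroids, clusters


# Hypothetical monthly call minutes for nine subscribers.
minutes = [20, 25, 30, 200, 210, 220, 600, 620, 640]
centroids, clusters = kmeans_1d(minutes, k=3)
for centre, members in zip(centroids, clusters):
    print(f"segment around {centre:.0f} min/month: {members}")
```

With real subscriber data, several attributes (usage, spend, services used) would normally be clustered together, and the number of segments would be chosen by inspecting the results.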

Predicting Future and Better Usage of Mobile Technology

Data about the number of mobile phones in the market, the number of users subscribing to a service, the amount of usage measured in currency, and users’ satisfaction and feedback can be extracted and analysed with data mining. The resulting information can be used to predict trends and patterns in the usage of mobile phones and services. For example, some of the popular items bought through m-commerce technology are mobile ringing tones, logos and screensavers. The most commonly used interfaces for these kinds of transactions are short message service (SMS) and the standard e-commerce interface, the Internet. An example is Nokia’s focus on the availability of screensavers, logos and ringing tones. This is most likely a result of previous research on their users’ trends, capturing data on users’ demands and needs and then analysing their feedback. This information helped Nokia to develop a new market product that is no longer just a mobile phone, but also provides extra features like SMS, logos and additional ringing tones and screensavers (Nokia, 2002).

Trend Analysis of Costs versus Benefits

Any m-business constantly analyses whether the profits derived from the business are sustainable. Data mining can assist with a trend analysis of the business over a period. One possible way to analyse the profit data is linear regression (a value prediction technique in DM). A graph of average returns versus average investment in the business can be plotted. Analysis of the graph indicates whether the investment incurred is greater or smaller than the returns derived. Many other business factors can also be considered during regression analysis.
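A minimal sketch of the regression step follows, fitting a least-squares line to returns against investment. The figures are hypothetical and the function names are illustrative; a real analysis would use actual accounting data and probably more than one explanatory factor.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical average investment and average returns per period,
# in the same currency unit.
investment = [10, 20, 30, 40]
returns = [12, 26, 40, 54]

slope, intercept = fit_line(investment, returns)
predicted = slope * 50 + intercept  # expected returns for an investment of 50
print(f"returns = {slope:.2f} * investment + {intercept:.2f}")
print(f"predicted returns at 50: {predicted:.1f}")
```

Comparing the fitted line with the line returns = investment shows where returns exceed the amount invested; a slope above 1 suggests that additional investment has, on past data, more than paid for itself.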

Optimisation of Delivery Content

A mobile commerce platform should integrate with existing backend databases and business applications to deliver data via all channels, such as WAP, VoxML, TruSync, Bluetooth or any other wireless protocol. Data mining can also be used to determine which channel is best for delivering the information at a given time. Data mining techniques can optimise the amount and format of the content for delivery based on the connection speed of the device requesting the information. They also help to decide which tasks, activities and transactions are most economical and beneficial at the time.

Fraud Detection in M-Business

The analysis of the types of fraudulent activity in telecommunication systems is one application in which data mining can assist in a mobile environment. Because fraudulent activities are dynamic and normal usage also changes over time, fraud can be detected by observing behavioural patterns. A data mining system will have plenty of examples of normal usage and some examples of fraudulent usage. Based on these previous examples, a predictive data mining system establishes facts about fraudulent activities. Whenever a change in normal usage is detected, the system analyses the change and predicts whether or not it is fraudulent.
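The idea of flagging a change from normal usage can be sketched with a simple deviation check. This is a deliberately simplified stand-in, not the predictive system described above: it uses only a subscriber's own history and a z-score threshold, whereas a production system would learn from labelled examples of normal and fraudulent usage. The history values and the threshold of 3.0 are assumptions.

```python
from statistics import mean, stdev


def is_unusual(history, new_value, threshold=3.0):
    """Flag new_value if it lies more than `threshold` sample standard
    deviations from the mean of the subscriber's history (needs >= 2 values)."""
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        # All past values identical: any different value is a deviation.
        return new_value != mu
    return abs(new_value - mu) / sigma > threshold


# Hypothetical daily call minutes for one subscriber.
history = [30, 35, 28, 32, 31, 29, 33]

for minutes in (34, 240):
    label = "review for possible fraud" if is_unusual(history, minutes) else "normal"
    print(f"{minutes} minutes: {label}")
```

A flagged value is only a candidate for review; deciding whether the change is actually fraudulent is the job of the predictive model or a human analyst.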

DATA MINING CHALLENGES

To apply data mining efficiently in m-business, certain requirements have to be met. Ideally, the methods used for mining mobile data should be able to: (1) mine different kinds of knowledge in databases; (2) deal with diverse data types such as relational, temporal and spatial data; (3) mine information from heterogeneous databases and global information systems; (4) handle noisy and incomplete data, which is common in the m-business domain; (5) perform the mining tasks efficiently regardless of the size and complexity of the data set; (6) support interactive mining of knowledge at multiple levels of abstraction; (7) support integration of the discovered knowledge with existing knowledge; and (8) deal with issues related to the application of discovered knowledge and its social impacts, such as protection of data security, integrity and privacy.

Distributed Environment

In an m-business environment, data can reside in many different geographical locations. Most data mining systems are currently based on centrally located data: data is stored in a single database and the mining techniques are focused on this data set. XML is proving to be an essential way to perform data exchange not only on the Web but also wirelessly between applications or between users and applications. XML has provided the facilities to integrate data and documents to allow for data communication in “a flexible and extensible representation” (Graves, 2002). If every mobile device is able to transmit XML documents that can be read and processed, regardless of which platform the mobile device is running on, data integration from multiple sources becomes easier.
However, as a result of the convergence between computation and communication, new data mining approaches have to be concerned with the distributed aspects of computation and information storage. A distributed data mining approach typically works by: (1) analysing and compressing local data to minimise network traffic; and (2) analysing and generating global data models after combining local data and models (Park & Kargupta, 2002).
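The two steps of the distributed approach can be illustrated with a small sketch: each site compresses its local data into a three-number summary, and only those summaries travel over the network to be combined into a global model. Here the "model" is just a mean and variance, chosen for brevity; the site names and values are hypothetical.

```python
def summarize(values):
    """Step 1 (local): compress raw values into a small summary."""
    return {
        "n": len(values),
        "total": sum(values),
        "total_sq": sum(v * v for v in values),
    }


def combine(summaries):
    """Step 2 (global): merge local summaries into a global mean and
    population variance without seeing the raw data."""
    n = sum(s["n"] for s in summaries)
    total = sum(s["total"] for s in summaries)
    total_sq = sum(s["total_sq"] for s in summaries)
    mean = total / n
    variance = total_sq / n - mean ** 2
    return mean, variance


# Hypothetical transaction amounts held at two separate locations.
site_a = [10, 20, 30]
site_b = [40, 50]

global_mean, global_variance = combine([summarize(site_a), summarize(site_b)])
print(f"global mean {global_mean:.1f}, variance {global_variance:.1f}")
```

The result equals what a central system would compute over all five values, but each site transmits three numbers rather than its full data set, which matters on low-bandwidth wireless links.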

Click Stream Data

Users of mobile devices are highly restricted in the Web pages that they can visit, due to the small display screen. A page on a WAP phone has an average of five links to other Web sites, while a standard Web page has an average of 25 links. If a user makes three clicks on the Web via a WAP phone, only 5^3 (= 125) pages are accessible to the user, compared with 25^3 (= 15,625) pages accessible from a standard Web page (Barnes, 2002). It is therefore quite unlikely that the user will reach the site that he really wants from the links available. As a result, using data mining to analyse click stream data collected from users of mobile devices to predict the user’s interest is unlikely to be accurate.
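The arithmetic behind these figures is simple exponentiation, as the short sketch below shows. It assumes every page has exactly the average number of links and that no two paths lead to the same page, so the results are upper bounds.

```python
def reachable_pages(links_per_page, clicks):
    """Upper bound on the pages reachable after exactly `clicks` clicks."""
    return links_per_page ** clicks


for clicks in range(1, 4):
    wap = reachable_pages(5, clicks)    # average links on a WAP page
    web = reachable_pages(25, clicks)   # average links on a standard Web page
    print(f"{clicks} click(s): WAP {wap}, standard Web {web}")
```

The gap widens with every click, which is why a mobile user's click stream covers a much narrower slice of the available content than a desktop user's.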

Security and Privacy

With the technology for sending personal messages to mobile users, it has become possible for users to specify the types of information that they prefer, and for businesses to provide only that information. For example, if the user indicates a preference for a particular brand of product above a particular price, it can be inferred that the user may also be interested in another similar brand of the same standard. This opens data mining possibilities such as classifying users based on their reported needs and finding correlations between various needs.
Unfortunately, some users who do not trust the security of mobile data might declare their personal information and preferences inaccurately. This will result in incorrect data mining output. Thus, although the data mining results may classify a user as a suitable recipient of information, in reality the business incurs the cost of conducting data mining and of including irrelevant people in the mobile service. A possible solution is the use of XML, which allows documents to be complex and tagged with non-meaningful names during data transfer. The document is not useful to an unauthorized person without the knowledge of how to transform (decrypt) it appropriately.

Cost Justification

Because a data mining application is usually computationally expensive, there is always a concern as to whether the benefits of data mining justify the cost incurred in the process. It is also difficult to strike a balance between the security and privacy of the data transferred and the computational cost required to process the “encrypted” documents. The more complex the communication document is made for the sake of security and privacy, the more computational power is needed to process it.

Technological Limitation

Although a number of mobile technologies are available, several limitations and constraints of these technologies adversely affect the performance of data mining in the m-business domain. Among them are low bandwidth, limited battery power and unreliable communications that result in frequent disconnections. These factors increase communication latency, the cost of retransmitting data, time-out delays, error control protocol processing and short disconnections (Madria, Mohania, Bhowmick, & Bhargava, 2002). These limitations pose significant problems in collecting data for mining purposes. For example, the present low bandwidth means that data transfer is slow. This implies that data mining processes have to be delayed until most data transfers have been completed and received.
Furthermore, although the potential of gathering knowledge about a user’s location is appealing in terms of m-business and data mining, this potential will not be realized until present technologies improve enough to provide adequate, up-to-standard location information.

FUTURE TRENDS AND CONCLUSION

The success of an m-business depends on the ability to deliver attractive products or services that are personalized to the individual user at the right time and in the right location. These information-intensive services can only be provided by collecting and analysing combined demographic, geographic and temporal data. This data cannot be transformed into useful information with traditional reporting techniques and tools. Data mining enables the user to seek out facts by identifying patterns within data. Data mining can give businesses an edge over competitors by offering marketing that is more focused on particular consumer groups or by suggesting better use of mobile technology. An investment in mining m-business data is an extra expense, but it can help m-businesses to provide the right services to the right people at the right time, and that can make a vital difference.

KEY TERMS

Clustering: This data mining task identifies items with similar characteristics, thus creating a hierarchy of classes from the existing set of events. A data set is partitioned into homogeneous segments of elements that share a number of properties.
Data Mining: Data mining (DM) or knowledge discovery in databases is the extraction of interesting, meaningful, implicit, previously unknown, valid and actionable information from a pool of data sources.
Link Analysis: This data mining task establishes internal relationships to reveal hidden affinities among items in a given data set. Link analysis exposes samples and trends by predicting correlations between items that are otherwise not obvious.
Mobile Business Data: Data generated from mobile business activities such as accessing information and performing transactions and other operations from anywhere at any time via wireless networks.
Predictive Modelling: This data mining task makes predictions based on essential characteristics about the data. The classification task of data mining builds a model to map (or classify) a data item into one of several predefined classes. The regression task of data mining builds a model to map a data item to a real-valued prediction variable.
