We developed a method for learning a distance metric between the customers of a Telecommunication Service Provider (TSP). Such a metric is important because it can serve as the underlying dissimilarity measure for a wide variety of problems, such as churn classification with distance-based learning algorithms (for example, KNN), customer segmentation, and recommendation of new service plans to customers. To this end, we developed an approach that works at the intersection of distance metric learning, social network analysis, and information theory. More specifically, we extracted "relative similarity constraints" from the mobile social network of customers and used these constraints to learn the distance function between the feature-vector representations of customers. We further proposed an entropy-based method for selecting the most informative constraints from the available set. We evaluated our method on a customer churn classification task, and our results show that using the learned distance metric instead of the baseline Euclidean distance improved the F1 score by approximately 12%.
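As a rough illustration of learning a distance from relative similarity constraints, the sketch below fits one non-negative weight per feature so that, for each constraint "customer i is more similar to j than to k", the weighted distance from i to j ends up smaller than the distance from i to k. The diagonal form of the metric, the hinge-style update, the function names, and the toy data are all assumptions made for this example; they are not the formulation used in the work described above, and the entropy-based constraint selection is not shown.

```python
import math

def squared_diffs(a, b):
    """Per-feature squared differences between two feature vectors."""
    return [(x - y) ** 2 for x, y in zip(a, b)]

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def learn_diagonal_metric(X, triplets, margin=1.0, lr=0.01, epochs=200):
    """Learn non-negative feature weights w from triplets (i, j, k),
    each meaning 'i should be closer to j than to k'.
    Assumed squared distance: d_w(a, b) = sum_f w[f] * (a[f] - b[f])**2."""
    w = [1.0] * len(X[0])  # start from the plain Euclidean metric
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for i, j, k in triplets:
            d_ij = squared_diffs(X[i], X[j])
            d_ik = squared_diffs(X[i], X[k])
            # Hinge condition: only violated constraints contribute.
            if dot(w, d_ij) + margin > dot(w, d_ik):
                grad = [g + a - b for g, a, b in zip(grad, d_ij, d_ik)]
        # Subgradient step, then project back onto w >= 0.
        w = [max(wf - lr * g, 0.0) for wf, g in zip(w, grad)]
    return w

def weighted_distance(w, a, b):
    return math.sqrt(dot(w, squared_diffs(a, b)))

# Toy data (hypothetical): feature 0 is informative, feature 1 is noisy.
X = [[0.0, 0.0], [0.1, 5.0], [3.0, 0.2]]
triplets = [(0, 1, 2)]  # customer 0 is more similar to 1 than to 2
w = learn_diagonal_metric(X, triplets)
```

Under the initial Euclidean weights the toy constraint is violated, because the noisy feature dominates; after training, the noisy feature is down-weighted and the constraint holds. A learned distance of this kind could then be plugged into a distance-based classifier such as KNN.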
This patent pertains to forecasting whether a particular Internet application will be used by a user in future time slots. This type of forecasting is important for appropriately tuning a communication network's parameters. A wide variety of machine learning algorithms can be applied to this task. However, some state-of-the-art algorithms, such as the Hidden Markov Model, have computational complexity that is exponential in the number of Internet applications. To avoid this high computational cost, we proposed a preprocessing step in which we used the conditional entropy of the usage of different applications to select a set of representative applications. In the next step, the forecasting model is built using only these representative applications. The usage of a non-representative application is then forecast from the forecast of its corresponding representative application. We compared this method with a baseline in which all applications are used to train the machine learning model. Our studies have shown that this preprocessing allows us to achieve forecasting performance close to that of the baseline, with a significant reduction in training time.
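The preprocessing idea can be sketched as follows: compute the conditional entropy H(A | R) between the binary usage sequences of every pair of applications, then greedily pick representatives so that every application has low conditional entropy given some representative. The greedy covering rule, the threshold value, the function names, and the toy usage data are assumptions made for illustration; the text above does not specify the actual selection procedure.

```python
import math

def conditional_entropy(y, x):
    """H(Y | X) in bits for two equal-length binary (0/1) usage sequences."""
    n = len(x)
    h = 0.0
    for xv in (0, 1):
        ys = [yi for xi, yi in zip(x, y) if xi == xv]
        if not ys:
            continue
        p_x = len(ys) / n
        for yv in (0, 1):
            p = ys.count(yv) / len(ys)  # P(Y = yv | X = xv)
            if p > 0:
                h -= p_x * p * math.log2(p)
    return h

def select_representatives(usage, threshold=0.5):
    """Greedy cover (an assumed rule): repeatedly pick the application that
    'explains' the most still-uncovered applications, where R explains A
    if H(A | R) <= threshold. Returns (representatives, assignment)."""
    apps = list(usage)
    H = {(a, r): conditional_entropy(usage[a], usage[r])
         for a in apps for r in apps}
    uncovered = list(apps)
    representatives, assignment = [], {}
    while uncovered:
        best = max(uncovered,
                   key=lambda r: sum(H[a, r] <= threshold for a in uncovered))
        representatives.append(best)
        for a in uncovered:
            if H[a, best] <= threshold:
                assignment[a] = best  # a will be forecast via `best`
        uncovered = [a for a in uncovered if a not in assignment]
    return representatives, assignment

# Toy usage per time slot (hypothetical): app_b mirrors app_a, app_c does not.
usage = {
    "app_a": [1, 0, 1, 0, 1, 0, 1, 0],
    "app_b": [1, 0, 1, 0, 1, 0, 1, 0],
    "app_c": [1, 1, 0, 0, 1, 1, 0, 0],
}
representatives, assignment = select_representatives(usage)
```

In this toy case only `app_a` and `app_c` would be passed to the forecasting model, and the forecast for `app_b` would be derived from that of `app_a`. Because every application has zero conditional entropy given itself, the loop always covers at least one application per iteration and terminates.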
We present an algorithm for identifying influence paths in a communication network comprising a plurality of users. The method comprises identifying network parameters indicative of a strength of connection between users of the network, combining the identified parameters to calculate a connection strength between users of the network, storing the calculated connection strengths as edge weights between the users of the network, identifying a source user and a target user for a path, and calculating a path between the source user and the target user according to the stored edge weights.
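The steps listed above can be sketched in a few lines. In this minimal example, the network parameters (`call` and `sms`, both hypothetical and assumed to be pre-scaled to the range (0, 1]) are combined by a weighted average into a connection strength, the strengths are stored as edge weights, and a path is found with Dijkstra's algorithm on the cost `-log(strength)`, which maximizes the product of strengths along the path. The combination rule, the cost transform, and all names are assumptions for illustration, not the method claimed in the text.

```python
import heapq
import math

def connection_strength(params, weights):
    """Weighted average of parameters assumed to be scaled to (0, 1]."""
    total = sum(weights.values())
    return sum(weights[name] * params[name] for name in weights) / total

def build_graph(interactions, weights):
    """Store the combined strength as an undirected edge weight."""
    graph = {}
    for (u, v), params in interactions.items():
        s = connection_strength(params, weights)
        graph.setdefault(u, {})[v] = s
        graph.setdefault(v, {})[u] = s
    return graph

def strongest_path(graph, source, target):
    """Dijkstra on cost = -log(strength); requires 0 < strength <= 1.
    Returns the list of users on the path, or None if unreachable."""
    best = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    while heap:
        cost, u = heapq.heappop(heap)
        if u == target:
            break
        if cost > best.get(u, math.inf):
            continue  # stale heap entry
        for v, s in graph.get(u, {}).items():
            new_cost = cost - math.log(s)
            if new_cost < best.get(v, math.inf):
                best[v] = new_cost
                prev[v] = u
                heapq.heappush(heap, (new_cost, v))
    if target not in best:
        return None
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1]

# Hypothetical interaction data between three users.
interactions = {
    ("A", "B"): {"call": 0.9, "sms": 0.8},
    ("B", "C"): {"call": 0.9, "sms": 0.7},
    ("A", "C"): {"call": 0.2, "sms": 0.1},
}
graph = build_graph(interactions, weights={"call": 0.6, "sms": 0.4})
path = strongest_path(graph, "A", "C")
```

Here the direct A to C connection is weak, so the path through B is preferred. Using `-log(strength)` keeps all edge costs non-negative, which Dijkstra's algorithm requires; a different cost transform, such as the reciprocal of the strength, would rank paths differently.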