
Big data will not automatically lead to deep insights

Justin van der Lande, Principal Analyst, Research

CSPs have a significant and growing need to determine which factors are significant and, therefore, which data they should collect, model and analyse.


The move to store increasingly large quantities of data is in part welcome, but there is no guarantee that the data will be used to provide additional value to communications service providers (CSPs). There are few CSPs today that are not able to find more value in the data that they already have, and adding more data will not necessarily help.


To gain deeper insights, CSPs need to adopt one of two approaches:

  • using innovative tools that employ machine learning techniques to derive unseen correlations and patterns in the data
  • employing highly skilled teams for a given use case.

Machine learning will be essential to the analysis of large volumes of transient data

The sheer volume of data generated by telecoms operators, a result of their unique access to all aspects of our digital lives, can be overwhelming for those who must analyse it. This data comes from a diverse range of sources, including:

  • sentiment expressed on social media
  • clickstream data from our online activities
  • sensor readings from machines
  • billing data
  • call patterns, detailing who we interact with, and where and when we interact with them.

CSPs have a significant and growing need to determine which factors are significant and, therefore, which attributes they should collect, model and use. Matters are made worse when monitoring transient data in particular, because although data storage costs are falling, they are not free, so CSPs still have to decide which data should be stored even before it can be analysed.

When CSPs have decided which data to store, they are faced with the challenge of analysing it. Traditional approaches require skilled staff who understand the data sets and who, through trial and error, are able to create algorithms and models to predict or segment the data. When potentially hundreds of attributes are in play, machine learning can do this more effectively. These automated techniques provide clear guidance on which attributes are most significant and enable CSPs to build models directly on that knowledge. Furthermore, applying machine learning to streaming data enables decisions to be made on transient data that need not subsequently be stored.
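The attribute-selection step that such tools automate can be illustrated with a deliberately simple technique: ranking attributes by information gain against an outcome of interest. The subscriber records, attribute names and churn labels below are invented for illustration; commercial tools apply far richer methods to far more attributes.

```python
# Rank candidate attributes by information gain against a churn label:
# a transparent stand-in for the attribute-significance guidance that
# machine-learning tools provide automatically. All data is invented.
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of a label sequence."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())

def information_gain(records, attribute, labels):
    """Entropy reduction from splitting the records on one attribute."""
    base = entropy(labels)
    groups = {}
    for record, label in zip(records, labels):
        groups.setdefault(record[attribute], []).append(label)
    weighted = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return base - weighted

# Toy subscriber records: which attribute best separates churners?
records = [
    {"contract": "monthly", "region": "north"},
    {"contract": "monthly", "region": "south"},
    {"contract": "annual",  "region": "north"},
    {"contract": "annual",  "region": "south"},
]
labels = [1, 1, 0, 0]  # 1 = churned

ranked = sorted(records[0],
                key=lambda a: information_gain(records, a, labels),
                reverse=True)
print(ranked)  # "contract" splits churners perfectly, so it ranks first
```

In this toy sample, contract type perfectly predicts churn while region carries no signal, so the ranking surfaces "contract" as the attribute worth collecting and modelling.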

At the core of machine learning technology is a library of algorithms that can be applied to data given to them. Specialised algorithms can be applied to different requirements, such as finding influencers within social networks or identifying potential candidates for churn. The technology gains self-learning experience by processing actual data sets, and – in general – the larger the data sets, the more accurate the results.
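One of the specialised algorithms mentioned above, finding influencers within social networks, can be approximated crudely with degree centrality: ranking subscribers by how many distinct contacts they interact with. Production systems use much richer graph analysis; the call-detail sample below is invented.

```python
# Approximate "influencers" in a call graph by degree centrality,
# i.e. the number of distinct contacts per subscriber. This is a
# sketch only; real influencer detection uses richer graph algorithms.
from collections import defaultdict

calls = [  # (caller, callee) pairs from an invented call-detail sample
    ("alice", "bob"), ("alice", "carol"), ("alice", "dave"),
    ("bob", "carol"), ("eve", "alice"),
]

contacts = defaultdict(set)
for a, b in calls:
    contacts[a].add(b)
    contacts[b].add(a)

influencers = sorted(contacts, key=lambda n: len(contacts[n]), reverse=True)
print(influencers[0])  # "alice" has the most distinct contacts
```

As the article notes, accuracy generally improves with data volume: with only five call records, degree centrality is noisy, but over millions of records the same computation becomes a meaningful proxy for influence.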

Machine learning has several potential business uses in the telecoms sector, but can be most effective in cases where in-line analysis of streaming data and personalisation are required. Manual techniques for developing and refining models become uneconomic on a large scale, whereas the application of self-learnt modelling can scale to meet this challenge. This could enable CSPs to produce more-targeted offers, or provide tailored advertising to an individual, for example. The automation of the modelling also makes it possible to consider much more data – for example, metadata within photos or a fuller range of network data.
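The in-line, streaming style of analysis described above can be sketched with a minimal online logistic regression: the model is updated one event at a time, so each record can be scored and then discarded rather than stored. The "usage" feature, offer-acceptance labels and learning rate are all invented for illustration.

```python
# Minimal online (streaming) logistic regression: one stochastic-
# gradient update per event, so transient data never needs storing.
# Features, labels and hyperparameters are invented for illustration.
import math
import random

weights = [0.0, 0.0]
bias = 0.0
LEARNING_RATE = 0.1

def predict(x):
    """Probability that this subscriber accepts the offer."""
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1.0 / (1.0 + math.exp(-z))

def update(x, y):
    """One gradient step on a single event; the event is then forgotten."""
    global bias
    error = predict(x) - y
    for i, xi in enumerate(x):
        weights[i] -= LEARNING_RATE * error * xi
    bias -= LEARNING_RATE * error

random.seed(0)
for _ in range(2000):  # simulated event stream: heavy users accept offers
    usage = random.random()
    update([usage, 1 - usage], 1 if usage > 0.5 else 0)

print(predict([0.9, 0.1]))  # heavy user: high acceptance probability
```

Because each `update` call touches only the current event and a fixed-size weight vector, the same loop scales to arbitrarily large streams without the storage decisions discussed earlier.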

Experts can provide further insight based on deep experience and industry knowledge

Machine learning is not a panacea for CSPs that are trying to deploy analytics to improve their organisation. It needs a supply of data in order to self-learn, and data is not always available when launching new services, targeting new markets or assessing the potential impact of new technology. Skilled staff with deep experience can provide a more insightful approach to a given issue.

The most common issues have been addressed many times before, and existing products and accumulated knowledge can provide a solution that is more robust, lower-risk and quicker to deploy than models built from scratch. The ability to take and integrate an off-the-shelf application that provides deep insights into a specific use case can outweigh the flexibility found in more-generic tools and systems.