Accurate and timely understanding of mobile traffic consumption patterns, especially at city scale, is becoming increasingly important in the planning of digital infrastructure that can meet the diversifying requirements of end-user applications, including e-healthcare, autonomous cars, and industrial automation. Mobile traffic insights can further inform the provisioning and optimisation of public transportation systems, to reduce both their carbon footprint and citizens’ commute times. Moreover, in the context of the Covid pandemic, mobile traffic consumption can also be indicative of social events and interactions, making it a potentially useful signal in modelling disease spread.
Gaining fine-grained knowledge from mobile network traffic is not straightforward. To begin with, this requires dedicated equipment (probes) for measurement collection, which runs, e.g., on network gateways. Secondly, processing vast amounts of data in a scalable and timely fashion is non-trivial, as this involves substantial local storage capabilities, costly overhead associated with transferring detailed logs to central locations for analysis, data filtering by scope, etc. To simplify traffic analysis, mobile operators often make simple assumptions about the distribution of data traffic consumption across cells. For instance, it is frequently assumed that users and traffic are uniformly distributed, irrespective of the geographical layout of coverage areas (Lee et al., 2014). Unfortunately, such approximations are usually highly inaccurate, as traffic volumes may exhibit considerable differences between nearby locations (Wang et al., 2015), and therefore have limited value.
To tackle this problem, our earlier work, published at ACM CoNEXT’17, introduces the concept of mobile traffic super-resolution (MTSR), by which we infer narrowly localised traffic consumption from aggregate data recorded by a limited number of probes, thereby reducing deployment and measurement costs. These probes may have arbitrary spatial granularity, which reflects practical coverage area sizes that depend on population density. We draw inspiration from image processing and treat city-scale mobile traffic ‘snapshots’ as images, regarding a measurement collection location as a pixel and the traffic volume at that location as the pixel’s intensity (Figure 1 illustrates the similarities between the two problems). We propose a novel Generative Adversarial Network (GAN) architecture tailored to MTSR, where high-resolution traffic maps are obtained through a generative model that receives coarse measurements collected by network probes and outputs approximations of the real traffic distribution. The generator is trained against a discriminative model that estimates the probability that a sample snapshot comes from a set of fine-grained ground-truth measurements, rather than being produced by the generator.
Figure 1: Illustration of the image super-resolution (SR) problem (above) and the underlying principle of the mobile traffic super-resolution (MTSR) technique (below).
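To make the image analogy concrete, the following minimal sketch treats a traffic snapshot as a small grid of "pixels". It is not the ZipNet(-GAN) model: it only shows how coarse probe readings relate to the fine-grained map, and how the uniform-distribution assumption discussed earlier behaves as a naive upsampling baseline. The function names, the square aggregation blocks, and the toy values are all assumptions made for illustration.

```python
from typing import List

Grid = List[List[float]]


def aggregate(fine: Grid, factor: int) -> Grid:
    """Sum non-overlapping factor x factor blocks, mimicking coarse probes."""
    rows, cols = len(fine), len(fine[0])
    if rows % factor or cols % factor:
        raise ValueError("grid dimensions must be divisible by factor")
    coarse = [[0.0] * (cols // factor) for _ in range(rows // factor)]
    for r in range(rows):
        for c in range(cols):
            coarse[r // factor][c // factor] += fine[r][c]
    return coarse


def uniform_upsample(coarse: Grid, factor: int) -> Grid:
    """Naive baseline: spread each coarse reading evenly over its block."""
    share = factor * factor
    return [
        [coarse[r // factor][c // factor] / share
         for c in range(len(coarse[0]) * factor)]
        for r in range(len(coarse) * factor)
    ]


def mean_abs_error(a: Grid, b: Grid) -> float:
    """Mean absolute difference between two grids of equal shape."""
    n = len(a) * len(a[0])
    return sum(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb)) / n


# Toy 4x4 "ground truth" snapshot with one hotspot in the top-left corner.
fine = [
    [8.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 1.0, 1.0],
    [2.0, 2.0, 0.0, 0.0],
    [2.0, 2.0, 0.0, 4.0],
]
coarse = aggregate(fine, 2)             # what four coarse probes would report
baseline = uniform_upsample(coarse, 2)  # uniform-distribution assumption
print(coarse, mean_abs_error(fine, baseline))
```

The uniform baseline preserves the total volume in each block but smears out the hotspots, which is the kind of error a learned super-resolution model aims to reduce.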
The generator component of our GAN upgrades a ResNet model (He et al., 2016) with a set of additional “skip connections”, which introduce no extra parameters while allowing gradients to backpropagate faster through the model in the training phase. To train our architecture, which we named ZipNet, we rely on a data processing and augmentation approach that crops the original city-wide mobile data traffic snapshots to smaller windows and repeats this process with different offsets to generate extra data points from the original ones, thereby maximising the usage of data available for training and preventing model overfitting.
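The cropping-based augmentation can be sketched as a sliding window over a snapshot. This is a minimal illustration under assumed settings: the square window, the fixed stride, and the name `crop_windows` are hypothetical, and the actual window sizes and offsets used to train ZipNet are not specified here.

```python
from typing import Iterator, List, Tuple

Grid = List[List[float]]


def crop_windows(
    snapshot: Grid, window: int, stride: int
) -> Iterator[Tuple[int, int, Grid]]:
    """Yield (top, left, crop) for every window x window crop at the given stride."""
    rows, cols = len(snapshot), len(snapshot[0])
    for top in range(0, rows - window + 1, stride):
        for left in range(0, cols - window + 1, stride):
            crop = [row[left:left + window]
                    for row in snapshot[top:top + window]]
            yield top, left, crop


# A toy 5x5 snapshot whose cell values encode their own position.
snapshot = [[float(r * 5 + c) for c in range(5)] for r in range(5)]

# A stride of 1 shifts the window by one cell at a time,
# turning a single snapshot into nine overlapping training samples.
crops = list(crop_windows(snapshot, window=3, stride=1))
print(len(crops))
```

A smaller stride produces more, but more strongly overlapping, samples; the right trade-off depends on the dataset and is left open in this sketch.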
Experiments we conduct with a real-world mobile traffic dataset, published by a major European operator, demonstrate the feasibility of using deep learning to infer fine-grained mobile traffic distributions with up to orders of magnitude higher granularity than standard probing, irrespective of the coverage and the location of the probes. Importantly, our ZipNet(-GAN) models achieve much lower reconstruction errors and higher fidelity of reconstructed traffic than existing interpolation techniques.
Another challenge is deciding where to collect measurements prior to performing any interpolation. Selecting sampling locations at random can overlook important spatiotemporal correlations specific to mobile traffic, leading to inaccurate reconstruction. Further, an important question to clarify is how many samples are needed to achieve high interpolation fidelity with the lowest possible measurement overhead.
Figure 2: The Spider framework: sparse mobile traffic snapshots are used by a policy network to select the optimal cells at which to collect measurements; a dedicated reconstruction neural model outputs complete network traffic snapshots.
In our recent work, which was published in IEEE GLOBECOM’21, we put forward a deep learning-driven mobile traffic measurement collection and reconstruction framework, called Spider, which we illustrate in Figure 2. Our approach relies on a dedicated neural network that we train to selectively sample small subsets of target mobile coverage areas, so as to minimise overhead while acquiring enough data to ensure high-quality traffic consumption interpolations at all non-sampled cells. We take a Deep Reinforcement Learning (DRL) approach to produce examples for training this policy network. Given the large action space, our DRL agent learns in a tractable manner by sampling small subsets of the action space based on the most likely action and its nearest neighbors. The highest valued action from such subsets is then selected, circumventing the need to evaluate the entire action space.
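The subset-based action selection described above can be sketched as follows. This is only an illustration of the idea, not Spider's implementation: it assumes that actions are grid cells, that "nearest" means Euclidean distance between cell coordinates, and that a value function is available as a plain callable. The names `candidate_subset`, `select_action`, and `toy_value` are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

Cell = Tuple[int, int]


def candidate_subset(likely: Cell, cells: Sequence[Cell], k: int) -> List[Cell]:
    """Return the most likely cell followed by its k nearest neighbours."""
    def sq_dist(cell: Cell) -> int:
        return (cell[0] - likely[0]) ** 2 + (cell[1] - likely[1]) ** 2

    others = sorted((c for c in cells if c != likely), key=sq_dist)
    return [likely] + others[:k]


def select_action(
    likely: Cell, cells: Sequence[Cell], k: int, value: Callable[[Cell], float]
) -> Cell:
    """Evaluate only the candidate subset and return its highest-valued cell."""
    return max(candidate_subset(likely, cells, k), key=value)


# Toy setup: a 4x4 grid of candidate cells (assumed layout).
cells = [(r, c) for r in range(4) for c in range(4)]
evaluated: List[Cell] = []


def toy_value(cell: Cell) -> float:
    """Stand-in for a learned value estimate; it peaks at cell (2, 1)."""
    evaluated.append(cell)  # record which cells were actually evaluated
    return -abs(cell[0] - 2) - abs(cell[1] - 1)


chosen = select_action(likely=(1, 1), cells=cells, k=4, value=toy_value)
print(chosen, len(evaluated))
```

In this toy run, the value function is called on 5 of the 16 cells, which is the source of the tractability gain when the action space is large.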
For action evaluation and interpolation, we introduce a purpose-built mobile traffic reconstruction neural model (MTRNet), which builds on our earlier ZipNet design and similarly exploits spatiotemporal correlations within historical data. MTRNet outperforms existing methods in terms of accuracy (up to 67% lower Mean Absolute Error) and reconstruction speed (over 800× runtime reduction). By means of experiments with the same real mobile network dataset, we also show that Spider requires on average 48% fewer samples to reconstruct complete traffic matrices, even when applied to previously unseen traffic patterns, such as those observed during holidays.
He, K., Zhang, X., Ren, S., and Sun, J. (2016). Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 770–778.
Lee, D., Zhou, S., Zhong, X., Niu, Z., Zhou, X., and Zhang, H. (2014). Spatial modeling of the traffic density in cellular networks. IEEE Wireless Communications, 21(1):80–88.
Wang, H., Xu, F., Li, Y., Zhang, P., and Jin, D. (2015). Understanding mobile traffic patterns of large scale cellular towers in urban environment. In Proceedings of the ACM Internet Measurement Conference (IMC), pages 225–238.