Publications

Cognitive Networks Extract Insights on COVID-19 Vaccines from English and Italian Popular Tweets: Anticipation, Logistics, Conspiracy and Loss of Trust

Published in Big Data and Cognitive Computing, 2022

Monitoring social discourse about COVID-19 vaccines is key to understanding how large populations perceive vaccination campaigns. This work reconstructs how popular and trending posts framed COVID-19 vaccines semantically and emotionally on Twitter. We achieve this by merging natural language processing, cognitive network science and AI-based image analysis. We focus on 4765 unique popular tweets in English or Italian about COVID-19 vaccines between December 2020 and March 2021. One popular English tweet contained in our data set was liked around 495,000 times, highlighting how popular tweets could cognitively affect large parts of the population. We investigate both text and multimedia content in tweets and build a cognitive network of syntactic/semantic associations in messages, including emotional cues and pictures. This network representation indicates how online users linked ideas in social discourse and framed vaccines along specific semantic/emotional content. The English semantic frame of “vaccine” was highly polarised between trust/anticipation (towards the vaccine as a scientific asset saving lives) and anger/sadness (mentioning critical issues with dose administering). Semantic associations with “vaccine,” “hoax” and conspiratorial jargon indicated the persistence of conspiracy theories about vaccines in extremely popular English posts. Interestingly, these were absent in Italian messages. Popular tweets with images of people wearing face masks used language that lacked the trust and joy found in tweets showing people with no masks. This difference indicates a negative effect attributed to face-covering in social discourse. Behavioural analysis revealed a tendency for users to share content eliciting joy, sadness and disgust and to like sad messages less. Both patterns indicate an interplay between emotions and content diffusion beyond sentiment. After its suspension in mid-March 2021, “AstraZeneca” was associated with trustful language driven by experts. After the deaths of a small number of vaccinated people in mid-March, popular Italian tweets framed “vaccine” by crucially replacing earlier levels of trust with deep sadness. Our results stress how cognitive networks and innovative multimedia processing open new ways for reconstructing online perceptions about vaccines and trust.

Recommended citation: Stella, M., Vitevitch, M. S., & Botta, F. (2022). Cognitive Networks Extract Insights on COVID-19 Vaccines from English and Italian Popular Tweets: Anticipation, Logistics, Conspiracy and Loss of Trust. Big Data and Cognitive Computing, 6(2), 52.

Rapid indicators of deprivation using grocery shopping data

Published in Royal Society Open Science, 2021

Measuring socio-economic indicators is a crucial task for policy makers who need to develop and implement policies aimed at reducing inequalities and improving the quality of life. However, traditionally this is a time-consuming and expensive task, which therefore cannot be carried out with high temporal frequency. Here, we investigate whether secondary data generated from our grocery shopping habits can be used to generate rapid estimates of deprivation in the city of London in the UK. We show the existence of a relationship between our grocery shopping data and the deprivation of different areas in London, and how we can use grocery shopping data to generate quick estimates of deprivation, albeit with some limitations. Crucially, our estimates can be generated very rapidly with the data used in our analysis, thus opening up the opportunity of having early access to estimates of deprivation. Our findings provide further evidence that new data streams contain accurate information about our collective behaviour and the current state of our society.

Recommended citation: Bannister, A., & Botta, F. (2021). Rapid indicators of deprivation using grocery shopping data. Royal Society Open Science, 8(12), 211069. https://royalsocietypublishing.org/doi/10.1098/rsos.211069

Quantifying the differences in Call Detail Records

Published in Royal Society Open Science, 2021

The increasing availability of mobile phone data has attracted the attention of several researchers interested in studying our collective behaviour. Our interactions with the phone network can take several forms, from SMS messages to phone calls and data usage. Typically, mobile phone data are released to researchers in the form of call detail records, which contain records of different types of interactions, and can be used to analyse various aspects of our behaviour. However, the inherently behavioural nature of these interactions may result in differences between how we make phone calls and receive text messages. Studies which rely on data derived from these interactions, therefore, need to carefully consider these differences. Here, we aim to investigate the differences between, and limitations of, different types of mobile phone interaction data by analysing a large mobile phone dataset. We study the relationship between different types of interactions and show how it changes over time. We anticipate our findings to be of interest to all researchers working in the area of computational social science.

Recommended citation: Botta, F. (2021). Quantifying the differences in call detail records. Royal Society Open Science, 8(6), 201443.

Modelling urban vibrancy with mobile phone and OpenStreetMap data

Published in PLOS ONE, 2021

The concept of urban vibrancy has become increasingly important in the study of cities. A vibrant urban environment is an area of a city with high levels of human activity and interactions. Traditionally, studying our cities and what makes them vibrant has been very difficult, due to challenges in data collection on urban environments and people’s location and interactions. Here, we rely on novel sources of data to investigate how different features of our cities may relate to urban vibrancy. In particular, we explore whether there are any differences in which urban features make an environment vibrant for different age groups. We perform this quantitative analysis by extracting urban features from OpenStreetMap and the Italian census, and using them in spatial models to describe urban vibrancy. Our analysis shows a strong relationship between urban features and urban vibrancy, and particularly highlights the importance of third places, which are urban places offering opportunities for social interactions. Our findings provide evidence that a combination of mobile phone data with crowdsourced urban features can be used to better understand urban vibrancy.

Recommended citation: Botta, F., & Gutiérrez-Roig, M. (2021). Modelling urban vibrancy with mobile phone and OpenStreetMap data. PLOS ONE, 16(6), e0252015. https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0252015

In search of art: rapid estimates of gallery and museum visits using Google Trends

Published in EPJ Data Science, 2020

Measuring collective human behaviour has traditionally been a time-consuming and expensive process, impairing the speed at which data can be made available to decision makers in policy. Can data generated through widespread use of online services help provide faster insights? Here, we consider an example relating to policymaking for culture and the arts: publicly funded museums and galleries in the UK. We show that data on Google searches for museums and galleries can be used to generate estimates of their visitor numbers. Crucially, we find that these estimates can be generated faster than traditional measurements, thus offering policymakers early insights into changes in cultural participation supported by public funds. Our findings provide further evidence that data on our use of online services can help generate timely indicators of changes in society, so that decision makers can focus on the present rather than the past.

Recommended citation: Botta, F., Preis, T., & Moat, H. S. (2020). In search of art: rapid estimates of gallery and museum visits using Google Trends. EPJ Data Science, 9(1), 14. https://epjdatascience.springeropen.com/articles/10.1140/epjds/s13688-020-00232-z

Measuring the size of a crowd using Instagram

Published in Environment and Planning B: Urban Analytics and City Science, 2019

Measuring the size of a crowd in a specific location can be of crucial importance for crowd management, in particular in emergency situations. Here, using two football stadiums as case studies, we present evidence that data generated through interactions with the social media platform Instagram can be used to generate estimates of the size of a crowd. We present a detailed analysis of the impact of varying the time period and spatial area considered for the collection of Instagram data. Crucially, we demonstrate how to address issues that arise from changes in the usage of a social media platform such as Instagram. Our findings show how social media datasets carrying location-based information may help provide near to real-time measurements of the size of a crowd.

Recommended citation: Botta, F., Moat, H. S., & Preis, T. (2019). Measuring the size of a crowd using Instagram. Environment and Planning B: Urban Analytics and City Science, 2399808319841615.

Analysis of the communities of an urban mobile phone network

Published in PLOS ONE, 2017

Being able to characterise the patterns of communications between individuals across different time scales is of great importance in understanding people’s social interactions. Here, we present a detailed analysis of the community structure of the network of mobile phone calls in the metropolitan area of Milan revealing temporal patterns of communications between people. We show that circadian and weekly patterns can be found in the evolution of communities, presenting evidence that these cycles arise not only at the individual level but also at that of social groups. Our findings suggest that these trends are present across a range of time scales, from hours to days and weeks, and can be used to detect socially relevant events.

Recommended citation: Botta, F., & del Genio, C. I. (2017). Analysis of the communities of an urban mobile phone network. PLOS ONE, 12(3), e0174198. https://doi.org/10.1371/journal.pone.0174198

Finding network communities using modularity density

Published in Journal of Statistical Mechanics: Theory and Experiment, 2016

Many real-world complex networks exhibit a community structure, in which the modules correspond to actual functional units. Identifying these communities is a key challenge for scientists. A common approach is to search for the network partition that maximizes a quality function. Here, we present a detailed analysis of a recently proposed function, namely modularity density. We show that it does not incur the drawbacks suffered by traditional modularity, and that it can identify networks without ground-truth community structure, deriving its analytical dependence on link density in generic random graphs. In addition, we show that modularity density allows an easy comparison between networks of different sizes, and we also present some limitations that methods based on modularity density may suffer from. Finally, we introduce an efficient, quadratic community detection algorithm based on modularity density maximization, validating its accuracy against theoretical predictions and on a set of benchmark networks.
An implementation of the algorithm presented in this paper is available on my GitHub page.

Recommended citation: Botta, F., & del Genio, C. I. (2016). Finding network communities using modularity density. Journal of Statistical Mechanics: Theory and Experiment, 2016(12), 123402. https://arxiv.org/pdf/1612.07297.pdf

Quantifying stock return distributions in financial markets

Published in PLOS ONE, 2015

Being able to quantify the probability of large price changes in stock markets is of crucial importance in understanding financial crises that affect the lives of people worldwide. Large changes in stock market prices can arise abruptly, within a matter of minutes, or develop across much longer time scales. Here, we analyze a dataset comprising the stocks forming the Dow Jones Industrial Average at a second-by-second resolution in the period from January 2008 to July 2010 in order to quantify the distribution of changes in market prices at a range of time scales. We find that the tails of the distributions of logarithmic price changes, or returns, exhibit power law decays for time scales ranging from 300 seconds to 3600 seconds. For larger time scales, we find that the distribution tails exhibit exponential decay. Our findings may inform the development of models of market behavior across varying time scales.

Recommended citation: Botta, F., Moat, H. S., Stanley, H. E., & Preis, T. (2015). Quantifying stock return distributions in financial markets. PLOS ONE, 10(9), e0135600. https://doi.org/10.1371/journal.pone.0135600

Quantifying crowd size with mobile phone and Twitter data

Published in Royal Society Open Science, 2015

Being able to infer the number of people in a specific area is of extreme importance for the avoidance of crowd disasters and to facilitate emergency evacuations. Here, using a football stadium and an airport as case studies, we present evidence of a strong relationship between the number of people in restricted areas and activity recorded by mobile phone providers and the online service Twitter. Our findings suggest that data generated through our interactions with mobile phone networks and the Internet may allow us to gain valuable measurements of the current state of society.

Recommended citation: Botta, F., Moat, H. S., & Preis, T. (2015). Quantifying crowd size with mobile phone and Twitter data. Royal Society Open Science, 2(5), 150162. https://doi.org/10.1098/rsos.150162