Alfonso Semeraro

Via Pessinetto 12 · Torino, TO · alfonso.semeraro@gmail.com

PhD student at the Department of Computer Science of Università degli Studi di Torino.
Among other projects, I am currently working on the spread of misinformation across different websites and social media platforms.
My research interests include complex networks, data mining and social media analysis.

Education

Università degli Studi di Torino

Master's Degree in Computer Science

Thesis in Financial Networks

2014 - 2016

Università degli Studi di Torino

Bachelor's Degree in Computer Science

Thesis in Emotional and Ethical Agents

2010 - 2013

Experience

PHD CANDIDATE

Università degli Studi di Torino

I am currently working on the problem of the spread of fake news across the web. I try to track and understand the whole process, from the fabrication of a piece of content to its sharing across different platforms, while also identifying the most active and malicious users and searching for signs of automation and coordination. I am also working on other projects, such as financial networks and the evolution of the popularity of IT technologies.

October 2018 - Present

RESEARCH GRANT

Università degli Studi di Torino

I continued working on the joint project with Intesa Sanpaolo – Innovation Center on network analysis tools and techniques applied to financial data. I also led a group of four colleagues in the NATO StratCom Challenge 2018, responding to a call for proposals against the spread of misinformation conveyed by visual content. Our project reached the final, held in Riga on December 10th, 2018.

March 2018 - March 2019

RESEARCH FELLOWSHIP

Università degli Studi di Torino

I worked as a research fellow on a project between the University of Turin and Banca Intesa Sanpaolo – Innovation Center, aimed at gaining insight into the network that arises from the micropayments made by or to ISP bank accounts. Thanks to the huge volume and fine granularity of the data, this analysis let us understand the structure and dynamics of a relevant part of the Italian economic system.

January 2017 - March 2018

Works

[Jan 2019 – ongoing]
Facing the problem of visual misinformation: an actor-based approach.

Multimedia fake news is a new, challenging problem. Unlike for textual hoaxes, there is no search engine that allows one to reconstruct the network of shares of a picture online. An actor-based approach can shrink the perimeter in which to search, highlighting the presence of malicious, coordinated actors.
This project was selected as one of the three finalists at the NATO StratCom Challenge 2018.

The spread of disinformation through fabricated or distorted multimedia content is a challenging problem with no simple solution, as most current approaches fail to tell whether a picture conveys a false claim or not. An actor-based approach can at least help to delimit a perimeter within which to search for fabricated content, and to reconstruct how a picture came to be shared. In order to highlight hidden but willful fake news broadcasters on social media, we employed a simple methodology: first we crawled well-known fake news providers in search of a network of mutual references involving other, unknown fake news providers, then we collected all the tweets from Twitter users who posted content originating from those websites. We also developed an image similarity search engine to help analysts reconstruct the network of shares of a picture, linking together all the accounts that shared the same piece of content even when there is no direct retweet.

[Jan 2019 – ongoing]
Which one is fake news? Intrinsic features and externalities in news perception.

How do we tell whether a news story is real or fabricated? There are intrinsic features (like the source or the lexicon) and extrinsic features (like what people say about the news), and they both affect the way we perceive a piece of information, shaping our opinion about what is fake or not.

How do people tell a fabricated story from a real, documented news item? Is the lexicon a hint? Do people read the whole article, or just its title? Does the source matter? Our opinion can be influenced not only by objective features intrinsic to the news, but also by externalities: as shown several times, social influence can shape our judgment, pushing us to adhere to other people's opinions. We are working on a platform in which users are asked to tell whether a reported news item is fake or not, while we secretly and randomly change, user by user, which features are shown: the full article rather than a short abstract, the source, and other people's feedback (or random noise) about the news. The impact of these changes on users' accuracy scores is yet to be evaluated. This study may shed light on what matters most when we form an opinion about what we read online.

[Jan 2019 – ongoing]
On predicting the popularity of a programming language from StackOverflow and GitHub activity.

Programming skills in a given language are a resource for IT companies, but they can also be an expensive asset to acquire when a company changes its language portfolio. Forecasting the future success of a programming language could save considerable money in such transitions.

How does a programming language spread over time, gaining attention year by year, until it becomes a skill required in many job offers? There are undeniable advantages for a company in adopting a language backed by a sustained community of developers, which in turn makes the language even more popular and supported in a rich-get-richer process, but it is hard to tell at an early stage which programming language will be the next big thing on the market. StackOverflow and GitHub are two very popular platforms supporting programmers' activity, the former by hosting a question-and-answer service about coding, the latter by allowing programmers to back up and share their code. The number of posts and repositories about a programming language can be used as a proxy for the interest around it and for how that interest evolves year by year; important insights can also be derived by analyzing the network of co-occurrences of coders across StackOverflow posts and GitHub repositories. Some of the repositories are marked as owned by companies, allowing for an analysis of their influence on the market. After collecting data about all the activity on both platforms from 2011 to 2019, we aim to find early indicators of the future success (or decline) of a programming language.

[Jan 2017 – Mar 2017]
Behavioral patterns in a financial transaction network.

We analyzed a dataset provided by Banca Intesa Sanpaolo, listing 253 million wire transfers exchanged among 27 million accounts. A structural analysis highlights the role of small and medium businesses as the class that bridges the connections between private customers and corporations.

In the context of a joint project between the University of Turin and Banca Intesa Sanpaolo, one of the most important Italian banks, we analyzed a huge anonymized dataset containing all payments made by or to customers during 2016. We built a network where each node represents a bank account and each edge a wire transfer, and here we show the results of an early network analysis. Nodes in the network show different behaviors according to their degree: big hubs (mostly companies and institutions) are not connected to one another and share most of their connections with smaller nodes, while small nodes (mostly private customers) send and receive payments to and from both high- and low-degree nodes; they are indeed more clustered, probably in families and local groups. A bow-tie analysis shows the role of small and medium businesses as the class of nodes holding together the connectivity of the whole network.