Research

Scientific-disciplinary sector of activity: INF_01 - INFORMATICA (Computer Science)

Research interests

Data collected in most application domains are rich in volume and diversity. Consequently, generating value out of these rich and diverse data sets shares the 3V challenges ([V]olume, [V]elocity, and [V]ariety) of the so-called “Big Data” applications. In order to support effective knowledge discovery, we must tackle additional, more specific challenges, including those posed by the [H]igh-dimensional, [M]ulti-modal (temporal, spatial, hierarchical, and graph-structured), and inter-[L]inked nature of most multimedia data, as well as the [I]mprecision of the media features and the [S]parsity of the observations in the real world. Moreover, since the end-users for most multimedia data exploration tasks are [H]uman beings, we need to consider additional fundamental constraints arising from the difficulties they face in providing unambiguous specifications of interest or preference, the subjectivity in their interpretations of results, and their limitations in perception and memory. Last but not least, since a large portion of multimedia data is human-centered, we also need to account for the users’ (and others’) needs for [P]rivacy.

My research is partly supported by projects funded at the University of Torino, and partly supported by projects funded at the Arizona State University, in collaboration with Prof. K. Selcuk Candan.

The different projects we are involved in support our claim that different domains and disciplines, apparently far from each other (such as Building Energy Consumption Analysis and the Study of Infectious Disease Propagation), can benefit from "smart data oriented" fundamental technological innovations.

My major interests are in the following areas:

Scalable social media analysis.
Relevant ongoing projects:
MeSoOnTV: a media and social-driven ontology-based TV knowledge management system (Funded by RAI-CRIT)
Searching, browsing and analyzing web content is a more challenging problem today than in the early days of the Internet. This is because web content is multimedia, social and dynamic. Moreover, the concepts referred to by videos, news, comments, and posts are implicitly linked by the fact that people on the Web talk about something, somewhere, at some time, and these connections may change as the perception of users on the Web changes over time. The goal of the project is to define a model and develop a corresponding system for the integration of the heterogeneous and dynamic data coming from different knowledge sources (broadcasters' archives, online newspapers, blogs, web encyclopedias, social media platforms, social networks, etc.).

MIMOSA: Ontology-driven query system for the heterogeneous data of a SmArt City (Funded by the Compagnia di San Paolo, through the University of Torino)
“Mappe di Comunità 3.0” (Community Maps 3.0) is a participatory GIS prototype platform enabling communities of interest, governments and the general public to interact with multi-dimensional information spaces in order to retrieve and upload geo-referenced information distributed across existing data sources, as well as to discuss the shared contents. MIMOSA aims at extending “Mappe di Comunità 3.0” with advanced information search facilities in order to enable people to interact with the 3D Community Map and to select the content to be visualized by using a clear and dynamic user interface. We will design and develop a proof-of-concept semantic engine that will exploit a multimodal domain ontology enabling the expression of the heterogeneous semantic contents related to the application domains, as well as the summaries of the multimedia data and external sources of information, to improve the users’ experience in browsing and searching data. A key component of the semantic engine will be a personalization module, which will combine the available integrated open data, the domain ontology, and the available private information characterizing users’ profiles and interaction histories in order to manage a personalized shared information space.

RanKloud: Data Partitioning and Resource Allocation Strategies for Scalable Multimedia and Social Media Analysis (Funded by NSF)
Today, multimedia data are produced and consumed in massive quantities in a broad range of applications with significant economic and societal benefit, including e-commerce, surveillance, education, web services, and social media. Hence, there is an urgent need for systems to provide highly scalable processing and efficient analysis of large media data collections. The RanKloud prototype system, developed in this research project, focuses on the needs and requirements of applications that deal with large quantities of multimedia data in a cloud-based scalable environment.

Modeling and analysis of the spread of infectious diseases
Relevant ongoing project:
Data Management for Real-Time Data Driven Epidemic Spread Simulations (Funded by NSF)
The speed with which recent pandemics had immense global impact highlights the importance of real-time response and public health decision making, at both local and global levels. Existing software that enables model-driven epidemics and computer simulations for disease spreading helps predict geo-temporal evolution of non-pharmaceutical control measures and interventions, relying on data and models including social contact networks, local and global mobility patterns of individuals, transmission and recovery rates, and outbreak conditions. If effectively leveraged, models reflecting past outbreaks, existing simulation traces obtained from simulation runs, and real-time observations incoming during an outbreak can be collectively used for obtaining a better understanding of the epidemic's characteristics and the underlying diffusion processes, forming and revising models, and performing exploratory, if-then hypothetical analyses of epidemic scenarios. We design and develop an epidemic simulation data management system (epiDMS) which addresses the computational challenges that arise from the need to acquire, model, analyze, index, visualize, search, and recompose, in a scalable manner, large volumes of data that arise from observations and simulations during a disease outbreak.

Understanding the Evolution Patterns of the Ebola Outbreak in West-Africa and Supporting Real-Time Decision Making and Hypothesis Testing through Large Scale Simulations (Funded by NSF)
Global epidemic propagation occurs at multiple (local and global) scales: individuals within a subpopulation may be infected through local contacts during a local outbreak. These individuals may then carry the infection to a new region of the world, starting a new outbreak. Thus, disease spread simulations require data and models, including social contact networks, local and global mobility patterns of individuals, transmission and recovery rates, and outbreak conditions. Effectively managing the current emergency through real-time and continuous decision making requires computational models specifically tailored to the spatio-temporal dynamics of Ebola, as well as data- and model-driven computer simulations of its spread. Tools that help run and interpret Ebola simulation ensembles (aligned with real-world observations) to generate timely, actionable results are critically needed. Given the urgency of this particular epidemic and the critical need for the development of the necessary models and tools specific to Ebola, this project focuses on Ebola transmission dynamics and control, specifically targeting products and processes for this Ebola epidemic. The research will result in novel algorithms and tools specially tailored for officials to continuously assess the impacts of different intervention scenarios and revise estimates based on real-world data, at local and global scales, for the Ebola epidemic.
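As a rough illustration of the two scales described above (local contacts within a subpopulation, mobility between regions), the following is a minimal sketch of a deterministic two-patch SIR model. It is not the model used in the project: the function names, the parameter values (beta, gamma, travel_rate), and the population sizes are all hypothetical, chosen only to show how a transmission rate, a recovery rate, and a mobility rate interact.

```python
def step_patch(s, i, r, beta, gamma, dt):
    """Advance one patch by one Euler step of the standard SIR equations."""
    n = s + i + r
    new_infections = beta * s * i / n * dt   # local contacts inside the patch
    new_recoveries = gamma * i * dt
    return (s - new_infections,
            i + new_infections - new_recoveries,
            r + new_recoveries)


def simulate(beta=0.3, gamma=0.1, travel_rate=0.001, days=200, dt=0.1):
    """Simulate two patches coupled by symmetric travel.

    All default values are illustrative assumptions, not fitted parameters.
    Returns a list of snapshots; each snapshot holds one (S, I, R) tuple
    per patch.
    """
    # Patch 0 starts with a few infected individuals; patch 1 is disease-free.
    patches = [[9990.0, 10.0, 0.0], [10000.0, 0.0, 0.0]]
    history = []
    for _ in range(round(days / dt)):
        # Local scale: each patch evolves through its own contacts.
        patches = [list(step_patch(*p, beta, gamma, dt)) for p in patches]
        # Global scale: a small fraction of every compartment travels
        # to the other patch during each step.
        moved = [[travel_rate * dt * x for x in p] for p in patches]
        for k in range(3):
            patches[0][k] += moved[1][k] - moved[0][k]
            patches[1][k] += moved[0][k] - moved[1][k]
        history.append([tuple(p) for p in patches])
    return history


if __name__ == "__main__":
    run = simulate()
    peak_in_second_patch = max(snapshot[1][1] for snapshot in run)
    print(f"Peak infected in the initially disease-free patch: "
          f"{peak_in_second_patch:.0f}")
```

With travel_rate set to 0 the second patch never sees an infection, which is the simplest way to observe the effect of mobility in this sketch.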

Data analysis in the context of building energy modelling.
Relevant ongoing project:
E-SDMS: Energy Simulation Data Management System Software (Funded by NSF)
Existing building energy management systems (BEMSs) need to integrate large volumes of data, including (a) continuously collected heating, ventilation, and air conditioning (HVAC) sensor and actuation data, (b) other sensory data, such as occupancy, humidity, lighting levels, air speed and quality, (c) architectural, mechanical, and building automation system configuration data for these buildings, (d) local weather and GIS data that provide contextual information, as well as (e) energy price, consumption, and cost data from electricity (such as smart grid) and gas utilities. We design and develop the energy simulation data management system (e-SDMS) software, addressing challenges that arise from the need to model, index, search, visualize, and analyze, in a scalable manner, large volumes of multi-variate series resulting from observations and simulations of energy data. e-SDMS will, therefore, fill an important gap in data-driven building design and clean energy (an area of national priority) and will enable applications and services with significant economic and environmental impact.

Internet of Things
Relevant projects:
Capability Assurance for Smart Living (Funded by Intel Corporation, started in September 14).
This Joint Path Finding (JPF) project's goal is to develop and mature the technologies needed to create and sustain a smart living environment. Specifically, this JPF will study, within one year, the feasibility of the Internet of Things (IOT) as a class of intelligent devices to enable the smart living environment. Intel, ASU, and DCU have joined as a team to carry out this JPF with two objectives: one is to conduct research in IOT related technologies to enable smart living, and the other is to mature technologies through proof-of-concept (POC) demonstrations and pilot deployments. The team will use ASU’s Sun Devil stadium renovation project as the usage scenario to focus the JPF. The anticipated benefit of the JPF is two-fold: to accelerate the deployment of IOT technologies at the Sun Devil stadium and to establish smart living laboratories at ASU and DCU with Intel’s participation and guidance.

TA_SL: Tecnologie Abilitanti per la Sicurezza sul Lavoro (Enabling Technologies for Workplace Safety; funded by Regione Piemonte, ended in December 2013)
The project's focus is on the design and development of an innovative integrated technological platform, including both hardware and software components, to support, automate, rationalize, and make more efficient the management and assurance of risk prevention policies for workers, with special attention to the prevention of risks faced by workers in the construction domain.

Assistive technologies and accessibility to educational material
Methodologies, technologies, materials and activities for accessible and inclusive mathematics learning
Project funded by Fondazione CRT, coordinated by the Department of Mathematics of the University of Torino.

Foundational techniques for time series analysis, indexing, classification and summarization