Ph.D. courses a.y. 2024/2025 - Area Dottorato

A basic and concise introduction to Topological Data Analysis

Lecturer: Patrizio Frosini (UNIPI)

Contact: patrizio.frosini@unipi.it

Schedule:

JAN 8th 11am-1pm sala seminari est

JAN 10th 11am-1pm sala seminari est

JAN 13th 11am-1pm sala seminari est

JAN 15th 11am-1pm sala seminari est

JAN 17th 11am-1pm sala seminari est

JAN 20th 11am-1pm sala seminari est

JAN 22nd 11am-1pm sala seminari est

JAN 24th 11am-1pm sala seminari est

Topological Data Analysis (TDA) is a mathematical framework focused on studying and quantifying the “shape” of data. Its primary goal is to describe and measure the similarity in datasets by using distances, particularly when equivalences are defined through geometric transformations. Additionally, TDA is highly effective for reducing the dimensionality of data, making it easier to analyze and compare. It can also be utilized in geometric machine learning, and its approach can be applied to a wide range of data types, including time series, 2D and 3D objects, and point clouds. Throughout the course, fundamental concepts required for a basic understanding of TDA will be introduced, with a focus on practical and computational examples, rather than formal mathematical theory.

Topics:

Equivalence and non-equivalence of data with respect to the action of a group of transformations.
Simplicial complexes as a generalization of the concept of a graph and as a geometric representation of data described by point clouds in Euclidean spaces.
Simplicial homology groups as a method for representing the “shape” of a simplicial complex derived from a point cloud.
The need to adapt homology to the observer’s point of view and the presence of noise: an introduction to persistent homology and persistence diagrams.
Stability of persistence diagrams in the presence of noise.
Applications of persistent homology.
From the shape of data to the shape of observers: the concept of a Group Equivariant Non-Expansive Operator (GENEO).
The problem of approximating observers in the space of GENEOs.
GENEOs as a geometric method for reducing the number of parameters in neural networks and increasing their interpretability.
Application of GENEO theory to identify pockets in proteins and the implementation of GENEO networks for geometric Machine Learning.
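
As a rough illustration of the persistent homology topic listed above, the following minimal Python sketch computes the 0-dimensional persistence of a point cloud, i.e. the scales at which connected components merge as the distance threshold grows. The function name `zero_dim_persistence` and the sample points are assumptions made for this illustration and are not taken from the course material.

```python
import math
from itertools import combinations

def zero_dim_persistence(points):
    """Return the death scales of connected components (all are born at scale 0).

    An edge between two points appears at scale equal to their Euclidean distance;
    a component dies when it is merged into another one.
    """
    n = len(points)
    parent = list(range(n))  # union-find forest over the points

    def find(i):
        # Find the representative of i, with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # All pairwise edges, sorted by the scale at which they appear.
    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in combinations(range(n), 2)
    )

    deaths = []
    for d, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:          # the edge merges two components
            parent[ri] = rj
            deaths.append(d)  # one component dies at scale d
    return deaths  # n - 1 finite deaths; one component never dies

# Hypothetical point cloud: two well-separated pairs of points on a line.
points = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0), (11.0, 0.0)]
deaths = zero_dim_persistence(points)
```

In this toy example, two components die early (at scale 1.0) and one dies late (at scale 9.0); the long-lived component reflects the two clusters in the data, while short-lived ones are the kind of feature usually attributed to noise.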

Distributed Ledger Technology data: management and analysis

Lecturers: Damiano Di Francesco Maesa (UNIPI), Matteo Loporchio (UNIPI)

Contact: damiano.difrancesco@unipi.it, matteo.loporchio@phd.unipi.it

Schedule

JAN 27th 2pm-5pm sala seminari est

JAN 28th 10am-1pm sala seminari est

JAN 29th 10am-1pm sala seminari est

JAN 30th 10am-1pm sala seminari est

JAN 31st 10am-1pm sala seminari est

JAN 31st 2pm-4pm sala seminari est

The goal of this course is to present how data is managed (represented, secured, and retrieved) in systems based on Distributed Ledger Technology (DLT), and how that data can be analysed to study the ecosystems these systems support. The course will start with an introduction to the concepts behind DLT, including its main implementations, novel properties, and innovative applications. We will then present the two most famous blockchain protocols, Bitcoin and Ethereum, outlining how they manage their internal transaction data. This same data will be the focus of the following lectures, which show how to represent and analyse it through graphs. The course will close by presenting how authenticated data structures can be leveraged to enhance such data management.

The final exam will be either a project or a seminar, depending on each student's preference.
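
To give a concrete flavour of the authenticated data structures mentioned in the course description above, here is a minimal Python sketch of a Merkle tree with membership proofs. It is only an illustration under simplifying assumptions: it uses a single SHA-256 hash and duplicates the last node on odd-sized levels, which is one common convention; real protocols differ in their details. The function names and the sample transactions are hypothetical.

```python
import hashlib

def h(data: bytes) -> bytes:
    """Hash helper (single SHA-256, a simplifying assumption)."""
    return hashlib.sha256(data).digest()

def _next_level(level):
    # Pair up adjacent nodes; duplicate the last one if the level is odd.
    if len(level) % 2 == 1:
        level = level + [level[-1]]
    return [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]

def merkle_root(leaves):
    """Root hash of a non-empty list of byte strings."""
    level = [h(x) for x in leaves]
    while len(level) > 1:
        level = _next_level(level)
    return level[0]

def merkle_proof(leaves, index):
    """Sibling hashes needed to recompute the root from leaves[index]."""
    level = [h(x) for x in leaves]
    proof = []
    while len(level) > 1:
        if len(level) % 2 == 1:
            level = level + [level[-1]]
        sibling = index ^ 1  # neighbour within the same pair
        proof.append((level[sibling], sibling < index))
        level = _next_level(level)
        index //= 2
    return proof

def verify(leaf, proof, root):
    """Check a membership proof against a trusted root hash."""
    acc = h(leaf)
    for sibling, sibling_is_left in proof:
        acc = h(sibling + acc) if sibling_is_left else h(acc + sibling)
    return acc == root

# Hypothetical transactions.
txs = [b"tx0", b"tx1", b"tx2"]
root = merkle_root(txs)
proof = merkle_proof(txs, 2)
ok = verify(b"tx2", proof, root)
forged = verify(b"txX", proof, root)
```

A client holding only `root` can check that `b"tx2"` belongs to the set using a proof whose size is logarithmic in the number of transactions, which is the kind of property that makes such structures useful for retrieving ledger data from untrusted nodes.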

Pathways to Green ICT

Lecturers: Antonio Brogi (UNIPI), Stefano Forti (UNIPI)

Contact: antonio.brogi@unipi.it, stefano.forti@unipi.it

Schedule

FEB 11th 11am-1pm sala seminari est

FEB 12th 11am-1pm sala seminari est

FEB 13th 11am-1pm sala seminari est

FEB 14th 11am-1pm sala seminari est

MAR 04th 11am-1pm sala seminari est

MAR 05th 11am-1pm sala seminari est

MAR 06th 11am-1pm sala seminari est

MAR 07th 11am-1pm sala seminari est

The course aims at introducing students to the fundamentals of Green ICT, providing them with a toolbox to consider sustainability aspects in their research. The course will introduce:

– The concepts of sustainability and the types of environmental impact of the lifecycle of ICT systems (power consumption, carbon emissions, e-waste)

– Methodologies to assess the environmental impact of ICT systems (from production to operation and maintenance to disposal)

– Methodologies to decrease the environmental impact of ICT systems (orthogonality of QoS and environmental goals, hardware selection and PUE reduction, energy-aware programming, green software engineering, energy-aware system deployment)

– Use cases and open research challenges
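
As a small worked example of the assessment methodologies listed above, the sketch below estimates the operational carbon emissions of a workload as IT energy multiplied by the data centre's PUE and by the carbon intensity of the electricity used. It covers the operation phase only (not production or disposal), and all numeric values are hypothetical placeholders rather than measurements.

```python
def operational_emissions_kg(it_energy_kwh: float,
                             pue: float,
                             carbon_intensity_g_per_kwh: float) -> float:
    """Estimate operational CO2-equivalent emissions in kilograms.

    it_energy_kwh: energy drawn by the IT equipment alone
    pue: Power Usage Effectiveness (total facility energy / IT energy, >= 1)
    carbon_intensity_g_per_kwh: grams of CO2-eq emitted per kWh of electricity
    """
    facility_energy_kwh = it_energy_kwh * pue  # adds cooling, power losses, etc.
    return facility_energy_kwh * carbon_intensity_g_per_kwh / 1000.0

# Hypothetical workload: 100 kWh of IT energy, PUE 1.5, 400 gCO2-eq/kWh.
baseline = operational_emissions_kg(100.0, 1.5, 400.0)
# Same workload in a hypothetical facility with a lower PUE.
improved = operational_emissions_kg(100.0, 1.2, 400.0)
```

Comparing `baseline` and `improved` shows how PUE reduction lowers emissions for the same computation; scheduling the workload where or when carbon intensity is lower acts on the third factor instead.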

Program analysis

Lecturers: Roberto Bruni (UNIPI), Roberta Gori (UNIPI)

Contact: roberto.bruni@unipi.it, roberta.gori@phd.unipi.it

Tentative schedule

MAR 12th 11am-1pm sala seminari est
MAR 13th 11am-1pm sala seminari est
MAR 19th 11am-1pm sala seminari est
MAR 20th 11am-1pm sala seminari est
MAR 26th 11am-1pm sala seminari est
MAR 27th 11am-1pm sala seminari est
APR 02nd 11am-1pm sala seminari est
APR 03rd 11am-1pm sala seminari est

This course offers a focused exploration of formal methods in software development, with some emphasis on the shift of perspective that followed Peter O’Hearn’s influential paper on incorrectness logic. Instead of exploiting over-approximations to prove program correctness, as is done in classical formal methods, incorrectness reasoning exploits under-approximations to expose true bugs.

The overall goal of incorrectness methods is to develop principled techniques to assist programmers with timely feedback about the presence of true errors, with few or zero false alarms.

The course will survey different approaches, such as program logics, pointer analysis, and abstract interpretation, for both over- and under-approximation, as well as their combination.

Analysis techniques for transfer learning in Neural Tangent Kernel Regime
Lecturers: Pietro Cassarà (CNR-ISTI), Dario Trevisan (UNIPI)

Contact: pietro.cassara@isti.cnr.it, dario.trevisan@unipi.it

Schedule

APR 28th 4pm-6pm sala seminari est

APR 29th 4pm-6pm sala seminari est

APR 30th 4pm-6pm sala seminari est

May 5th 4pm-6pm sala seminari est

May 7th 4pm-6pm sala seminari est

May 8th 4pm-6pm sala seminari est

May 12th 4pm-6pm sala seminari est

May 15th 4pm-6pm sala seminari est

Knowledge transfer learning consists of training a simpler model to mimic the output of a more complex one, possibly using heterogeneous information. This approach is investigated because some results show that transfer learning speeds up the training process and improves the generalization of a new learning model that uses the soft labels generated by the complex model. This feature makes this kind of technique suitable for semi-supervised and unsupervised learning and for distributed learning applications. Although transfer learning is widely used in application fields such as networking and decision support systems, no satisfactory theoretical explanation has yet been found; consequently, there is a lack of design techniques for this type of learning model. The course focuses on the mathematical tools that can be exploited for the theoretical analysis of transfer learning, starting with results based on spectral analysis.

Programming Tools and Techniques in the Pervasive Parallelism Era

Lecturers: Marco Danelutto (UNIPI), Patrizio Dazzi (UNIPI)

Contact: marco.danelutto@unipi.it, patrizio.dazzi@unipi.it

Schedule

May 6th 11am-1pm sala seminari est

May 7th 11am-1pm sala seminari est

May 8th 11am-1pm sala seminari est

May 9th 11am-1pm sala seminari est

May 12th 11am-1pm sala seminari est

May 13th 11am-1pm sala seminari est

May 14th 11am-1pm sala seminari est

May 15th 11am-1pm sala seminari est

The course covers techniques and tools (already existing or in the process of moving into the mainstream) that support the implementation of efficient parallel/distributed applications targeting small-scale parallel systems as well as larger-scale parallel and distributed systems, possibly equipped with different kinds of accelerators. The course follows a methodological approach to provide a homogeneous overview of classical tools and techniques as well as of new tools and techniques specifically developed for new, emerging architectures and application domains. Perspectives in the direction of reconfigurable coprocessors and domain-specific architectures will also be covered.

3D Geometry Representation and Processing for Deep Learning
Lecturers: Paolo Cignoni, Massimiliano Corsini, Daniela Giorgi, Luigi Malomo (CNR-ISTI)

Contact: paolo.cignoni@isti.cnr.it, massimiliano.corsini@isti.cnr.it, daniela.giorgi@isti.cnr.it, luigi.malomo@isti.cnr.it

Schedule

MAY 07th       9am-11am       sala seminari est
MAY 07th       2pm-4pm        sala seminari est
MAY 09th       9am-11am       sala seminari est
MAY 09th       2pm-4pm        sala seminari est
MAY 28th       9am-10am       sala seminari est
MAY 28th       2pm-5pm        sala seminari est
MAY 30th       9am-11am       aula FIB-M1
MAY 30th       2pm-4pm        aula FIB-M1

Computer Graphics and Geometry Processing are the main disciplines dealing with 3D data such as meshes and point clouds. In turn, Artificial Intelligence and Deep Learning are fundamental paradigms for managing visual data. Nevertheless, applying traditional learning paradigms to 3D data requires rethinking architectural building blocks designed for 2D images, such as convolution and pooling operators, as well as attention layers.
In this course, we will introduce different representations for 3D data, and basic geometry processing techniques that intervene in deep learning pipelines (sampling, remeshing, conversion, …). Then, we will introduce methods able to learn tasks on 3D data. We will describe different architectures to process complex geometric domains, and the novel mechanisms introduced in the literature to preserve by design their intrinsic properties. Examples include graph learning techniques, augmented with geometric and topological information; attention modules to process unordered point sets and mesh data; transformer-like architectures for unstructured data.

In the second part of the course, we will discuss different applications where the interplay between Computer Graphics/Geometry Processing and Deep Learning is producing exciting results, including Computational Fabrication, Architectural Geometry, and Environmental Monitoring.

Computational Modeling for Systems Biology

Lecturers: Paolo Milazzo (UNIPI), Silvia Galfré (UNIPI)

Contact: paolo.milazzo@unipi.it

Schedule:

MAY 27th 10am-1pm sala seminari est

MAY 28th 10am-1pm sala seminari est

MAY 29th 2pm-4pm sala seminari est

JUN 03rd 10am-1pm sala seminari est

JUN 04th 10am-1pm sala seminari est

JUN 05th 2pm-4pm sala seminari est

The course will deal with several aspects of the in-silico analysis of dynamical properties of biological systems. We will focus, in particular, on mechanistic modeling approaches that aim to create executable representations of the biological mechanisms and processes underlying cell functioning. After providing a few notions of biochemistry and cell biology, we will examine modeling methods for gene regulatory networks, with particular emphasis on Boolean network models and rule-based approaches. Next, we will present approaches suitable for the analysis of metabolic and cell-signaling processes, ranging from differential equations to stochastic modeling and simulation methods and to hybrid approaches. Finally, we will briefly survey emerging methods in computational structural biology, such as methods for protein structure prediction and molecular dynamics simulation, and we will discuss how these techniques could be integrated with the previous ones in order to evaluate the impact of protein mutations on cell functioning.
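
To illustrate what an executable Boolean network model of a gene regulatory network can look like, here is a minimal Python sketch with synchronous updates. The three genes and their regulation rules are invented for this illustration and do not describe any real biological system.

```python
def step(state):
    """One synchronous update of a toy three-gene Boolean network.

    Hypothetical rules: A is repressed by C, B is activated by A,
    and C requires both A and B.
    """
    a, b, c = state
    return (not c, a, a and b)

def find_attractor(update, state):
    """Iterate the network from `state` until a state repeats.

    Returns the list of states forming the attractor (a fixed point
    if the list has length 1, a cycle otherwise).
    """
    seen = {}        # state -> position in the trajectory
    trajectory = []
    while state not in seen:
        seen[state] = len(trajectory)
        trajectory.append(state)
        state = update(state)
    return trajectory[seen[state]:]

# Start with only gene A active.
attractor = find_attractor(step, (True, False, False))
```

With these rules the network never settles into a fixed point: starting from only gene A active, it oscillates through a cycle of five states. Attractors of this kind are the long-term behaviours that Boolean models are typically used to study.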

Challenges in Modern Information Retrieval

Lecturers: Franco Maria Nardini, Cosimo Rulli, Salvatore Trani (CNR-ISTI), Rossano Venturini (UNIPI)

Contact: francomaria.nardini@isti.cnr.it

Schedule:

June 24th, 25th, 26th, 27th 9am-1pm sala seminari est

This PhD course focuses on Information Retrieval and discusses the state-of-the-art and the challenges in the two main areas of Web search: i) indexing and ii) query processing. The course introduces each area by discussing the state of the art in the field and by presenting the open research questions. The course emphasizes query processing, a research line where machine learning is important to advance the state of the art. After introducing the different query processing techniques, the course introduces supervised techniques explicitly focused on targeting the ranking problem and discusses several time and space efficiency/effectiveness trade-offs in query processing. The course will also provide an in-depth analysis of query processing techniques employing transformer-based large language models. Four hands-on sessions will cover indexing and query processing of public Web collections.

Course Contents

Modern Information Retrieval (4 hours)

o The web: history, peculiarities, and the importance of search.

o Data structures for indexing Web documents

o Modern techniques for efficient text retrieval

o Data compression for integers, sequences of integers, and vectors

o Challenges in indexing the Web

o Hands-On: Indexing and basic query processing on a public Web collection

Machine learning in modern query processors (4 hours)

o Machine learning approaches for IR: Learning to Rank

o Efficiency/effectiveness trade-offs and cascading architectures

o Hands-On: Learning to rank for efficient Web search

Neural Information Retrieval I (4 hours)

o Neural information retrieval

o The role of transformers in modern Web Search

o Interaction-based methods vs. representation-based methods

o Efficient query processing with interaction-based methods

o Hands-On: Deep neural networks for efficient Web search

Neural Information Retrieval II (4 hours)

o Transformer-based large language models as text encoders

o Sparse, dense, and multi-vector representations

o Data structures for efficient k-NN search and retrieval over learned representations

o Quantization techniques

o Hands-On: Encoding and retrieving over learned sparse representations
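
As a small illustration of two items from the first module above (data structures for indexing documents and compression of integer sequences), the following Python sketch builds a tiny inverted index and compresses a posting list with gap encoding followed by a variable-byte code. It is a sketch under stated assumptions, not the method used in the hands-on sessions: the byte layout shown is one of several variable-byte conventions, and the function names and sample documents are hypothetical.

```python
from collections import defaultdict

def build_index(docs):
    """Map each term to the sorted list of ids of the documents containing it."""
    index = defaultdict(list)
    for doc_id, text in enumerate(docs):
        for term in set(text.lower().split()):
            index[term].append(doc_id)  # doc ids arrive in increasing order
    return index

def to_gaps(postings):
    """Replace each doc id by its difference from the previous one."""
    return [p - q for p, q in zip(postings, [0] + postings[:-1])]

def from_gaps(gaps):
    """Inverse of to_gaps: running sum of the gaps."""
    postings, total = [], 0
    for g in gaps:
        total += g
        postings.append(total)
    return postings

def vbyte_encode(numbers):
    """Encode non-negative integers, 7 bits per byte, low-order bits first.

    The high bit is set on the last byte of each number.
    """
    out = bytearray()
    for n in numbers:
        while n >= 128:
            out.append(n & 0x7F)
            n >>= 7
        out.append(n | 0x80)
    return bytes(out)

def vbyte_decode(data):
    """Inverse of vbyte_encode."""
    numbers, n, shift = [], 0, 0
    for byte in data:
        if byte & 0x80:  # last byte of the current number
            numbers.append(n | ((byte & 0x7F) << shift))
            n, shift = 0, 0
        else:
            n |= byte << shift
            shift += 7
    return numbers

# Hypothetical mini-collection.
docs = ["web search engines", "search index compression", "web index"]
index = build_index(docs)

postings = [3, 130, 500]
encoded = vbyte_encode(to_gaps(postings))
decoded = from_gaps(vbyte_decode(encoded))
```

Gap encoding turns a sorted posting list into small integers, and the variable-byte code then stores small integers in fewer bytes; here the three postings occupy four bytes in `encoded`.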