About me

I am an engineer with a strong foundation in information theory, earned during my Bachelor’s degree at the University of Padova, and a deep passion for audio that led me to pursue a Master’s degree in Music and Acoustic Engineering at Politecnico di Milano. My academic journey has given me a solid understanding of deep learning, machine learning, control systems, signal processing, and electronics. Alongside my studies, I have kept developing professionally, working in the IT sector as a full-stack developer, software architect, and DevOps engineer. I am now seeking a role in deep learning, ideally in research or engineering, with a focus on generative AI: my aim is to follow a path that builds on the work done in my Master’s thesis and to further develop my skills in these areas. Feel free to review my resume or my full CV for more details about my background and experience.

Research Activity

Latent Space Regularization via Normalizing Attribute Transformations

This deep learning research project builds on the variational information bottleneck (VIB) framework, a generalization of the well-known VAE. It uses a supervised multi-objective learning approach that introduces a regularization term to encode a particular attribute of the input in a designated dimension of the latent space. This offers precise control over a specific attribute of the generated content by manipulating latent variables, an ability that prompt-based generative models often lack. The novel contribution of this research is an invertible transformation applied to the univariate distribution of the attribute to encode. The goal is to make this distribution as close as possible to the prior distribution chosen for the latent space, on the idea that this helps the multi-objective optimization process by making the new regularization term work in synergy with the Kullback-Leibler divergence in the original VIB loss function. Moreover, because the transformation is invertible, the encoded property can be mapped back to its original domain, which improves model interpretability. The method has been applied in the symbolic music domain, specifically to the task of generating 4-bar melodies, and the dataset created to train the models is publicly available on Zenodo (link above).
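
To make the idea of an invertible attribute transformation concrete, here is a minimal Python sketch. It assumes a Box-Cox power transform followed by standardization; the project description does not say which transformation is actually used, and the exponent `lam`, the class name `AttributeNormalizer` and the sample values are hypothetical.

```python
import math
from statistics import mean, pstdev


def boxcox(x, lam):
    """Box-Cox power transform for x > 0 (lam is a hypothetical, fixed exponent)."""
    return math.log(x) if lam == 0 else (x ** lam - 1.0) / lam


def boxcox_inverse(y, lam):
    """Exact inverse of boxcox: maps a transformed value back to the attribute domain."""
    return math.exp(y) if lam == 0 else (lam * y + 1.0) ** (1.0 / lam)


class AttributeNormalizer:
    """Invertible map from raw attribute values to zero-mean, unit-variance targets."""

    def __init__(self, values, lam=0.0):
        self.lam = lam
        transformed = [boxcox(v, lam) for v in values]
        self.mu = mean(transformed)
        self.sigma = pstdev(transformed)

    def forward(self, value):
        # Power transform, then standardize towards a standard normal prior.
        return (boxcox(value, self.lam) - self.mu) / self.sigma

    def inverse(self, z):
        # Undo the standardization, then undo the power transform.
        return boxcox_inverse(z * self.sigma + self.mu, self.lam)


# Hypothetical right-skewed attribute values, for illustration only.
attribute_values = [0.5, 0.8, 1.0, 1.3, 2.0, 3.5, 6.0, 11.0]
normalizer = AttributeNormalizer(attribute_values, lam=0.0)
targets = [normalizer.forward(v) for v in attribute_values]
recovered = [normalizer.inverse(z) for z in targets]
```

In a setup like this, `targets` would be the values the designated latent dimension is regularized towards, while `inverse` maps a latent coordinate back to an attribute value in its original units.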

Keywords: symbolic music, attribute-controlled generation, data gaussianization

Resolv - Machine Learning Development for Research

Resolv is a system designed to streamline machine learning and deep learning workflows for research purposes. It aims to provide a comprehensive platform that facilitates model development, data processing, and infrastructure management. The system is organized into multiple components, each handling a specific part of the machine learning workflow. The goal is a flexible, scalable and reusable system that supports model experimentation, improves data processing through well-structured pipelines, and automates orchestration using modern tools like Keras, Apache Beam and Apache Airflow. This modular organization clearly separates tasks, making it easier to experiment with new ideas and techniques.

Keywords: deep learning, keras, apache-beam, apache-airflow

Do Unconditional Deep Generative Models Spontaneously Learn How to Encode Human-Interpretable Musical Attributes?

This study investigates whether correlations exist between the topological structure of the latent space and high-level features of the output. The project provides a starting point for the systematic sampling of the pre-trained MusicVAE model, a β-VAE by Magenta. It offers ready-to-use tools for the analysis of the 2-bar and 16-bar pre-trained model configurations and employs Latin Hypercube Sampling to perform latent space regularization, which leads to an explicit link between the output’s characteristics and the locations of the samples within the embedding. The available tools automate sampling, output feature calculation and evaluation. Any correlations uncovered this way could be relevant for future work on refining post hoc conditioning in generative models.
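
As a rough illustration of Latin Hypercube Sampling over a latent space, the Python sketch below draws stratified points in the unit hypercube and maps them to a standard normal prior through the inverse CDF. The function names, the four-dimensional latent space and the sample count are assumptions made for the example and are not taken from the project code.

```python
import random
from statistics import NormalDist


def latin_hypercube(n_samples, n_dims, rng):
    """Return n_samples points in [0, 1)^n_dims with one point per stratum in every dimension."""
    columns = []
    for _ in range(n_dims):
        # One uniform draw inside each of the n_samples equal-width strata.
        strata = [(i + rng.random()) / n_samples for i in range(n_samples)]
        rng.shuffle(strata)  # decouple the stratum order across dimensions
        columns.append(strata)
    # Transpose so that each row is one sample.
    return [list(point) for point in zip(*columns)]


def to_standard_normal(points, eps=1e-12):
    """Map uniform coordinates to a standard normal prior via the inverse CDF."""
    inv_cdf = NormalDist().inv_cdf
    # Clamp away from 0 and 1, where the inverse CDF is undefined.
    return [[inv_cdf(min(max(u, eps), 1.0 - eps)) for u in point] for point in points]


rng = random.Random(0)  # fixed seed for reproducibility
unit_samples = latin_hypercube(n_samples=8, n_dims=4, rng=rng)
latent_samples = to_standard_normal(unit_samples)
```

Each row of `latent_samples` could then be passed to a decoder. Compared with independent random draws, every dimension is covered evenly across its range.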

Keywords: variational autoencoders, latent space topological structure

Fonti 4.0 - Evaluation of the performance of commercial STT and NER services

The FONTI 4.0 project explores the suitability of automatic transcription and information extraction technologies for making historical oral sources available. In this work, we conducted an experiment to test the performance of two commercial speech-to-text services, Google Cloud Speech-to-Text and Amazon Transcribe, on digitized oral sources. We created an eight-hour corpus of manually transcribed and annotated historical speech recordings in TEI format. The results clearly show how audio quality and disturbing elements (e.g., overlaps and foreign words) affect automatic transcription, and indicate what needs to be improved to implement an unsupervised transcription chain.

Keywords: speech-to-text, named-entity-recognition, gcp, aws

Professional Experience

I bring years of experience in software development, architecture, and DevOps, primarily through consultancy roles within the Arsenalia Group, which includes the companies Alpenite and ccelera.

My work has focused on full-stack development, especially on the SAP Hybris Commerce platform, where I delivered B2B and CRM solutions, integrated ticketing and notification systems, and developed mobile capabilities with PWA frameworks. I also took on responsibilities in software architecture, collaborating with international teams and working in London to implement logistics and retail integrations through MuleSoft middleware, using AMQP and REST/SOAP APIs. In my role at ccelera, I continued my full-stack development on SAP Hybris Commerce, working closely with high-profile clients to deliver tailored e-commerce solutions.

Additionally, at Walit, I expanded my expertise in cloud infrastructure by designing and maintaining Google Cloud environments for deep learning projects, setting up CI/CD pipelines in GitLab, and ensuring application security with OWASP ZAP.

Main clients: Kering Eyewear, Stella McCartney, Bonfiglioli, Cellularline and PegPerego

Creative Projects

Ego is a project that explores human perception, in particular identity and self-consciousness and the ways they are distorted and biased. The project has been implemented as a web app to make the experience available to everyone, but it can also be imagined as an artistic installation. The user first sees a cloud of swirling particles while an undefined drone sound plays. Once the user is close enough to be detected by the system, the cloud slowly morphs into a shape resembling their face: the user is now witnessing their identity take shape. A repeating melody is heard, generated from the user’s face, in particular from their somatic traits and their current mood. The intrigued user might now get even closer, in which case the music becomes more and more intense, higher in pitch and faster, representing an ever-growing feedback loop of self-consciousness. In the final phase of Ego, the user sees their face morph into a new shape, reminiscent of a Rorschach test, expressing the distortion of the self caused by the filter of external points of view.

Keywords: three.js, glsl, svelte, mediapipe, max4live, tone.js

Pulseq is a fractal sequencer implemented as a single-page application (SPA) and inspired by the Eurorack module Bloom by Qu-Bit Electronix. It uses a user-constructed sequence to recursively generate a tree of related subsequences, which are strung together to create large musical sequences that relate back to the originally programmed sequence. First, the user programs the main sequence (called the trunk); then the branches and path knobs can be used to control the tree generation. Besides these knobs, the user can play the sequences with eight different sounds and apply three different preset effects: a reverb, a ping-pong delay and a distortion.
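
The recursive tree generation can be sketched in a few lines of Python. This is only an illustration: the `mutate` rule (rotation plus transposition), the tree depth and the note values are hypothetical and do not reproduce how Pulseq or Bloom actually derive their subsequences.

```python
def mutate(sequence, child_index):
    """Hypothetical mutation: rotate the sequence and transpose it by a few semitones."""
    shift = (child_index + 1) % len(sequence)
    rotated = sequence[shift:] + sequence[:shift]
    return [note + 2 * (child_index + 1) for note in rotated]


def build_tree(sequence, branches, depth):
    """Recursively build a tree whose nodes are sequences derived from their parent."""
    children = []
    if depth > 0:
        children = [
            build_tree(mutate(sequence, i), branches, depth - 1)
            for i in range(branches)
        ]
    return {"sequence": sequence, "children": children}


def play_path(tree, path):
    """Concatenate the sequences met while following the child indices in path."""
    notes = list(tree["sequence"])
    node = tree
    for child_index in path:
        node = node["children"][child_index]
        notes.extend(node["sequence"])
    return notes


trunk = [60, 62, 64, 67]  # hypothetical trunk, as MIDI note numbers
tree = build_tree(trunk, branches=2, depth=2)
melody = play_path(tree, path=[1, 0])
```

Here `branches` sets how many children each node gets and `path` selects which route through the tree is strung together, loosely mirroring the role of the two knobs.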

Keywords: svelte, tone.js, glsl

Computer Music Projects

OranJam

OranJam is an audio plugin that implements a polyphonic subtractive synthesizer. The software is implemented in C++ with the aid of JUCE. The DSP chain includes several functional blocks: an oscillator bank, a white noise generator, an ADSR envelope, a filter bank and an LFO. A GUI lets the user control the available parameters, such as the waveform of each oscillator and the cutoff frequency and resonance of each filter.
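
To illustrate one block of such a chain, here is a small Python sketch of a piecewise-linear ADSR envelope. It is not the plugin's C++/JUCE code: the function name, the parameter names and the linear segments are assumptions, and the release time is assumed to be greater than zero.

```python
def adsr_envelope(n_samples, sample_rate, attack, decay, sustain, release, gate_time):
    """Piecewise-linear ADSR gain curve; times in seconds, sustain as a level in [0, 1]."""
    envelope = []
    level_at_release = 0.0
    for n in range(n_samples):
        t = n / sample_rate
        if t < gate_time:  # note held: attack, decay, then sustain
            if t < attack:
                level = t / attack
            elif t < attack + decay:
                level = 1.0 - (1.0 - sustain) * (t - attack) / decay
            else:
                level = sustain
            level_at_release = level  # remember the last level reached while held
        else:  # note released: linear fade to zero
            level = max(0.0, level_at_release * (1.0 - (t - gate_time) / release))
        envelope.append(level)
    return envelope


# Hypothetical settings: 1 s of envelope at 1 kHz, note held for 0.5 s.
envelope = adsr_envelope(
    n_samples=1000, sample_rate=1000,
    attack=0.1, decay=0.1, sustain=0.5, release=0.2, gate_time=0.5,
)
```

Multiplying an oscillator's output sample by sample with `envelope` would shape the amplitude of a single voice.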

Keywords: juce, c++, cmake

HarMMMLonizer

HarMMMLonizer is a real-time harmonizer implemented in SuperCollider. The software implements a DSP system featuring mono input and stereo output. The DSP chain includes a delay line block which supports different feedback setups. Furthermore, the GUI lets the musician control the available parameters, grouped by pitch shifting, delay effect and master. HarMMMLonizer supports three additional pitched voices to build the harmony, but a global variable in the code lets the programmer change the number of voices.
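
The relation between a harmony interval and the pitch-shift ratio applied to each voice can be shown with a short Python sketch. It is written in Python rather than SuperCollider purely for illustration, and the chosen intervals and function names are hypothetical.

```python
def pitch_ratio(semitones):
    """Frequency ratio of a pitch shift in 12-tone equal temperament."""
    return 2.0 ** (semitones / 12.0)


def harmony_frequencies(input_hz, intervals):
    """Frequencies of the pitched voices for a given input frequency."""
    return [input_hz * pitch_ratio(semitones) for semitones in intervals]


# Hypothetical three-voice harmony: major third, perfect fifth and octave above A4.
intervals = [4, 7, 12]
voices = harmony_frequencies(440.0, intervals)
```

Changing the length of `intervals` plays the same role as changing the number of voices.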

Keywords: supercollider, harmonizer, delay lines, crosstalk delay feedback

Sound Analysis Synthesis & Processing Projects

  • Wave Digital Filter Modeling

    Design of a three-way crossover network in the Wave Digital (WD) domain starting from a reference analog circuit. The model is then implemented in MATLAB using the trapezoidal discretization method (bilinear transformation).
    Keywords: matlab, virtual analog
  • Leslie Speaker Emulation

    Efficient implementation of the Leslie rotary speaker as a digital audio effect.
    Keywords: matlab, digital audio effect
  • Acoustic Source Localization with Microphone Array

    Acoustic source localization using the Delay-and-Sum (DAS) beamformer and MUSIC methods to estimate the Direction of Arrival (DOA) of two audio sources sampled by a 64-microphone array.
    Keywords: sound localization, doa estimation, matlab
  • RIR Estimation with Wiener Filters

    Estimation of the Room Impulse Response (RIR) of a small reverberant environment by means of a Wiener filter. The filter is obtained using the Overlap-and-Add (OLA) algorithm.
    Keywords: wiener filter, matlab, convolution
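
The trapezoidal discretization mentioned for the Wave Digital Filter project can be illustrated on a much simpler circuit than a three-way crossover. The Python sketch below (Python is used here only for illustration, the project itself is in MATLAB) models a resistor in series with a capacitor: under the trapezoidal rule the capacitor becomes a unit delay with port resistance T / (2C). The circuit, the function name and the component values are assumptions chosen for the example.

```python
import math


def simulate_rc_lowpass(input_signal, resistance, capacitance, sample_rate):
    """Wave-digital model of a resistive voltage source driving a capacitor.

    Returns the capacitor voltage for each input sample.
    """
    # Trapezoidal rule: the capacitor is a unit delay with port resistance T / (2C).
    r_cap = 1.0 / (2.0 * sample_rate * capacitance)
    reflected = 0.0  # wave reflected by the capacitor, b[n] = a[n-1]
    output = []
    for source_voltage in input_signal:
        # Root element (source plus series resistor): solve for the port current.
        current = (source_voltage - reflected) / (resistance + r_cap)
        incident = reflected + 2.0 * r_cap * current  # wave sent into the capacitor
        output.append(0.5 * (incident + reflected))  # voltage v = (a + b) / 2
        reflected = incident  # capacitor state update
    return output


# Hypothetical values: 1 kOhm and 1 uF (time constant 1 ms) at 48 kHz, unit step input.
sample_rate = 48000
step_response = simulate_rc_lowpass([1.0] * 2000, 1000.0, 1e-6, sample_rate)
analytic_at_tau = 1.0 - math.exp(-1.0)  # ideal step response after one time constant
```

After one time constant (sample 48), `step_response` should be close to `analytic_at_tau`, and it settles at the source voltage.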

Musical Acoustics Projects

  • Design of a Recorder Flute

    Design of a recorder flute in MATLAB, dimensioning the bore, the last two finger holes, the flue channel and the instrument mouth.
    Keywords: applied acoustics, matlab
  • Design of a Piano

    Design and analysis of a piano soundboard and its bridge in COMSOL.
    Keywords: applied acoustics, comsol, matlab
  • Helmholtz Resonator Tree

    Modeling of a complex resonant system through a hierarchical structure of Helmholtz resonators.
    Keywords: applied acoustics, helmholtz resonator, matlab, simulink
  • Glass Harp

    3D and axisymmetric modeling of a wineglass for a glass harp in COMSOL with eigenfrequency analysis.
    Keywords: applied acoustics, comsol
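
For the Helmholtz Resonator Tree project above, the basic building block is a single resonator, whose resonance frequency follows the standard lumped-element formula f = c / (2π) · sqrt(S / (V · L)). The Python sketch below evaluates it; the dimensions are hypothetical, and the neck length is assumed to already include any end correction.

```python
import math


def helmholtz_frequency(neck_area, neck_length, cavity_volume, speed_of_sound=343.0):
    """Resonance frequency in Hz of a Helmholtz resonator (SI units).

    neck_length is the effective neck length, assumed to include end corrections.
    """
    return speed_of_sound / (2.0 * math.pi) * math.sqrt(
        neck_area / (cavity_volume * neck_length)
    )


# Hypothetical resonator: 1 cm neck radius, 5 cm effective neck length, 1 litre cavity.
neck_area = math.pi * 0.01 ** 2
frequency = helmholtz_frequency(neck_area, neck_length=0.05, cavity_volume=1e-3)
frequency_large_cavity = helmholtz_frequency(neck_area, neck_length=0.05, cavity_volume=4e-3)
```

Quadrupling the cavity volume halves the resonance frequency, which is one way the resonators in a hierarchy can be tuned relative to each other.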

Contact me