Matteo Pettenò's Portfolio

About me

I am an engineer with a strong foundation in information theory from my Bachelor’s at the University of Padova and a deep passion for audio, which led me to a Master’s in Music and Acoustic Engineering at Politecnico di Milano. I gained expertise in deep learning, signal processing, and electronics, and worked professionally as a full-stack developer, software architect, and DevOps engineer. In September 2025, I will begin a Marie Skłodowska-Curie PhD focused on Privacy for Smart Speech Technology, jointly hosted by EURECOM and Ruhr-Universität Bochum. During my PhD, I will be actively seeking research or industrial internship opportunities in related fields. Feel free to review my resume or my full CV for more details about my background and experience.

Research Activity

Conditional Diffusion as Latent Constraints for Unconditional Symbolic Music Generation Models

Code

Dataset

We explore the application of denoising diffusion processes as plug-and-play latent constraints for unconditional symbolic music generation models. Recent advances in latent diffusion models have demonstrated state-of-the-art performance in high-dimensional time-series data synthesis while providing flexible control through conditioning and guidance. However, existing methodologies primarily rely on musical context or natural language as the main modality of interacting with the generative process, which may not be ideal for expert users seeking precise fader-like manipulation of specific musical attributes. In this work, we focus on a framework leveraging a library of small conditional diffusion models operating as implicit probabilistic priors on the latents of a frozen unconditional backbone. While previous studies have explored domain-specific use cases, this work, to the best of our knowledge, is the first to demonstrate the versatility of such an approach across a diverse array of musical attributes, such as note density, pitch range, contour, and rhythm complexity. Our experiments show that diffusion-driven constraints outperform traditional attribute regularization and other latent constraints architectures, achieving significantly stronger correlations between target and generated attributes while maintaining high perceptual quality and diversity.

Keywords: symbolic music, attribute-controlled generation, diffusion models, latent constraints

On the Joint Minimization of Regularization Loss Functions in Deep Variational Bayesian Methods for Attribute-Controlled Symbolic Music Generation

Code

Dataset

This deep learning research project builds on the variational information bottleneck (VIB) framework, a generalization of the well-known VAE. It uses a supervised multi-objective learning approach that encodes a chosen attribute of the input in a designated dimension of the latent space by introducing a regularization term. This offers precise control over a specific attribute of the generated content through direct manipulation of latent variables, an ability that prompt-based generative models often lack. The novel contribution of this research is an invertible transformation of the univariate distribution of the attribute to encode. The goal is to bring this distribution as close as possible to the prior distribution chosen for the latent space; moreover, because the transformation is invertible, the encoded property can be mapped back to its original domain, which improves model interpretability. The method has been applied in the symbolic music domain, specifically to the task of generating 4-bar melodies, and the dataset created to train the models is publicly available on Zenodo (link above).
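
To make the idea of an invertible attribute transformation concrete, here is a minimal sketch using the Box-Cox power transform. Box-Cox is only one member of the power-transform family and is assumed here for illustration; the attribute values and the exponent `lam` are hypothetical and not taken from the project.

```python
import math


def box_cox(x, lam):
    """Box-Cox power transform of a strictly positive value x."""
    if x <= 0:
        raise ValueError("Box-Cox requires x > 0")
    if lam == 0:
        return math.log(x)
    return (x ** lam - 1.0) / lam


def inverse_box_cox(y, lam):
    """Map a transformed value back to the original attribute domain."""
    if lam == 0:
        return math.exp(y)
    return (lam * y + 1.0) ** (1.0 / lam)


# Hypothetical attribute values (e.g. a per-melody note count); not project data.
attribute_values = [2.0, 3.0, 5.0, 8.0, 13.0, 21.0]
lam = 0.5  # illustrative exponent; in practice it would be fitted to the data

# Forward transform: reshapes the attribute distribution before regularization.
transformed = [box_cox(x, lam) for x in attribute_values]
# Inverse transform: recovers the attribute in its original, interpretable units.
recovered = [inverse_box_cox(y, lam) for y in transformed]
```

Because the mapping is invertible, a value read from the regularized latent dimension can be converted back into the original units of the attribute.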

Keywords: symbolic music, attribute-controlled generation, latent space regularization, power transforms

Resolv - Machine Learning Development for Research

Code

Resolv is a system designed to streamline machine learning and deep learning workflows for research purposes. It provides a comprehensive platform that facilitates model development, data processing, and infrastructure management. The system is organized into multiple components, each handling a specific part of the machine learning workflow. The goal is to create a flexible, scalable and reusable system that supports model experimentation, improves data processing through well-structured pipelines, and automates orchestration using modern tools like Keras, Apache Beam and Apache Airflow. This modular organization gives a clear separation of tasks, making it easier to experiment with new ideas and techniques.

Keywords: deep learning, keras, apache-beam, apache-airflow

Do Unconditional Deep Generative Models Spontaneously Learn How to Encode Human-Interpretable Musical Attributes?

Code

Article

This study investigates whether correlations exist between the topological structure of the latent space and high-level features of the output. The project provides a starting point for the systematic sampling of the pre-trained MusicVAE model, a β-VAE by Magenta. It offers ready-to-use tools for analyzing the 2-bar and 16-bar pre-trained model configurations, and it employs Latin Hypercube Sampling to perform latent space regularization, which establishes an explicit link between the output’s characteristics and the locations of the samples within the embedding. The available tools automate sampling, output feature calculation and evaluation. Any correlations found could be relevant for future work on refining post hoc conditioning in generative models.
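
As a rough illustration of Latin Hypercube Sampling, the sketch below draws stratified points in the unit cube and maps them to latent codes. The mapping through the Gaussian inverse CDF assumes a standard normal latent prior, and the function names are hypothetical; this is not the project's actual tooling.

```python
import random
from statistics import NormalDist


def latin_hypercube(n_samples, n_dims, rng):
    """Return n_samples points in [0, 1)^n_dims, one per stratum along each axis."""
    columns = []
    for _ in range(n_dims):
        # One uniformly drawn point inside each of the n_samples equal-width strata.
        strata = [(i + rng.random()) / n_samples for i in range(n_samples)]
        rng.shuffle(strata)  # decouple the stratum order across dimensions
        columns.append(strata)
    # Transpose the per-dimension columns into a list of points.
    return [list(point) for point in zip(*columns)]


def to_standard_normal(points):
    """Map unit-cube points to latent codes (assumes a standard normal prior)."""
    inv_cdf = NormalDist().inv_cdf
    eps = 1e-12  # keep probabilities strictly inside (0, 1) for inv_cdf
    return [[inv_cdf(min(max(u, eps), 1.0 - eps)) for u in p] for p in points]


rng = random.Random(0)  # fixed seed so the sampling is reproducible
points = latin_hypercube(8, 2, rng)
latents = to_standard_normal(points)
```

Compared with plain random sampling, every axis of the latent space is covered evenly: each of the `n_samples` strata contains exactly one sample per dimension.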

Keywords: variational autoencoders, latent space topological structure

Fonti 4.0 - Evaluation of the performance of commercial STT and NER services

Code

Article

The FONTI 4.0 project aims to explore the suitability of automatic transcription and information extraction technologies for making historical oral sources available. In this work, we conducted an experiment to test the performance of two commercial speech-to-text services, Google Cloud Speech-to-Text and Amazon Transcribe, on digitized oral sources. We created an eight-hour corpus of manually transcribed and annotated historical speech recordings in TEI format. The results clearly show how audio quality and disturbing elements (e.g., overlaps and foreign words) affect automatic transcription, and indicate what needs to be improved to implement an unsupervised transcription chain.

Keywords: speech-to-text, named-entity-recognition, gcp, aws

Professional Experience

I bring years of experience in software development, architecture, and DevOps, primarily through consultancy roles within the Arsenalia Group, which includes the companies Alpenite and ccelera.

My work has focused on full-stack development, especially on the SAP Hybris Commerce platform, where I delivered B2B and CRM solutions, integrated ticketing and notification systems, and developed mobile capabilities with PWA frameworks. I also took on responsibilities in software architecture, collaborating with international teams and working in London to implement logistics and retail integrations through MuleSoft middleware, using AMQP and REST/SOAP APIs. In my role at ccelera, I continued my full-stack development on SAP Hybris Commerce, working closely with high-profile clients to deliver tailored e-commerce solutions.

Additionally, at Walit, I expanded my expertise in cloud infrastructure by designing and maintaining Google Cloud environments for deep learning projects, setting up CI/CD pipelines in GitLab, and ensuring application security with OWASP ZAP.

Main clients: Kering Eyewear, Stella McCartney, Bonfiglioli, Cellularline, PegPerego

Personal Projects

Advent of Code

Page

Advent of Code is an annual event that takes place every December, where participants solve a series of programming puzzles. Each puzzle is released daily, starting on December 1st and continuing through December 25th. The puzzles cover a wide range of topics, from algorithms and data structures to mathematical problems and optimization challenges.

Each year, I will choose a programming language either randomly or deliberately if I wish to learn or explore a specific language. I will follow two rules: no external help and no external libraries.

Every day, I will attempt to solve the proposed challenge and publish a brief description of the solution on this website.

Additionally, I have implemented a web server that can be called from each day’s page: this server allows you to upload the input file for the current day’s challenge and returns the solution.

Keywords: python, flask, render, coding-challenge

Around the Job

Code

Web App

Around the Job (hello Daft Punk) is a web app I built to assist with my job search. Over the past few years, I collected data about companies I was interested in and decided to display it on an interactive map for easy navigation. This project reflects my belief that we should first choose a place where we’d love to live and then look for a job we’re passionate about in that area. The frontend is implemented using Vue.js and the map is handled with Leaflet, while the database is managed with Firebase.

Keywords: vue.js, firebase, firestore, leaflet

Creative Projects

Ego

Code

Ego is a project that explores human perception, in particular the idea of identity and self-consciousness and the way it is distorted and biased. The project has been implemented as a web app to make the experience available to everyone, but it can also be imagined as an artistic installation. The user’s first encounter is with a cloud of swirling particles, while an undefined drone sound plays. Once the user is close enough to be detected by the system, the cloud starts to slowly morph into a shape resembling their face: the user is now witnessing their identity take shape. A repeating melody is then heard, generated from the user’s face, in particular from their somatic traits and current mood. The intrigued user might now get even closer, in which case the music becomes more and more intense, higher in pitch and faster, representing an ever-growing feedback loop of self-consciousness. In the final phase of Ego, the user sees their face morph into a new shape, reminiscent of a Rorschach test, expressing the idea of the self being distorted by the filter of external points of view.

Keywords: three.js, glsl, svelte, mediapipe, max4live, tone.js

Pulseq

Code

Pulseq is a fractal sequencer implemented as a single-page application (SPA) and inspired by the Eurorack module Bloom by Qu-Bit Electronix. It uses a user-constructed sequence to recursively generate a tree of related subsequences that are strung together to create large musical sequences that relate back to the originally programmed sequence. First, the user programs the main sequence, which from then on is called the trunk; then the branches and path knobs can be used to control the tree generation. Besides these knobs, the user can play the sequences with eight different sounds and apply three preset effects: a reverb, a ping-pong delay and a distortion.
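
The recursive tree generation can be sketched roughly as follows. The mutation rule (rotate and transpose) is purely hypothetical, as are all function names; it is written in Python for illustration and does not reproduce the rules used by Pulseq or Bloom.

```python
def mutate(sequence, branch_index):
    """Hypothetical mutation: rotate the notes, then transpose by the branch index."""
    if not sequence:
        return []
    shift = (branch_index + 1) % len(sequence)
    rotated = sequence[shift:] + sequence[:shift]
    return [note + branch_index + 1 for note in rotated]


def build_tree(trunk, branches, depth):
    """Recursively derive `branches` child sequences per node, `depth` levels deep."""
    node = {"sequence": list(trunk), "children": []}
    if depth > 0:
        for b in range(branches):
            node["children"].append(build_tree(mutate(trunk, b), branches, depth - 1))
    return node


def follow_path(tree, path):
    """String together the sequences met while walking `path` (child indices)."""
    out = list(tree["sequence"])
    node = tree
    for child_index in path:
        node = node["children"][child_index]
        out.extend(node["sequence"])
    return out


trunk = [60, 62, 64, 67]  # MIDI note numbers programmed by the user
tree = build_tree(trunk, branches=2, depth=2)
melody = follow_path(tree, [1, 0])  # second branch, then its first child
```

In this sketch, `branches` and `depth` shape the tree while the path selects which subsequences are chained after the trunk, loosely mirroring the role of the branches and path knobs.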

Keywords: svelte, tone.js, glsl

Computer Music Projects

OranJam

Code

OranJam is an audio plugin that implements a polyphonic subtractive synthesizer. The software is implemented in C++ with the aid of JUCE. The DSP chain includes several functional blocks: an oscillator bank, a white noise generator, an ADSR envelope, a filter bank and an LFO. A GUI lets the user control the available parameters, such as the waveform of each oscillator and the cutoff frequency and resonance of each filter.

Keywords: juce, c++, cmake

HarMMMLonizer

Code

HarMMMLonizer is a real-time harmonizer implemented in SuperCollider. The software implements a DSP system featuring mono input and stereo output. The DSP chain includes a delay line block which supports different feedback setups. Furthermore, the GUI lets the musician control the available parameters, each related to pitch shifting, the delay effect or the master section. By default, HarMMMLonizer adds three pitched voices to build the harmony, and a global variable in the code lets the programmer change the number of voices.
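
A delay line with feedback can be sketched as below. This is a plain mono feedback delay written in Python for illustration; it is an assumed, simplified variant and not the SuperCollider implementation or its specific feedback setups.

```python
def feedback_delay(signal, delay_samples, feedback, mix):
    """Mono delay line whose delayed output is fed back into its input.

    feedback: gain applied to the delayed sample before it re-enters the line.
    mix: 0.0 returns the dry signal only, 1.0 the delayed signal only.
    """
    if delay_samples < 1:
        raise ValueError("delay_samples must be at least 1")
    buffer = [0.0] * delay_samples  # circular buffer holding the delay line
    write = 0
    out = []
    for x in signal:
        delayed = buffer[write]  # sample written delay_samples steps ago
        buffer[write] = x + feedback * delayed  # input plus feedback path
        write = (write + 1) % delay_samples
        out.append((1.0 - mix) * x + mix * delayed)
    return out


# Impulse response of the delay: echoes every 2 samples, each one halved.
impulse = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
echoes = feedback_delay(impulse, delay_samples=2, feedback=0.5, mix=1.0)
```

Keeping `feedback` below 1 in magnitude makes the echoes decay; values at or above 1 would make the loop unstable.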

Keywords: supercollider, harmonizer, delay lines, crosstalk delay feedback

Sound Analysis Synthesis & Processing Projects

Wave Digital Filter Modeling

Report

Design of a three-way crossover network in the Wave Digital (WD) domain starting from a reference analog circuit. The model is then implemented in MATLAB using the trapezoidal discretization method (bilinear transformation).

Keywords: matlab, virtual analog

Leslie Speaker Emulation

Report

Efficient implementation of the Leslie rotary speaker as a digital audio effect.

Keywords: matlab, digital audio effect

Acoustic Source Localization with Microphone Array

Report

Acoustic source localization using the Delay-And-Sum (DAS) beamformer and the MUSIC method to estimate the Direction Of Arrival (DOA) of two audio sources sampled by a 64-microphone array.

Keywords: sound localization, doa estimation, matlab

RIR Estimation with Wiener Filters

Report

Estimation of the Room Impulse Response (RIR) of a small reverberant environment by means of a Wiener filter. The filter is obtained using the Overlap-and-Add (OLA) algorithm.

Keywords: wiener filter, matlab, convolution
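
Overlap-and-Add is a block-wise convolution scheme. The sketch below illustrates only that step, in Python rather than MATLAB, and uses direct convolution per block for clarity where a practical implementation would typically use FFTs. How OLA fits into the Wiener estimation is not detailed in the report summary, and the signal and filter values here are hypothetical.

```python
def convolve(x, h):
    """Direct linear convolution of two finite sequences."""
    if not x or not h:
        return []
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y


def overlap_add(signal, h, block_size):
    """Filter `signal` with the FIR `h` block by block, summing overlapping tails."""
    if not signal or not h:
        return []
    out = [0.0] * (len(signal) + len(h) - 1)
    for start in range(0, len(signal), block_size):
        block = signal[start:start + block_size]
        # Each filtered block is len(h) - 1 samples longer than the block itself,
        # so its tail overlaps the next block and is added to it.
        for k, value in enumerate(convolve(block, h)):
            out[start + k] += value
    return out


signal = [1.0, 2.0, 3.0, 4.0, 5.0]  # hypothetical input samples
h = [1.0, -1.0, 0.5]                # hypothetical short FIR filter
filtered = overlap_add(signal, h, block_size=2)
```

The block-wise result matches a single full-length convolution of the same signal and filter, which is what makes OLA suitable for processing long recordings in chunks.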

Musical Acoustics Projects

Design of a Recorder Flute

Report

Design of a recorder flute in MATLAB, dimensioning the bore, the last two finger holes, the flue channel and the instrument mouth.

Keywords: applied acoustics, matlab

Design of a Piano

Report

Design and analysis of a piano soundboard and its bridge in COMSOL.

Keywords: applied acoustics, comsol, matlab

Helmholtz Resonator Tree

Report

Modeling of a complex resonant system through a hierarchical structure of Helmholtz resonators.

Keywords: applied acoustics, helmholtz resonator, matlab, simulink

Glass Harp

Report

3D and axisymmetric modeling of a wineglass for a glass harp in COMSOL, with eigenfrequency analysis.

Keywords: applied acoustics, comsol
