arXiv:2510.22603
Umberto Cappellazzo
hisoka94
AI & ML interests
Multimodal Large Language Models and audio-visual speech processing at Imperial College London.
Recent Activity
authored a paper 11 days ago
Mitigating Attention Sinks and Massive Activations in Audio-Visual Speech Recognition with LLMs
upvoted a paper 11 days ago
Concerto: Joint 2D-3D Self-Supervised Learning Emerges Spatial Representations
Organizations
None yet