Papers
arXiv:2509.15808

From Independence to Interaction: Speaker-Aware Simulation of Multi-Speaker Conversational Timing

Published on Sep 19
Authors:

Abstract

Speaker-aware simulation captures temporal consistency and realistic turn-taking dynamics in multi-speaker conversations better than baseline methods.

AI-generated summary

We present a speaker-aware approach for simulating multi-speaker conversations that captures temporal consistency and realistic turn-taking dynamics. Prior work typically models aggregate conversational statistics under an independence assumption across speakers and turns. In contrast, our method uses speaker-specific deviation distributions enforcing intra-speaker temporal consistency, while a Markov chain governs turn-taking and a fixed room impulse response preserves spatial realism. We also unify pauses and overlaps into a single gap distribution, modeled with kernel density estimation for smooth continuity. Evaluation on Switchboard using intrinsic metrics - global gap statistics, correlations between consecutive gaps, copula-based higher-order dependencies, turn-taking entropy, and gap survival functions - shows that speaker-aware simulation better aligns with real conversational patterns than the baseline method, capturing fine-grained temporal dependencies and realistic speaker alternation, while revealing open challenges in modeling long-range conversational structure.

Community

Sign up or log in to comment

Models citing this paper 0

No model linking this paper

Cite arxiv.org/abs/2509.15808 in a model README.md to link it from this page.

Datasets citing this paper 2

Spaces citing this paper 0

No Space linking this paper

Cite arxiv.org/abs/2509.15808 in a Space README.md to link it from this page.

Collections including this paper 0

No Collection including this paper

Add this paper to a collection to link it from this page.