UniCalli: A Unified Diffusion Framework for Column-Level Generation and Recognition of Chinese Calligraphy
Abstract
UniCalli, a unified diffusion framework, achieves state-of-the-art performance in both column-level recognition and generation of Chinese calligraphy, addressing challenges in maintaining character structure and page-level aesthetics.
Computational replication of Chinese calligraphy remains challenging. Existing methods falter, either creating high-quality isolated characters while ignoring page-level aesthetics like ligatures and spacing, or attempting page synthesis at the expense of calligraphic correctness. We introduce UniCalli, a unified diffusion framework for column-level recognition and generation. Training both tasks jointly is deliberate: recognition constrains the generator to preserve character structure, while generation provides style and layout priors. This synergy fosters concept-level abstractions that improve both tasks, especially in limited-data regimes. We curated a dataset of over 8,000 digitized pieces, with ~4,000 densely annotated. UniCalli employs asymmetric noising and a rasterized box map for spatial priors, trained on a mix of synthetic, labeled, and unlabeled data. The model achieves state-of-the-art generative quality with superior ligature continuity and layout fidelity, alongside stronger recognition. The framework successfully extends to other ancient scripts, including Oracle bone inscriptions and Egyptian hieroglyphs. Code and data can be viewed in https://github.com/EnVision-Research/UniCalli{this URL}.
Models citing this paper 1
Datasets citing this paper 0
No dataset linking this paper
Spaces citing this paper 0
No Space linking this paper
Collections including this paper 0
No Collection including this paper