Demo page for listening to rendered violin samples.
Expressive music synthesis for continuously excited instruments poses a fundamentally different problem from synthesis for percussive instruments like the piano, as realistic rendering depends on both note-level playing techniques and continuously varying dynamics. In music production, high-quality violin renderings are typically achieved with large sample libraries and labor-intensive MIDI programming of technique specification and continuous controllers. Bridging symbolic programming and realistic violin rendering therefore requires datasets and generative models that represent these controls explicitly. We present VIOLET, a latent-diffusion framework for controllable violin synthesis, together with CSV-TD, a new 48 kHz synthetic dataset of violin audio aligned with MIDI notes, note-level technique labels, and continuous dynamics control curves. VIOLET operates in a compressed audio latent space using a DiT backbone trained with rectified flow, and synthesizes high-fidelity audio conditioned on notes, techniques, and continuous control signals. Objective and subjective evaluations show that the proposed method produces natural and expressive violin performances with near-perfect technique adherence and accurate pitch and timing alignment, and that its output reflects the dynamics control. The system outperforms prior violin synthesis approaches and achieves audio quality comparable to commercial virtual instruments, narrowing the gap between symbolic programming and realistic acoustic rendering.
Three excerpts from the CSV-TD test set, each rendered by five systems: a commercial Virtual Instrument (VI), VIOLET (Full), VIOLET (Synth), VIOLET (w/o Cond), and ViolinDiff. Each excerpt mixes multiple playing techniques within a single MIDI file; the technique active for each note is shown in the strip below the piano roll and highlighted in sync with playback. The dynamics curve is shown at the top.
MID_FiLD_0108
MID_FiLD_1547
MID_FiLD_4356
We present both single- and multi-technique samples drawn from our collected violin etudes evaluation set. The single-technique portion includes two excerpts for each of the six evaluated playing techniques, while the multi-technique portion features three excerpts.
Each file contains a single playing technique throughout (except for trill).
Sample 1
Sample 2
Sample 1
Sample 2
Sample 1
Sample 2
Sample 1
Sample 2
Sample 1
Sample 2
Sample 1
Sample 2
Each excerpt mixes multiple playing techniques. The sheet music score is shown for reference. Three renderings are provided: a commercial Virtual Instrument (VI), VIOLET, and ViolinDiff. The ViolinDiff system is not conditioned on technique or dynamics. Note that the MIDI notation for a harmonic note is one octave lower than the actual sounding note.
Concerto No. 1 in A Minor
Gavotte from “Mignon”
La Cinquantaine
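For readers comparing the piano roll with what they hear, the harmonic-note convention mentioned above can be sketched in a few lines. This is only an illustration, not part of the VIOLET code: the function names and the `is_harmonic` flag are hypothetical, and the frequency conversion assumes twelve-tone equal temperament with A4 = 440 Hz.

```python
def sounding_pitch(notated_midi: int, is_harmonic: bool) -> int:
    """Return the MIDI pitch that is heard for a notated MIDI pitch.

    Harmonic notes are notated one octave (12 semitones) below the
    pitch that actually sounds; other notes are unchanged.
    """
    return notated_midi + 12 if is_harmonic else notated_midi


def midi_to_hz(midi_note: int, a4_hz: float = 440.0) -> float:
    """Convert a MIDI note number to frequency (assumed tuning: A4 = 440 Hz)."""
    return a4_hz * 2.0 ** ((midi_note - 69) / 12)


# A harmonic notated as A4 (MIDI 69) sounds as A5 (MIDI 81).
notated = 69
heard = sounding_pitch(notated, is_harmonic=True)
print(heard, midi_to_hz(heard))
```

An ordinary (non-harmonic) note passes through unchanged, so the same helper can be applied to every note in an excerpt.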