Multispeaker Community Vocoder model for DiffSinger
Trained with ~95 hours of varied singing data.
The goal of our vocoder model is to widen the range of audio quality that can be drawn out of DiffSinger acoustic models. The vocoder can be used with any voice.
We would like to send huge thanks to the Western DiffSinger community for providing datasets for training! Without them, this model wouldn’t be possible.
The code used to train the vocoder is available in the HiFiPLN GitHub repository.
A pretrained checkpoint for finetuning is available at the bottom of this page.
- Model developed and trained by Scarfmonster
- Data coordination by PixPrucer
How To Use
- Download your vocoder of choice
- Drag and drop the downloaded .oudep file onto the OpenUtau window.
- Edit the dsconfig.yaml configuration file of your chosen voice and set vocoder: to the proper value. For your convenience, the setting is listed in the table below. Save the file, and restart OpenUtau if you had it open.
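The configuration step can also be scripted. The following is a minimal sketch, assuming the vocoder: entry sits on its own line at the top level of dsconfig.yaml; the helper name set_vocoder and the sample keys in the example text are hypothetical and only illustrate the idea. It edits the text directly instead of parsing YAML so that comments and key order in the file stay untouched.

```python
import re


def set_vocoder(config_text: str, vocoder_name: str) -> str:
    """Return config_text with its top-level 'vocoder:' entry set to vocoder_name.

    Hypothetical helper: assumes the entry starts at the beginning of a line.
    """
    new_line = f"vocoder: {vocoder_name}"
    pattern = re.compile(r"^vocoder:.*$", flags=re.MULTILINE)
    if pattern.search(config_text):
        # Replace the existing entry and leave every other line as it was.
        return pattern.sub(lambda _match: new_line, config_text, count=1)
    # No entry yet: append one, making sure the previous line is terminated.
    if config_text and not config_text.endswith("\n"):
        config_text += "\n"
    return config_text + new_line + "\n"


# Placeholder config text; a real dsconfig.yaml contains other keys.
example = "some_other_key: value\nvocoder: old_vocoder\n"
print(set_vocoder(example, "hifipln_1.1"))
```

To apply it to a real voice, read the file's text, pass it through the function with the value from the dsconfig.yaml column of the table, and write the result back. Keeping a backup copy of the original file first is a sensible precaution.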
Download
Name | Download | Version | dsconfig.yaml | Notes |
---|---|---|---|---|
hifipln | 89.1 MB | 1.1 (2024-02-17) | vocoder: hifipln_1.1 | sample_rate: 44100, n_mels: 128, hop_length: 512 |
ddsppln | 23.2 MB | 1.0 (2024-01-29) | vocoder: ddsppln_1.0 | A DDSP-like vocoder. Not realistic, somewhat robotic, but some people like the sound. sample_rate: 44100, n_mels: 128, hop_length: 512 |
hifipln | 89.1 MB | 1.0 (2024-02-03) | vocoder: hifipln_1.0 | sample_rate: 44100, n_mels: 128, hop_length: 512 |
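The sample_rate and hop_length values in the Notes column fix the time resolution of the mel-spectrograms the vocoders work with. As a small worked example (the function names are illustrative, not part of any tool mentioned here), the arithmetic looks like this:

```python
SAMPLE_RATE = 44100  # audio samples per second, from the table
HOP_LENGTH = 512     # audio samples between consecutive mel frames, from the table


def hop_duration_ms(sample_rate: int, hop_length: int) -> float:
    """Time covered by one hop, in milliseconds."""
    return 1000.0 * hop_length / sample_rate


def frames_per_second(sample_rate: int, hop_length: int) -> float:
    """Number of mel frames produced for each second of audio."""
    return sample_rate / hop_length


print(f"hop duration: {hop_duration_ms(SAMPLE_RATE, HOP_LENGTH):.2f} ms")
print(f"frame rate:   {frames_per_second(SAMPLE_RATE, HOP_LENGTH):.2f} frames/s")
```

With these settings each frame advances by roughly 11.6 ms of audio, which is about 86 frames per second.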
Singing databases used for training the model
Name | Length | Languages | Contributor |
---|---|---|---|
AdoVoc Pro | 00:05:28 | Caló, Spanish | AdoVoc Pro |
A.I.chi | 01:47:52 | English, Japanese | Peeslubn |
Aida | 04:00:48 | EN, JA, DE, FR | Violin |
Albert | 01:17:14 | Polish | SzTJ |
Aleks | 00:06:10 | Polish | SzTJ |
Ameko Kero | 02:54:31 | English, Japanese | HoodyPisDed |
Ariel | 01:09:01 | Japanese | ariika |
Brent | 00:05:15 | Spanish | Beatrix |
Cantoria Dataset | 02:25:08 | Spanish | Cantoria Dataset |
Codie | 01:00:37 | Japanese | code41den |
Deshi | 01:47:45 | Japanese, Tagalog | UtaUtaUtau |
Esmuc Choir Dataset | 00:21:31 | German | Esmuc Choir Dataset |
Evelyn | 01:05:32 | English | Violin |
Filip | 01:17:49 | Polish | Rainygardens |
Geppei | 00:30:15 | Japanese, Polish, Ukrainian | vahntanabe |
Hania | 00:05:32 | Polish | SzTJ |
Hisaki | 02:42:57 | Japanese | ryutsu |
Inka | 00:39:18 | English, Japanese | postTEENIDOL |
Jalo | 00:54:53 | Polish | SzTJ |
Karasu | 00:49:55 | Japanese | rev |
Kazuo | 00:33:40 | Japanese | Felipe Souza |
Kiiro | 01:44:54 | English, Japanese | Ryouichi |
Konryuu | 01:10:55 | Japanese | PixPrucer |
Kurenai | 00:55:26 | Japanese | liure |
Leif | 01:28:40 | English, Japanese | Tigermeat |
Lem | 00:14:22 | Polish | Wik |
Liee | 00:25:01 | JA, EN, Latin | julieraptor |
Makam Acapella | 00:38:53 | Turkish | Makam Acapella |
Makku | 02:06:58 | JA, EN, ES, IT | Gianloop |
Mat | 00:35:44 | Polish | hq_png |
Matsuki Max | 01:25:32 | Japanese | Haraoo |
Mava | 01:46:33 | English, Japanese | Enzo |
Mora | 01:49:03 | English, Japanese | funhouse |
Namine Criss | 00:31:02 | Spanish, Japanese | CrissZ3R0VZ |
Nanabot | 00:29:23 | English | postTEENIDOL |
Naoky | 03:31:55 | EN, JA, KO, ZH | xuu |
—— | 03:55:02 | —— | Anonymous Contributor |
Paulina | 00:29:34 | Polish | SzTJ |
Peiton | 02:31:09 | English | NebulaMeadow |
PIX | 04:10:54 | Polish, Japanese | PixPrucer |
Otozora Rinly | 02:49:43 | Japanese | UniverStars |
Ron | 02:28:10 | EN, JA, PL, KO, ZH | Galanist |
Rose | 00:42:39 | Polish, Japanese | Kisa |
Ryszard | 02:24:16 | Polish, Japanese | Scarfmonster |
—— | 01:50:00 | —— | Anonymous Contributor |
Singing Database | 02:46:46 | Chinese, Italian | Singing Database |
Ace | 02:50:26 | English, Japanese | SpoopyAce |
Stefan | 02:49:07 | Polish, Latin | SzTJ |
Suzu | 01:42:03 | Japanese | ariika |
Taylor | 01:09:24 | English | postTEENIDOL |
Teo Vampa | 01:56:33 | Japanese | Delphic |
Tetsu | 01:13:46 | Japanese | ariika |
Tiger | 03:31:27 | EN, ES, JA, KO, ZH, PT, FR | Tigermeat |
Tomo | 00:57:00 | Spanish, Japanese | Tomo |
Vocadito | 00:13:37 | EN, FR, HAW, ES, TL, Valencian | Vocadito |
VocalSet | 08:46:18 | Vocalise | VocalSet |
Wanda | 01:11:03 | Polish | Vieri |
Wioletta | 00:32:56 | Polish | SzTJ |
Zethiel Yu | 02:19:19 | English | xiel exalt |
Zethiel Zero | 00:32:07 | English, Japanese | xiel exalt |
Total length | 98:28:51 | | |
Used length | 82:06:15 | | |
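To put the two totals in proportion, the HH:MM:SS values can be converted to seconds and compared. This is only a sketch of the arithmetic on the figures listed above; the helper names are illustrative, and it says nothing about how the used portion was selected.

```python
def to_seconds(hms: str) -> int:
    """Convert an 'HH:MM:SS' string (hours may exceed 24) to seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def to_hms(total_seconds: int) -> str:
    """Convert a number of seconds back to an 'HH:MM:SS' string."""
    hours, rest = divmod(total_seconds, 3600)
    minutes, seconds = divmod(rest, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"


total = to_seconds("98:28:51")  # total length from the table
used = to_seconds("82:06:15")   # used length from the table

print(f"used share of the data: {used / total:.1%}")
print(f"not used: {to_hms(total - used)}")
```

The same to_seconds helper can be applied to the Length column to total the individual databases.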
Pitch distribution
[Figures: pitch distribution of the dataset, and pitch distribution after augmentation]
Checkpoints for finetuning
Name | Download | Version | Notes |
---|---|---|---|
hifipln | 378 MB | 1.0 (2024-02-03) | sample_rate: 44100, n_mels: 128, hop_length: 512 |