JaCappella Corpus: A Japanese a Cappella Vocal Ensemble Corpus (arXiv:2211.16028)
This model was trained by Tomohiko Nakamura using the codebase.
It was trained on the vocal ensemble separation task of the jaCappella dataset.
The paper was published in ICASSP 2023 (arXiv).
See the jaCappella dataset page.
data:
  num_workers: 12
  sample_rate: 48000
  samples_per_track: 13
  seed: 42
  seq_dur: 5.046
  source_augmentations:
  - gain
  sources:
  - vocal_percussion
  - bass
  - alto
  - tenor
  - soprano
  - lead_vocal
filterbank:
  kernel_size: 32
  n_filters: 64
  stride: 16
masknet:
  bidirectional: true
  chunk_size: 174
  dropout: 0
  ff_activation: relu
  ff_hid: 256
  hop_size: 128
  in_chan: 64
  mask_act: sigmoid
  n_repeats: 8
  n_src: 6
  norm_type: gLN
  out_chan: 64
optim:
  lr: 0.005
  optimizer: adam
  weight_decay: 1.0e-05
training:
  batch_size: 1
  early_stop: true
  epochs: 600
  gradient_clipping: 5
  half_lr: true
  loss_func: pit_sisdr
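To make the configuration values more concrete, the following minimal sketch derives the training segment length in samples and an approximate number of encoder frames per segment. The helper names `segment_samples` and `encoder_frames` are hypothetical, and the frame count assumes an unpadded 1-D convolution, which may differ from what the codebase actually does.

```python
# Values copied from the configuration above.
SAMPLE_RATE = 48000  # data.sample_rate, in Hz
SEQ_DUR = 5.046      # data.seq_dur, in seconds
KERNEL_SIZE = 32     # filterbank.kernel_size, in samples
STRIDE = 16          # filterbank.stride, in samples


def segment_samples(seq_dur: float, sample_rate: int) -> int:
    """Length of one training segment in samples (hypothetical helper)."""
    return round(seq_dur * sample_rate)


def encoder_frames(n_samples: int, kernel_size: int, stride: int) -> int:
    """Frame count of a strided 1-D convolution.

    Assumption: no padding is applied; the real encoder may pad its input.
    """
    return (n_samples - kernel_size) // stride + 1


n_samples = segment_samples(SEQ_DUR, SAMPLE_RATE)
n_frames = encoder_frames(n_samples, KERNEL_SIZE, STRIDE)
kernel_ms = 1000 * KERNEL_SIZE / SAMPLE_RATE  # encoder window length in ms

print(n_samples, n_frames, round(kernel_ms, 3))
```

Under these assumptions, each 5.046 s segment holds 242208 samples and yields 15137 encoder frames, with each filter spanning roughly 0.67 ms of audio.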
| Method | Lead vocal | Soprano | Alto | Tenor | Bass | Vocal percussion |
|---|---|---|---|---|---|---|
| DPTNet | 8.9 | 8.5 | 11.9 | 14.9 | 19.7 | 21.9 |