mt-dspec-health-en-cy

English-to-Welsh translation model specialised for the health and care domain, built using Marian NMT.

Installation

pip install sentencepiece transformers

Usage

import transformers

# Load the tokenizer and model from the Hugging Face Hub.
model_id = "techiaith/mt-dspec-health-en-cy"
tokenizer = transformers.AutoTokenizer.from_pretrained(model_id)
model = transformers.AutoModelForSeq2SeqLM.from_pretrained(model_id)

# Wrap both in a translation pipeline.
translate = transformers.pipeline("translation", model=model, tokenizer=tokenizer)

# The pipeline returns a list with one dict per input sentence.
result = translate("The patient has a headache.")
print(result[0]["translation_text"])
# Mae gan y claf gur pen.
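
To translate many sentences, it can help to pass them to the pipeline in small batches. The helper below is a minimal sketch, not part of this model's API: the name translate_batch and the batch_size default are hypothetical, and it assumes translate is a transformers translation pipeline like the one created above, which accepts a list of strings and returns one dict per input.

```python
from typing import Callable, Iterable, List


def translate_batch(
    translate: Callable,
    sentences: Iterable[str],
    batch_size: int = 8,  # hypothetical default; tune for your hardware
) -> List[str]:
    """Translate sentences in fixed-size chunks, preserving input order."""
    items = list(sentences)
    outputs: List[str] = []
    for start in range(0, len(items), batch_size):
        chunk = items[start:start + batch_size]
        # A translation pipeline called with a list of strings returns
        # a list of dicts, each with a "translation_text" key.
        results = translate(chunk)
        outputs.extend(r["translation_text"] for r in results)
    return outputs
```

With the pipeline from the usage example, a call would look like translate_batch(translate, ["The patient has a headache.", "Take one tablet twice a day."]). Taking the pipeline as an argument keeps the helper independent of how the model was loaded.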

Training Data

  • UK Government Legislation data
  • OPUS-cy-en corpus
  • Cofnod y Cynulliad (Welsh Assembly Records)
  • Cofion Techiaith Cymru

Evaluation

Metric     Score
SacreBLEU  54.16
CER        0.31
WER        0.47
CHRF       69.03
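
For SacreBLEU and CHRF higher is better; for CER and WER lower is better. This card does not state the test set or the tooling used to compute these scores, so the following is only a sketch of the standard definitions of WER and CER: the Levenshtein edit distance between hypothesis and reference, counted over words or characters and divided by the reference length. The function names are illustrative, and corpus-level scores are usually aggregated over all segments, so this will not necessarily reproduce the figures above.

```python
from typing import Sequence


def edit_distance(ref: Sequence, hyp: Sequence) -> int:
    """Levenshtein distance: minimum insertions, deletions, substitutions."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edits divided by reference word count."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate: character-level edits over reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)
```

Both functions take the reference translation first and the model output second; an output identical to the reference scores 0.0 on both metrics.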

Version History

2026-02-26: Re-converted with a weight-tying fix. The previous version required transformers<=4.30.2 because of issue #26271; this version works with all transformers versions.

License

Apache 2.0

Model size: 69.8M params (Safetensors, tensor type F16)