Introduction
A spaCy NER model for Spanish, trained on interviews from the tourism domain related to the Way of Saint James (Camino de Santiago). It recognizes four entity types: location (LOC), organization (ORG), person (PER), and miscellaneous (MISC). It was fine-tuned from PlanTL-GOB-ES/roberta-base-bne.
| Feature | Description |
|---|---|
| Name | bne-spacy-corgale-ner-es |
| Version | 0.0.2 |
| spaCy | >=3.5.2,<3.6.0 |
| Default Pipeline | transformer, ner |
| Components | transformer, ner |
Label Scheme
4 labels for 1 component:
| Component | Labels |
|---|---|
| ner | LOC, MISC, ORG, PER |
Usage
You can use this model with the spaCy pipeline for NER.
import spacy
from spacy.pipeline import merge_entities

# Load the packaged pipeline (transformer + ner) and add sentence segmentation
nlp = spacy.load("bne-spacy-corgale-ner-es")
nlp.add_pipe("sentencizer")

example = "Fue antes de llegar a Sigüeiro, en el Camino de Santiago. Si te metes en el Franco desde la Alameda, vas hacia la Catedral. Y allí precisamente es Santiago el patrón del pueblo."
doc = nlp(example)

# Entity spans found in the text
print(doc.ents)

# Merge each multi-token entity into a single token, then print every token with its label
for token in merge_entities(doc):
    print(token.text, token.ent_type_)
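For downstream use it is often handier to have the recognized entities grouped by label than printed token by token. The following is a minimal sketch of one way to do that; `group_entities` is a hypothetical helper, not part of spaCy or of this model, and it relies only on the `ents`, `text`, and `label_` attributes that spaCy `Doc` and `Span` objects expose.

```python
from collections import defaultdict


def group_entities(doc):
    """Group entity texts by label, e.g. {"LOC": [...], "PER": [...]}.

    `doc` is expected to behave like a spaCy Doc: it has an `ents`
    sequence whose items carry `text` and `label_` attributes.
    This helper is illustrative only and is not part of spaCy.
    """
    grouped = defaultdict(list)
    for ent in doc.ents:
        grouped[ent.label_].append(ent.text)
    # Return a plain dict so a missing label raises KeyError instead of
    # silently creating an empty list.
    return dict(grouped)
```

With the pipeline loaded as in the example above, `group_entities(nlp(example))` would return a dictionary keyed by the labels the model found in the text (a subset of LOC, MISC, ORG, PER).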
Dataset
ToDo
Model performance
| entity | precision | recall | f1 |
|---|---|---|---|
| LOC | 0.985 | 0.987 | 0.986 |
| MISC | 0.862 | 0.865 | 0.863 |
| ORG | 0.938 | 0.779 | 0.851 |
| PER | 0.921 | 0.941 | 0.931 |
| micro avg | 0.971 | 0.972 | 0.971 |
| macro avg | 0.926 | 0.893 | 0.908 |
| weighted avg | 0.971 | 0.972 | 0.971 |
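As a sanity check on the table, the F1 column can be recomputed from precision and recall, since F1 is their harmonic mean. The sketch below assumes the macro average is the unweighted mean of the per-entity scores (the usual definition). The micro and weighted averages are not recomputed because they need per-entity support counts, which the table does not give. Inputs are the rounded values from the table, so results may differ from the reported ones in the last digit.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Per-entity (precision, recall), copied from the table above
scores = {
    "LOC": (0.985, 0.987),
    "MISC": (0.862, 0.865),
    "ORG": (0.938, 0.779),
    "PER": (0.921, 0.941),
}

per_entity_f1 = {label: f1_score(p, r) for label, (p, r) in scores.items()}

# Macro averages: unweighted mean over the four entity types (assumption)
macro_recall = sum(r for _, r in scores.values()) / len(scores)
macro_f1 = sum(per_entity_f1.values()) / len(per_entity_f1)

for label, value in per_entity_f1.items():
    print(f"{label}: F1 = {value:.3f}")
print(f"macro recall = {macro_recall:.3f}, macro F1 = {macro_f1:.3f}")
```

The gap between precision and recall for ORG (0.938 vs. 0.779) is what pulls its F1, and with it the macro average, below the micro average.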
Evaluation results
- NER Precision (self-reported): 0.972
- NER Recall (self-reported): 0.973
- NER F Score (self-reported): 0.973