RoBERTa Korean Hanja Extended (Standard Korean Language Dictionary, 표준국어대사전)

This is the base model before MLM training. For the version with MLM fine-tuning completed, use hwp0725/roberta-korean-hanja-stdict-mlm.

This model extends the vocabulary of KoichiYasuoka/roberta-large-korean-hanja with Hanja from the Standard Korean Language Dictionary.

Model information

| Item | Value |
| --- | --- |
| Base model | KoichiYasuoka/roberta-large-korean-hanja |
| Original vocab size | 39,255 |
| Extended vocab size | 40,235 |
| Hanja added | 980 characters |
| Hanja source | Standard Korean Language Dictionary |
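The figures in the model information table are mutually consistent: the original vocabulary size plus the number of added characters gives the extended size. A minimal sketch of that check follows. The constant and function names are illustrative, and comparing against a loaded tokenizer assumes it reports the same count as the card, which has not been verified here.

```python
# Figures copied from the model information table.
ORIGINAL_VOCAB_SIZE = 39_255
ADDED_HANJA = 980
EXTENDED_VOCAB_SIZE = 40_235


def expected_extended_size(original: int, added: int) -> int:
    """Vocabulary size after appending `added` new single-character tokens."""
    return original + added


def matches_card(observed_size: int) -> bool:
    """True if an observed vocabulary size equals the size stated on the card."""
    return observed_size == EXTENDED_VOCAB_SIZE


# 39,255 + 980 = 40,235, so the table is internally consistent.
assert expected_extended_size(ORIGINAL_VOCAB_SIZE, ADDED_HANJA) == EXTENDED_VOCAB_SIZE
```

With a tokenizer loaded as in the usage section, `matches_card(len(tokenizer))` would be the corresponding runtime check.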

Related models

| Model | Description |
| --- | --- |
| hwp0725/roberta-korean-hanja-stdict | Vocab-extended base model (this model) |
| hwp0725/roberta-korean-hanja-stdict-mlm | MLM fine-tuned model |

Usage

from transformers import AutoModelForMaskedLM, AutoTokenizer, pipeline

# Load the vocabulary-extended tokenizer and the matching masked-LM weights.
tokenizer = AutoTokenizer.from_pretrained("hwp0725/roberta-korean-hanja-stdict")
model = AutoModelForMaskedLM.from_pretrained("hwp0725/roberta-korean-hanja-stdict")

# Fill-mask example
fill_mask = pipeline("fill-mask", model=model, tokenizer=tokenizer)
result = fill_mask("孔子曰：學而時習之，不亦[MASK]乎")
print(result)
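For a single masked position, the fill-mask pipeline returns a list of dictionaries with `score`, `token`, `token_str`, and `sequence` keys. The sketch below shows one way to reduce that list to the top candidates. The helper name `top_predictions` and the sample scores are made up for illustration and are not output from this model. Since the card recommends the `-mlm` variant for actual use, predictions from this base checkpoint may be unreliable for the newly added characters.

```python
from typing import Any, Dict, List, Tuple


def top_predictions(result: List[Dict[str, Any]], k: int = 3) -> List[Tuple[str, float]]:
    """Return the k highest-scoring (token_str, score) pairs from fill-mask output."""
    ranked = sorted(result, key=lambda item: item["score"], reverse=True)
    return [(item["token_str"], round(item["score"], 4)) for item in ranked[:k]]


# Placeholder data in the pipeline's output shape; the scores and token ids are
# invented for illustration and do NOT come from the model.
example_result = [
    {"score": 0.25, "token": 101, "token_str": "樂", "sequence": "(placeholder)"},
    {"score": 0.62, "token": 102, "token_str": "說", "sequence": "(placeholder)"},
    {"score": 0.05, "token": 103, "token_str": "善", "sequence": "(placeholder)"},
]

for token_str, score in top_predictions(example_result, k=2):
    print(token_str, score)
```

In real use, pass the `result` returned by `fill_mask(...)` instead of `example_result`.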

Extended Hanja

Of the Hanja extracted from the Sino-Korean headwords listed in the Standard Korean Language Dictionary, the 980 characters missing from the original model's vocabulary were added.

Examples of notable additions:

  • CJK Extension A Hanja: 㐌, 㑊, 㖠, etc.
  • Rare Hanja: characters used in some classical or specialized terminology
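The card does not say how the extension was carried out. The following is only a sketch of one plausible approach: collect the Hanja that occur in dictionary headwords, keep those absent from the existing vocabulary, then register them with the tokenizer and resize the model's embeddings. The Unicode ranges chosen, the helper names, and the sample inputs are assumptions made for illustration. `add_tokens` and `resize_token_embeddings` are the standard Transformers calls for this kind of change.

```python
from typing import Iterable, List, Set

# Unicode blocks treated as Hanja in this sketch (an assumption, not the author's list).
HANJA_RANGES = (
    (0x3400, 0x4DBF),  # CJK Unified Ideographs Extension A
    (0x4E00, 0x9FFF),  # CJK Unified Ideographs
    (0xF900, 0xFAFF),  # CJK Compatibility Ideographs
)


def is_hanja(ch: str) -> bool:
    """True if the character falls in one of the ranges above."""
    code = ord(ch)
    return any(lo <= code <= hi for lo, hi in HANJA_RANGES)


def is_extension_a(ch: str) -> bool:
    """True if the character is in CJK Unified Ideographs Extension A."""
    return 0x3400 <= ord(ch) <= 0x4DBF


def missing_hanja(headwords: Iterable[str], vocab: Set[str]) -> List[str]:
    """Hanja that occur in the headwords but are not in vocab, in code point order."""
    seen = {ch for word in headwords for ch in word if is_hanja(ch)}
    return sorted(seen - vocab)


def extend_vocab(tokenizer, model, new_chars: List[str]) -> int:
    """Add new_chars as tokens and grow the embedding matrix to match."""
    added = tokenizer.add_tokens(new_chars)
    model.resize_token_embeddings(len(tokenizer))
    return added


# Hypothetical inputs: two headwords and a vocabulary that already contains 學.
print(missing_hanja(["學校", "㐌"], {"學"}))
```

Rows added to the embedding matrix this way are newly initialized rather than trained, which is consistent with the card directing users to the MLM fine-tuned variant.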

License

Like the original model, this model is released under the CC BY-SA 4.0 license.

Citation

If you use this model, please also cite the original model:

@misc{yasuoka2024roberta,
  author = {Koichi Yasuoka},
  title = {roberta-large-korean-hanja},
  year = {2024},
  publisher = {Hugging Face},
  url = {https://huggingface.co/KoichiYasuoka/roberta-large-korean-hanja}
}