Papers
arxiv:1906.07662

State-of-the-Art Vietnamese Word Segmentation

Published on Jun 18, 2019
Authors:
,
,

Abstract

Word segmentation is the first step of any tasks in Vietnamese language processing. This paper reviews stateof-the-art approaches and systems for word segmentation in Vietnamese. To have an overview of all stages from building corpora to developing toolkits, we discuss building the corpus stage, approaches applied to solve the word segmentation and existing toolkits to segment words in Vietnamese sentences. In addition, this study shows clearly the motivations on building corpus and implementing machine learning techniques to improve the accuracy for Vietnamese word segmentation. According to our observation, this study also reports a few of achivements and limitations in existing Vietnamese word segmentation systems.

Community

Sign up or log in to comment

Models citing this paper 0

No model linking this paper

Cite arxiv.org/abs/1906.07662 in a model README.md to link it from this page.

Datasets citing this paper 0

No dataset linking this paper

Cite arxiv.org/abs/1906.07662 in a dataset README.md to link it from this page.

Spaces citing this paper 0

No Space linking this paper

Cite arxiv.org/abs/1906.07662 in a Space README.md to link it from this page.

Collections including this paper 0

No Collection including this paper

Add this paper to a collection to link it from this page.