
Please use this identifier to cite or link to this item: http://ntour.ntou.edu.tw:8080/ir/handle/987654321/52598

Title: A combination of active learning and self-learning for named entity recognition on Twitter using conditional random fields
Authors: Van Cuong Tran
Ngoc Thanh Nguyen
Hamido Fujita
Dinh Tuyen Hoang
Dosam Hwang
Contributors: National Taiwan Ocean University: Department of Computer Science and Engineering
Keywords: Named entity recognition
Active learning
Tweet streams
Date: 2017
Issue Date: 2019-11-22T03:07:26Z
Publisher: Knowledge-Based Systems
Abstract: In recent years, many natural language processing (NLP) applications have been developed using machine learning. Annotating data is an important task in applying machine learning to NLP. A common way to improve system performance is to train on a large, high-quality set of training data annotated by experts. Active learning (AL) and self-learning can be used to reduce the annotation cost. Self-learning discovers highly reliable instances using a trained classifier, while AL queries the most informative instances using active query algorithms. This paper proposes a method that combines AL and self-learning to reduce the labeling effort for named entity recognition on tweet streams by using both machine-labeled and manually labeled data. We employ AL queries based on the diversity of the context and content of instances to select the most informative instances. Conditional random fields are chosen as the underlying model to train a classifier for selecting highly reliable instances. Experiments on Twitter data show that the proposed method achieves good results in reducing the human labeling effort and can significantly improve system performance.
Relation: 132, pp. 179-187
URI: http://ntour.ntou.edu.tw:8080/ir/handle/987654321/52598
Appears in Collections: [Department of Computer Science and Engineering] Journal Articles
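
The division of labor described in the abstract (self-learning keeps predictions the classifier is highly confident about, AL sends the most informative instances to a human) can be illustrated with a minimal sketch. This is not the authors' implementation: the paper selects queries by the diversity of context and content, whereas the sketch below substitutes a plain least-confidence rule, and the classifier is a toy stand-in rather than a conditional random field. The names `split_pool`, `toy_predict`, `high`, and `n_queries` are hypothetical and chosen only for illustration.

```python
from typing import Callable, List, Sequence, Tuple

Tokens = Sequence[str]
Labels = List[str]
# A predictor returns a label sequence and a confidence score in [0, 1].
Predictor = Callable[[Tokens], Tuple[Labels, float]]


def split_pool(
    pool: Sequence[Tokens],
    predict: Predictor,
    high: float = 0.9,
    n_queries: int = 2,
):
    """Split an unlabeled pool into machine-labeled, to-annotate, and remaining tweets."""
    scored = [(tweet, *predict(tweet)) for tweet in pool]
    # Self-learning: accept predictions the model is highly confident about.
    machine_labeled = [(t, y) for t, y, c in scored if c >= high]
    # Active learning (least-confidence stand-in, an assumption of this sketch):
    # send the least certain tweets to a human annotator.
    uncertain = sorted(
        (item for item in scored if item[2] < high), key=lambda item: item[2]
    )
    to_annotate = [t for t, _, _ in uncertain[:n_queries]]
    remaining = [t for t, _, _ in uncertain[n_queries:]]
    return machine_labeled, to_annotate, remaining


def toy_predict(tokens: Tokens) -> Tuple[Labels, float]:
    """Hypothetical classifier: capitalized tokens are entities; the confidence
    is the fraction of purely alphabetic tokens. A real system would use a
    trained sequence model here."""
    labels = ["B-ENT" if tok[:1].isupper() else "O" for tok in tokens]
    confidence = sum(tok.isalpha() for tok in tokens) / len(tokens)
    return labels, confidence


pool = [
    ["Obama", "visits", "Hanoi"],
    ["omg", "#nlp", "@bob", "lol"],
    ["gr8", "m8"],
    ["see", "u", "2nite", "Paris"],
]
machine_labeled, to_annotate, remaining = split_pool(pool, toy_predict)
```

In a full loop, the human labels for `to_annotate` and the machine labels in `machine_labeled` would both be added to the training set, the classifier retrained, and the split repeated on `remaining`.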

All items in NTOUR are protected by copyright, with all rights reserved.

