一种基于循环神经网络的古文断句方法
A Sentence Segmentation Method for Ancient Chinese Texts Based on Recurrent Neural Network
Abstract
This paper proposes an automatic sentence segmentation method for ancient Chinese texts based on a recurrent neural network (RNN). The method uses a bidirectional RNN with gated recurrent units (GRU) to segment the text. During decoding, the algorithm uses not only the probability distribution output by the network but also state transition probabilities and a length penalty, in order to improve segmentation accuracy. Experiments on a large-scale corpus of ancient texts show that the proposed method achieves a higher segmentation F1 score than traditional methods.
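The abstract does not give the exact decoding formula, so the following is only a minimal sketch of how network output probabilities, state transition probabilities, and a length penalty could be combined in one dynamic-programming decoder. It assumes two labels per character (0 = no break, 1 = break after this character), assumes the text ends with a break, and adds all three terms in log space. The function name `segment` and the parameters `break_probs`, `trans`, and `length_penalty` are hypothetical and not taken from the paper.

```python
import math


def segment(break_probs, trans, length_penalty):
    """Illustrative decoder (assumed scoring, not the paper's exact formula).

    break_probs[t]    : network probability of a break after character t
    trans[a][b]       : probability of label b following label a (0 = no break, 1 = break)
    length_penalty(L) : log-space score added for each sentence of L characters
    Returns the sorted indices t after which a break is placed.
    """
    n = len(break_probs)
    eps = 1e-12  # clamp to avoid log(0)
    log_b = [math.log(max(p, eps)) for p in break_probs]
    log_nb = [math.log(max(1.0 - p, eps)) for p in break_probs]
    log_t = [[math.log(max(trans[a][b], eps)) for b in (0, 1)] for a in (0, 1)]

    # Prefix sums of "no break" log-probabilities for fast span scoring.
    prefix = [0.0]
    for v in log_nb:
        prefix.append(prefix[-1] + v)

    # best[j]: best score of segmenting the first j characters,
    # with a break placed after character j - 1.
    best = [-math.inf] * (n + 1)
    back = [0] * (n + 1)
    best[0] = 0.0
    for j in range(1, n + 1):
        for i in range(j):
            length = j - i  # candidate sentence covers characters i .. j-1
            # Network term: no break inside the sentence, break at its end.
            score = best[i] + (prefix[j - 1] - prefix[i]) + log_b[j - 1]
            # Transition term inside the sentence: 0->0 repeated, then 0->1.
            if length >= 2:
                score += (length - 2) * log_t[0][0] + log_t[0][1]
            # Transition from the previous sentence's final break label.
            if i > 0:
                score += log_t[1][0] if length >= 2 else log_t[1][1]
            # Length term for the whole sentence.
            score += length_penalty(length)
            if score > best[j]:
                best[j] = score
                back[j] = i

    # Follow back-pointers from the end of the text to recover the breaks.
    breaks = []
    j = n
    while j > 0:
        breaks.append(j - 1)
        j = back[j]
    return breaks[::-1]
```

With uniform transitions and a zero penalty, this decoder reduces to picking the more probable label at each character. A penalty that discourages long sentences can force extra breaks at positions the network alone would not choose. The search is quadratic in the text length; capping the maximum sentence length would bound the inner loop.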