How can I train NLTK on the entire Penn Treebank corpus?
An algorithm is presented for automatically extracting a Lexicalized Tree Adjoining Grammar (LTAG) from the Penn Chinese Treebank.
Using the Penn Chinese Treebank as the experimental corpus, this paper examines how the amount of annotated data affects model performance. The results show that the proposed unsupervised POS tagging method improves the accuracy of Chinese POS tagging across all four training-set sizes.