How can I train NLTK on the entire Penn Treebank corpus?
Acquiring chunk corpora is also a pressing task: because corpora annotated with chunk tags are laborious to collect directly, most current chunk corpora are obtained by converting existing treebanks.
Experiments were conducted on the Penn Chinese Treebank with training sets of different sizes. The results show that our unsupervised approach improves POS tagging accuracy on all four training sets.