面向新闻领域的中文关键词自动标引 (Automatic Chinese Keyword Indexing for the News Domain)
关键词: 组块, 组块语料库, 树库, 语法分析 [gap=764]
Keywords: chunk, chunked corpus, treebank, syntactic parsing
Sinica Treebank 中文句结构树资料库
UPenn Treebank 宾州大学树库 ; 宾州大学的句法树库
Penn Chinese Treebank 宾州大学中文树库 ; 宾州中文树库
Penn Discourse Treebank 树库 ; 宾州篇章树库理论
dependency treebank 依存结构树库
Chinese treebank 汉语树库
Tsinghua Chinese Treebank 树库 ; 清华汉语树库
Penn Treebank Sample 宾州树库样本集
Source of the above: WordNet
How can I train NLTK on the entire Penn Treebank corpus?
我怎样才能在整个宾州树库语料库上训练NLTK?
Collecting a corpus annotated with chunk tags directly is laborious, so most chunked corpora are currently obtained by converting an existing treebank.
同时组块库的获取和收集也是一项迫切的任务,由于不易直接获取具有组块标注的语料,当前大多组块语料库是通过转化现有树库获得。
Experiments were run on the Penn Chinese Treebank with training sets of different sizes. The results show that our approach improves POS tagging accuracy on all four training-set sizes.
本文以宾州中文树库为实验语料,考查了不同规模的标注数据对模型性能的影响,实验结果表明,本文提出的无监督词性标注方法提高了中文词性标注的性能。
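The treebank-to-chunk conversion mentioned in the example sentences above can be illustrated with a minimal sketch. It assumes Penn-style bracketed trees as input and uses a deliberately simplified rule: a phrase whose children are all POS-tagged words becomes one chunk, and every other word is tagged `O`. The function names `parse_bracketed` and `to_iob` are hypothetical, and real conversion tools apply richer, label-specific rules (for example, they usually also produce verb chunks).

```python
def parse_bracketed(text):
    """Parse a Penn-style bracketed tree into (label, children) tuples.

    A preterminal is (pos_tag, [word]); a phrase is (label, [subtrees]).
    """
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def read():
        nonlocal pos
        assert tokens[pos] == "(", "expected an opening bracket"
        pos += 1
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(read())
            else:
                children.append(tokens[pos])  # a word under a POS tag
                pos += 1
        pos += 1  # consume the closing bracket
        return (label, children)

    return read()


def is_preterminal(node):
    """True for a POS node that directly dominates a single word."""
    _, children = node
    return len(children) == 1 and isinstance(children[0], str)


def to_iob(node, out=None):
    """Flatten a tree into (word, pos_tag, chunk_tag) triples.

    Simplified rule (an assumption of this sketch): only phrases whose
    children are all preterminals become chunks; other words get "O".
    """
    if out is None:
        out = []
    label, children = node
    if is_preterminal(node):
        out.append((children[0], label, "O"))
    elif all(is_preterminal(child) for child in children):
        for i, (pos_tag, words) in enumerate(children):
            prefix = "B-" if i == 0 else "I-"
            out.append((words[0], pos_tag, prefix + label))
    else:
        for child in children:
            to_iob(child, out)
    return out


if __name__ == "__main__":
    tree = parse_bracketed(
        "(S (NP (DT the) (NN parser)) "
        "(VP (VBZ reads) (NP (DT the) (NN treebank))) (. .))"
    )
    for word, pos_tag, chunk_tag in to_iob(tree):
        print(word, pos_tag, chunk_tag)
```

With this rule the two noun phrases in the sample tree come out as `B-NP`/`I-NP` spans, while the verb and the final period are left outside any chunk.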