We found that the bigram feature set contains a large number of highly overlapping bigrams and highly biased bigrams.
From the selected texts, we take bigrams as features after Chinese word segmentation, stop-word removal, and other preprocessing.
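The feature-extraction step described above might be sketched roughly as follows. This is only an illustration under stated assumptions: the text is taken to be already segmented into words by an external Chinese segmenter, the stop-word set `STOP_WORDS` and the function name `bigram_features` are hypothetical, and bigrams are formed from the tokens that remain after stop-word removal.

```python
# Hypothetical stop-word set, for illustration only; a real list would be much larger.
STOP_WORDS = {"的", "了", "在"}


def bigram_features(tokens, stop_words=STOP_WORDS):
    """Return adjacent word pairs (bigrams) from an already-segmented token list.

    Assumption: stop words are dropped first, and bigrams are then built
    from the remaining tokens in their original order.
    """
    kept = [t for t in tokens if t not in stop_words]
    # Pair each kept token with its successor; fewer than two tokens yields [].
    return list(zip(kept, kept[1:]))


# Example input: a sentence assumed to be pre-segmented into words.
tokens = ["我们", "在", "文本", "中", "选取", "特征"]
features = bigram_features(tokens)
```

Whether bigrams should span a removed stop word is a design choice the source does not settle; this sketch lets them span it.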