据说,尤其是谷歌和雅虎,在各种大数据集贡献上已经颇有建树,尤其是对训练自然语言处理模型非常有用的文本数据。
That being said, Google and Yahoo, in particular, have been pretty good about releasing various large datasets, usually textual data useful for training natural-language processing models.
我们过滤掉重复的内容、对数据进行结构化改造,变成统一的对象模型然后使用我们的自然语言处理程序SiLCC,提取关键字和打上标签。
We filter out duplicate content, structure data into a unified object model, and then use our natural language processing program, SiLCC, to extract keywords and apply them as tags.
该模型利用自然语言处理技术,在语义层次上进行查询和检索,克服了传统检索方法的不足,提高了查全率与查准率。
This model uses natural language processing techniques to perform querying and retrieval at the semantic level, overcoming the shortcomings of traditional retrieval methods and improving both recall and precision.