而OCR识别导致的主要错误就是形近或者形似错误,因此,如何对该类错误进行文本自动校对是一个必须解决的问题。
Because the main errors introduced by OCR are confusions between visually similar characters, automatically proofreading this class of errors is a problem that must be solved.
文本识别流的自动校对成为了亟需解决的问题。
Automatic proofreading of the recognized text stream has become an urgent problem to solve.
该方法通过对机器分词语料和人工校对语料的学习,自动获取中文文本的分词校对规则,并应用规则对机器分词结果进行自动校对。
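It discusses and analyzes the current state of Chinese word segmentation and describes a rule-based approach to correcting it automatically: by learning from machine-segmented and manually proofread corpora, the method acquires segmentation-correction rules and applies them to the machine segmentation results.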